[PATCH, i386]: Fix i386.md:5807: warning: source missing a mode?

2011-07-28 Thread Uros Bizjak
Hello!

2011-07-28  Uros Bizjak  

* config/i386/i386.md (add->lea splitter): Add SWI mode to PLUS RTX.

Tested on x86_64-pc-linux-gnu, committed to mainline.

Uros.

Index: i386.md
===
--- i386.md (revision 176858)
+++ i386.md (working copy)
@@ -5806,8 +5806,8 @@
 ;; Convert add to the lea pattern to avoid flags dependency.
 (define_split
   [(set (match_operand:SWI 0 "register_operand" "")
-   (plus (match_operand:SWI 1 "register_operand" "")
-  (match_operand:SWI 2 "" "")))
+   (plus:SWI (match_operand:SWI 1 "register_operand" "")
+ (match_operand:SWI 2 "" "")))
(clobber (reg:CC FLAGS_REG))]
   "reload_completed && ix86_lea_for_add_ok (insn, operands)"
   [(const_int 0)]


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Alan Modra
On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
> Ideally what I'd like to
> do is have ld and gcc emit accurate r2 tracking unwind info and
> dispense with hacks like frob_update_context.  If ld did emit accurate
> unwind info for .glink, then the justification for frob_update_context
> disappears.

For the record, this statement of mine doesn't make sense.  A .glink
stub doesn't make a frame, so a backtrace won't normally pass through a
stub, thus having accurate unwind info for .glink doesn't help at all.

ld would need to insert unwind info for r2 on the call, but that
involves editing .eh_frame and in any case isn't accurate since
the r2 save doesn't happen until one or two instructions after the
call, in the stub.  I think we are stuck with frob_update_context.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-28 Thread Paolo Bonzini

On 07/28/2011 12:59 AM, H.J. Lu wrote:

>  Regarding correctness: you're converting a SImode operation to DImode by
>  "pushing in" the zero_extend operation.  What makes you think that base +
>  constant offset won't overflow in any case?


You have not answered this.


>  And also: what are you gaining by allowing the wrap around?  I don't need to
>  know what ignore_address_wrap_around does, I need to know _why_ it is
>  necessary.

We have

(zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))

I want to convert it to

(plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))

There is no zero-extend on (const_int Y).  If FOO == 0xfffffffc and Y = 8,

(zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))

gives 0x4 and

(plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))

gives 0x100000004.


This was already clear upthread.

I'm asking what it buys you in real code.


If (plus:SI (FOO:SI) (const_int Y)) won't overflow
or its behavior is implementation-defined,


Behavior of plus:SI is never implementation defined, it is the extension 
that is done with an UNSPEC.  (In fact I'm not even sure the 
optimization is ok when done with POINTER_EXTEND_UNSIGNED < 0, but I'm 
not touching that for now).


Paolo
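
To make the wrap-around case in this exchange concrete, here is a minimal
C sketch.  It models SImode with uint32_t and DImode with uint64_t; the two
function names are hypothetical and are not part of GCC.

```c
#include <stdint.h>

/* Models (zero_extend:DI (plus:SI foo y)):
   add in 32 bits (wrapping), then widen to 64 bits.  */
static uint64_t extend_after_add(uint32_t foo, int32_t y)
{
    return (uint64_t)(uint32_t)(foo + (uint32_t)y);
}

/* Models (plus:DI (zero_extend:DI foo) y):
   widen foo first, then add the sign-extended constant in 64 bits.  */
static uint64_t add_after_extend(uint32_t foo, int32_t y)
{
    return (uint64_t)foo + (uint64_t)(int64_t)y;
}
```

With foo = 0xfffffffc and y = 8 the first function returns 0x4 and the
second returns 0x100000004, so the two forms agree only when the 32-bit
addition does not wrap.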


[PATCH] PR49799: Don't generate illegal bit field extraction instruction

2011-07-28 Thread Carrot Wei
Hi

In function combine.c:make_compound_operation, it tries to transform the
expression
 (ashiftrt (ashift foo C1) C2) with C2 >= C1
into SIGN_EXTRACT.

It works pretty well in usual cases. But for the test case in PR49799, there is
an expression
 (X << (tmp-1)) >> 16
tmp is an uninitialized variable; only after the init-regs pass is it set to 0.
Then, after several successful combines, it will see the following expression

(ashiftrt:SI (ashift:SI (reg:SI 145 [ *K_2(D) ])
(const_int -1 [0xffffffffffffffff]))
(const_int 16 [0x10]))

and change it to an illegal bit field extraction instruction(sbfx).

Add a check to ensure the bit field is valid before applying the change, so the
wrong sbfx will not be generated.
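
As a rough illustration of the check, here is a small C sketch.  It is not
GCC code: the helper names are hypothetical, and it assumes the usual
reading of the transformation, where (ashiftrt (ashift x C1) C2) selects a
field of x that starts at bit C2 - C1 and is mode_width - C2 bits long.

```c
#include <stdbool.h>

/* Hypothetical helper: can (ashiftrt (ashift x c1) c2) be rewritten
   as a sign-extended bit field of x?  */
static bool shift_pair_is_valid_field(long c1, long c2, int mode_width)
{
    return c1 >= 0 && c2 >= c1 && c2 < mode_width;
}

/* Lowest bit of x that ends up in the extracted field.  */
static long field_start(long c1, long c2)
{
    return c2 - c1;
}

/* Number of bits of x that survive both shifts.  */
static long field_length(long c2, int mode_width)
{
    return mode_width - c2;
}
```

For C1 = -1 and C2 = 16 in SImode the field would start at bit 17 and be
16 bits long, so it would end at bit 33 of a 32-bit register.  The extra
C1 >= 0 test rejects exactly this case.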

Bootstrapped and regtested on x86_64-unknown-linux-gnu.
Regtested on arm qemu.

OK for trunk and 4.6?

thanks
Carrot


ChangeLog:
2011-07-28  Wei Guozhi  

PR rtl-optimization/49799
* combine.c (make_compound_operation): Check if the bit field is valid
before changing it to bit field extraction.


Index: gcc/combine.c
===
--- gcc/combine.c   (revision 176733)
+++ gcc/combine.c   (working copy)
@@ -7787,6 +7787,7 @@ make_compound_operation (rtx x, enum rtx
  && GET_CODE (lhs) == ASHIFT
  && CONST_INT_P (XEXP (lhs, 1))
  && INTVAL (rhs) >= INTVAL (XEXP (lhs, 1))
+ && INTVAL (XEXP (lhs, 1)) >= 0
  && INTVAL (rhs) < mode_width)
{
  new_rtx = make_compound_operation (XEXP (lhs, 0), next_code);


Re: PATCH: PR target/47715: [x32] TLS doesn't work

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:52 AM, Uros Bizjak  wrote:

>> TLS on X32 is almost identical to TLS on x86-64.  The only difference is
>> x32 address space is 32bit.  That means TLS symbols can be in either
>> SImode or DImode with upper 32bit zero.  This patch updates
>> tls_global_dynamic_64 to support x32.  OK for trunk?

Please also change 64bit GNU2_TLS patterns, so -mtls-dialect=gnu2 will
also work.  Please see attached patch.

Uros.
Index: i386.md
===
--- i386.md (revision 176860)
+++ i386.md (working copy)
@@ -12327,7 +12327,7 @@
(call:DI
 (mem:QI (match_operand:DI 2 "constant_call_address_operand" "z"))
 (match_operand:DI 3 "" "")))
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
  UNSPEC_TLS_GD)]
   "TARGET_64BIT"
 {
@@ -12349,7 +12349,7 @@
  (call:DI
   (mem:QI (match_operand:DI 2 "constant_call_address_operand" ""))
   (const_int 0)))
- (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+ (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
UNSPEC_TLS_GD)])])
 
 (define_insn "*tls_local_dynamic_base_32_gnu"
@@ -12553,7 +12553,7 @@
 
 (define_expand "tls_dynamic_gnu2_64"
   [(set (match_dup 2)
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
   UNSPEC_TLSDESC))
(parallel
 [(set (match_operand:DI 0 "register_operand" "")
@@ -12568,7 +12568,7 @@
 
 (define_insn "*tls_dynamic_lea_64"
   [(set (match_operand:DI 0 "register_operand" "=r")
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
   UNSPEC_TLSDESC))]
   "TARGET_64BIT && TARGET_GNU2_TLS"
   "lea{q}\t{%a1@TLSDESC(%%rip), %0|%0, %a1@TLSDESC[rip]}"
@@ -12579,7 +12579,7 @@
 
 (define_insn "*tls_dynamic_call_64"
   [(set (match_operand:DI 0 "register_operand" "=a")
-   (unspec:DI [(match_operand:DI 1 "tls_symbolic_operand" "")
+   (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")
(match_operand:DI 2 "register_operand" "0")
(reg:DI SP_REG)]
   UNSPEC_TLSDESC))
@@ -12598,7 +12598,7 @@
 (reg:DI SP_REG)]
UNSPEC_TLSDESC)
 (const:DI (unspec:DI
-   [(match_operand:DI 1 "tls_symbolic_operand" "")]
+   [(match_operand 1 "tls_symbolic_operand" "")]
UNSPEC_DTPOFF
(clobber (reg:CC FLAGS_REG))]
   "TARGET_64BIT && TARGET_GNU2_TLS"


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 5:11 AM, H.J. Lu  wrote:

> In x32, thread pointer is 32bit and choice of segment register for the
> thread base ptr load should be based on TARGET_64BIT.  This patch
> implements it.  OK for trunk?

-ENOTESTCASE.

Uros.


Re: [PATCH] Fix PR49876: Continue code generation with integer_zero_node on gloog_error

2011-07-28 Thread Richard Guenther
On Wed, Jul 27, 2011 at 8:49 PM, Sebastian Pop  wrote:
> When setting gloog_error, graphite should continue code generation
> without early returns, as otherwise the SSA representation would not
> be complete.  So set the new expression to integer_zero_node, that
> would not require more SSA updates, and continue code generation as
> nothing happened.

I suppose you have to watch for correct types?  Or does the code get
discarded again before it eventually reaches the verifier?  Ok in that case.

Thanks,
Richard.

> Regstrapping on amd64-linux.
>
> 2011-07-27  Sebastian Pop  
>
>        PR tree-optimization/49876
>        * sese.c (rename_uses): Do not return false on gloog_error: set
>        the new_expr to integer_zero_node and continue code generation.
>        (graphite_copy_stmts_from_block): Remove early exit on gloog_error.
> ---
>  gcc/ChangeLog |    7 +++++++
>  gcc/sese.c    |   18 ++++++++----------
>  2 files changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index b07d494..a565c18 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,5 +1,12 @@
>  2011-07-27  Sebastian Pop  
>
> +       PR tree-optimization/49876
> +       * sese.c (rename_uses): Do not return false on gloog_error: set
> +       the new_expr to integer_zero_node and continue code generation.
> +       (graphite_copy_stmts_from_block): Remove early exit on gloog_error.
> +
> +2011-07-27  Sebastian Pop  
> +
>        PR tree-optimization/49471
>        * tree-ssa-loop-manip.c (canonicalize_loop_ivs): Build an unsigned
>        iv only when the largest type is unsigned.  Do not call
> diff --git a/gcc/sese.c b/gcc/sese.c
> index ec96dfb..04a8e75 100644
> --- a/gcc/sese.c
> +++ b/gcc/sese.c
> @@ -527,10 +527,10 @@ rename_uses (gimple copy, htab_t rename_map, 
> gimple_stmt_iterator *gsi_tgt,
>       if (chrec_contains_undetermined (scev))
>        {
>          *gloog_error = true;
> -         return false;
> +         new_expr = integer_zero_node;
>        }
> -
> -      new_expr = chrec_apply_map (scev, iv_map);
> +      else
> +       new_expr = chrec_apply_map (scev, iv_map);
>
>       /* The apply should produce an expression tree containing
>         the uses of the new induction variables.  We should be
> @@ -540,12 +540,13 @@ rename_uses (gimple copy, htab_t rename_map, 
> gimple_stmt_iterator *gsi_tgt,
>          || tree_contains_chrecs (new_expr, NULL))
>        {
>          *gloog_error = true;
> -         return false;
> +         new_expr = integer_zero_node;
>        }
> +      else
> +       /* Replace the old_name with the new_expr.  */
> +       new_expr = force_gimple_operand (unshare_expr (new_expr), &stmts,
> +                                        true, NULL_TREE);
>
> -      /* Replace the old_name with the new_expr.  */
> -      new_expr = force_gimple_operand (unshare_expr (new_expr), &stmts,
> -                                      true, NULL_TREE);
>       gsi_insert_seq_before (gsi_tgt, stmts, GSI_SAME_STMT);
>       replace_exp (use_p, new_expr);
>
> @@ -621,9 +622,6 @@ graphite_copy_stmts_from_block (basic_block bb, 
> basic_block new_bb,
>                       gloog_error))
>        fold_stmt_inplace (copy);
>
> -      if (*gloog_error)
> -       break;
> -
>       update_stmt (copy);
>     }
>  }
> --
> 1.7.4.1
>
>


Re: [PATCH, PR 49094] Refrain from creating misaligned accesses in SRA

2011-07-28 Thread Richard Guenther
On Wed, Jul 27, 2011 at 9:23 PM, Ulrich Weigand  wrote:
> Martin Jambor wrote:
>> On Wed, Jul 27, 2011 at 02:34:59PM +0200, Ulrich Weigand wrote:
>> > Martin Jambor wrote:
>> >
>> > > OK, this is what I have just committed as revision 176797 after
>> > > re-testing.
>> >
>> > Thanks, this has fixed the forwprop-5.c regression on spu-elf on mainline.
>> >
>> > I'm seeing the same failure on the 4.6 branch -- would this patch also be
>> > appropriate there?
>> >
>>
>> You're right, it should be applied to the 4.6 branch too.  Since you
>> have the setup to test it, can you do it please?  Otherwise I'll do
>> it in a few days.
>
> Full test on spu-elf has now completed.  In addition to the forwprop-5.c
> regression, the patch also fixes this regression (see PR 49545):
> FAIL: g++.dg/tree-ssa/fwprop-align.C scan-tree-dump-times forwprop2 "& 1" 0
>
> No new regressions.
>
> OK for the branch?

Ok.

Thanks,
Richard.

> Bye,
> Ulrich
>
> --
>  Dr. Ulrich Weigand
>  GNU Toolchain for Linux on System z and Cell BE
>  ulrich.weig...@de.ibm.com
>


Re: [PATCH] PR49799: Don't generate illegal bit field extraction instruction

2011-07-28 Thread Jakub Jelinek
On Thu, Jul 28, 2011 at 03:38:07PM +0800, Carrot Wei wrote:
> OK for trunk and 4.6?
> 
> ChangeLog:
> 2011-07-28  Wei Guozhi  
> 
> PR rtl-optimization/49799
> * combine.c (make_compound_operation): Check if the bit field is valid
> before changing it to bit field extraction.

Looks good to me, handling SHIFT_COUNT_TRUNCATED here isn't IMHO necessary
and the checking whether shift count is in the right range matches other rtx
simplifications (e.g. in simplify-rtx.c).
Though, you should add a testcase, probably
/* PR rtl-optimization/49799 */
/* { dg-do assemble } */
/* { dg-options "-O2 -w" } */

plus not sure if for arm you don't want to force this -march=armv7-a
into dg-options too or just leave it as is.

Jakub


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Richard Guenther
On Wed, Jul 27, 2011 at 7:36 PM, Aldy Hernandez  wrote:
>
>> Oh, and
>>
>>    INNERDECL is the actual object being referenced.
>>
>>       || (!ptr_deref_may_alias_global_p (innerdecl)
>>
>> is surely not what you want.  That asks if *innerdecl is global memory.
>> I suppose you want is_global_var (innerdecl)?  But with
>>
>>           &&  (DECL_THREAD_LOCAL_P (innerdecl)
>>               || !TREE_STATIC (innerdecl))))
>>
>> you can simply skip this test.  Or what was it supposed to do?
>
> The test was there because neither DECL_THREAD_LOCAL_P nor is_global_var can
> handle MEM_REF's.

Ok, in that case you want

  (TREE_CODE (innerdecl) == MEM_REF || TREE_CODE (innerdecl) == TARGET_MEM_REF)
  && !ptr_deref_may_alias_global_p (TREE_OPERAND (innerdecl, 0)))

which gets you at the actual pointer.

> Would you prefer an explicit check for a *_DECL?
>
>   if (ALLOW_STORE_DATA_RACES
> -      || (!ptr_deref_may_alias_global_p (innerdecl)
> +      || (DECL_P (innerdecl)
>          && (DECL_THREAD_LOCAL_P (innerdecl)
>              || !TREE_STATIC (innerdecl))))

Yes.  Together with the above it looks then optimal.

Richard.


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Richard Guenther
On Wed, Jul 27, 2011 at 7:19 PM, Andrew MacLeod  wrote:
> On 07/27/2011 01:08 PM, Aldy Hernandez wrote:
>>
>>> Anyway, I don't think a --param is appropriate to control a flag whether
>>> to allow store data-races to be created.  Why not use a regular option
>>> instead?
>>
>> I don't care either way.  What -foption-name do you suggest?
>
> Well, I suggested a -f option set last year when this was laid out, and Ian
> suggested that it should be a --param
>
> http://gcc.gnu.org/ml/gcc/2010-05/msg00118.html
>
> "I don't agree with your proposed command line options.  They seem fine
> for internal use, but I think very very few users would know when or
> whether they should use -fno-data-race-stores.  I think you should
> downgrade those options to a --param value, and think about a
> multi-layered -fmemory-model option. "

Hm, ok.  I suppose we can revisit this when implementing such -fmemory-model
option then.  --params we can at least freely remove between releases.

Richard.

> Andrew
>


Re: [PATCH] PR49799: Don't generate illegal bit field extraction instruction

2011-07-28 Thread Carrot Wei
Test case added.

Tested with
make check-gcc RUNTESTFLAGS="--target_board=arm-sim/thumb/arch=armv7-a
arm.exp=pr49799.c"
make check-gcc RUNTESTFLAGS="--target_board=arm-sim/arch=armv7-a
arm.exp=pr49799.c"

It fails without this patch and passes with this patch.

OK for trunk and 4.6 now?

thanks
Carrot

ChangeLog:
2011-07-28  Wei Guozhi  

PR rtl-optimization/49799
* combine.c (make_compound_operation): Check if the bit field is valid
before changing it to bit field extraction.


ChangeLog:
2011-07-28  Wei Guozhi  

PR rtl-optimization/49799
* pr49799.c : New test case.


Index: gcc/combine.c
===
--- gcc/combine.c   (revision 176733)
+++ gcc/combine.c   (working copy)
@@ -7787,6 +7787,7 @@ make_compound_operation (rtx x, enum rtx
  && GET_CODE (lhs) == ASHIFT
  && CONST_INT_P (XEXP (lhs, 1))
  && INTVAL (rhs) >= INTVAL (XEXP (lhs, 1))
+ && INTVAL (XEXP (lhs, 1)) >= 0
  && INTVAL (rhs) < mode_width)
{
  new_rtx = make_compound_operation (XEXP (lhs, 0), next_code);


Index: pr49799.c
===
--- pr49799.c   (revision 0)
+++ pr49799.c   (revision 0)
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/49799 */
+/* { dg-do assemble } */
+/* { dg-options "-O2 -w -march=armv7-a" } */
+
+static __inline int bar(int a)
+{
+int tmp;
+
+if (a <= 0) a ^= 0xFFFFFFFF;
+
+return tmp - 1;
+}
+
+void foo(short *K)
+{
+short tmp;
+short *pptr, P[14];
+
+pptr = P;
+tmp = bar(*K);
+*pptr = (*K << tmp) >> 16;
+
+if (*P < tmp)
+*K++ = 0;
+}


On Thu, Jul 28, 2011 at 4:04 PM, Jakub Jelinek  wrote:
> On Thu, Jul 28, 2011 at 03:38:07PM +0800, Carrot Wei wrote:
>> OK for trunk and 4.6?
>>
>> ChangeLog:
>> 2011-07-28  Wei Guozhi  
>>
>>         PR rtl-optimization/49799
>>         * combine.c (make_compound_operation): Check if the bit field is valid
>>         before changing it to bit field extraction.
>
> Looks good to me, handling SHIFT_COUNT_TRUNCATED here isn't IMHO necessary
> and the checking whether shift count is in the right range matches other rtx
> simplifications (e.g. in simplify-rtx.c).
> Though, you should add a testcase, probably
> /* PR rtl-optimization/49799 */
> /* { dg-do assemble } */
> /* { dg-options "-O2 -w" } */
>
> plus not sure if for arm you don't want to force this -march=armv7-a
> into dg-options too or just leave it as is.
>
>        Jakub
>


Re: [PATCH] PR49799: Don't generate illegal bit field extraction instruction

2011-07-28 Thread Jakub Jelinek
On Thu, Jul 28, 2011 at 04:40:53PM +0800, Carrot Wei wrote:
> ChangeLog:
> 2011-07-28  Wei Guozhi  
> 
> PR rtl-optimization/49799
> * pr49799.c : New test case.

Space shouldn't be between .c and :.  And the filename should be
relative to gcc/testsuite/ dir, so either gcc.target/arm/pr49799.c, or
better gcc.dg/pr49799.c.

Putting the testcase just into gcc.target/arm means it won't be tested
on other targets, while there is nothing arm specific about the testcase
except that you force -march in dg-options for arm.
You can do that with
/* PR rtl-optimization/49799 */
/* { dg-do assemble } */
/* { dg-options "-O2 -w" } */
/* { dg-options "-O2 -w -march=armv7-a" { target arm*-*-* } } */
or similar.

Ok with those changes.

Jakub


Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Richard Guenther
On Wed, 27 Jul 2011, Michael Matz wrote:

> Hi,
> 
> On Wed, 27 Jul 2011, Richard Guenther wrote:
> 
> > > > I don't think it is safe to try to get at the VLA type the way you do.
> > > 
> > > I don't understand in what way it's not safe. Do you mean I don't manage 
> > > to find
> > > the type always, or that I find the wrong type, or something else?
> > 
> > I think you might get the wrong type, you also do not transform code
> > like
> > 
> >   int *p = alloca(4);
> >   *p = 3;
> > 
> > as there is no array type involved here.
> 
> That's good, because you _can't_ transform that code into an array decl.  
> See:
> 
>for (int i = 0; i < 100; i++)
>  p[i] = alloca(4);
>assert (p[0] != p[1]);
> 
> vs.
>char vla_cst[4];
>for (int i = 0; i < 100; i++)
>  p[i] = &vla_cst;
>assert (p[0] != p[1]);

Hm, indeed ;)  At least not without more flow-sensitive analysis.
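
The difference between the two snippets can be checked with a small C
sketch.  It assumes a system that provides the non-standard <alloca.h>
header (glibc does); the function names are made up for the illustration.

```c
#include <alloca.h>   /* non-standard; assumed available (e.g. glibc) */
#include <stdbool.h>

/* Each alloca() call in the loop returns a fresh block that stays
   live until the function returns, so the pointers differ.  */
static bool allocas_are_distinct(void)
{
    char *p[4];
    for (int i = 0; i < 4; i++)
        p[i] = alloca(4);
    return p[0] != p[1];
}

/* After replacing the alloca with one array declaration, every
   iteration stores the address of the same object.  */
static bool array_slots_are_distinct(void)
{
    char vla_cst[4];
    char *p[4];
    for (int i = 0; i < 4; i++)
        p[i] = vla_cst;
    return p[0] != p[1];
}
```

The first function returns true and the second returns false, which is
the observable change in behaviour the assert in the quoted example
would catch.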

> Tom: you can reliably detect if an alloca call is for a VLA by checking 
> CALL_ALLOCA_FOR_VAR_P (on a tree call expression, but only if it's a 
> builtin call) or gimple_call_alloca_for_var_p (on a gimple call stmt).

Which actually hints that you should inline the folder into
gimple_fold_call/builtin where you still have this flag properly 
preserved.

Richard.


Re: [v3] Library bits of c++/49813

2011-07-28 Thread Andreas Schwab
Paolo Carlini  writes:

> 2011-07-27  Paolo Carlini  
>
>   PR c++/49813
>   * include/c_global/cmath: Use _GLIBCXX_CONSTEXPR and constexpr.

I'm seeing this error on ia64:

/usr/local/gcc/gcc-20110728/Build/./gcc/xgcc -shared-libgcc 
-B/usr/local/gcc/gcc-20110728/Build/./gcc -nostdinc++ 
-L/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/src 
-L/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/src/.libs 
-B/usr/ia64-suse-linux/bin/ -B/usr/ia64-suse-linux/lib/ -isystem 
/usr/ia64-suse-linux/include -isystem /usr/ia64-suse-linux/sys-include -x 
c++-header -nostdinc++ -g -O2 -D_GNU_SOURCE 
-I/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/ia64-suse-linux
 -I/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include 
-I/usr/local/gcc/gcc-20110728/libstdc++-v3/libsupc++ -O2 -g -std=gnu++0x 
/usr/local/gcc/gcc-20110728/libstdc++-v3/include/precompiled/stdc++.h \
-o ia64-suse-linux/bits/stdc++.h.gch/O2ggnu++0x.gch
/usr/local/gcc/gcc-20110728/Build/./gcc/xgcc -shared-libgcc 
-B/usr/local/gcc/gcc-20110728/Build/./gcc -nostdinc++ 
-L/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/src 
-L/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/src/.libs 
-B/usr/ia64-suse-linux/bin/ -B/usr/ia64-suse-linux/lib/ -isystem 
/usr/ia64-suse-linux/include -isystem /usr/ia64-suse-linux/sys-include -x 
c++-header -nostdinc++ -g -O2 -D_GNU_SOURCE 
-I/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/ia64-suse-linux
 -I/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include 
-I/usr/local/gcc/gcc-20110728/libstdc++-v3/libsupc++ -O2 -g 
/usr/local/gcc/gcc-20110728/libstdc++-v3/include/precompiled/stdc++.h -o 
ia64-suse-linux/bits/stdc++.h.gch/O2g.gch
In file included from 
/usr/local/gcc/gcc-20110728/libstdc++-v3/include/precompiled/stdc++.h:42:0:
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath: 
In function 'constexpr float std::fma(float, float, float)':
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath:1288:43:
 sorry, unimplemented: unexpected ast of kind fma_expr
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath:1288:43:
 internal compiler error: in potential_constant_expression_1, at 
cp/semantics.c:8094

Andreas.

-- 
Andreas Schwab, sch...@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."


Re: [PATCH] Fix PR47594: Sign extend constants while translating to Graphite

2011-07-28 Thread Richard Guenther
On Wed, 27 Jul 2011, Sebastian Pop wrote:

> On Tue, Jul 26, 2011 at 09:34, Richard Guenther  wrote:
> > Truncating -1 doesn't matter - it matters that if you perform any
> > unsigned arithmetic in arbitrary precision signed arithmetic that
> > you properly truncate after each operation to simulate unsigned
> > twos-complement wrapping semantic.  And if you did that you wouldn't
> > need to sign-extend -1U either.
> 
> Ok, so I guess that the type of the expression that we generate from
> Graphite should be, as the original expression, of unsigned type.
> In the previous example,
> 
> > for (scat_3=0;scat_3<=4294967295*scat_1+T_51-1;scat_3++) {
> >   S6(scat_1,scat_3);
> > }
> 
> this is still valid if the type of "4294967295*scat_1" is unsigned.

If 4294967295*scat_1+T_51-1 is always the symbolic number of
iterations then it will be always >= 0, right?  I still do not
quite understand where and how "types" enter the picture for
graphite here - if the niter expression was scat_1 + T_51 with
both unsigned then you'd still have to truncate to the result
types precision in case the polyhedral model internally has
infinite precision.  So I don't think -1U is in any way special
(it probably just appears more often, and we could avoid some
of the issues with folding the above to T_51 - 1 - scat_1).
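
A small C sketch of the truncation point, using 64-bit arithmetic as a
stand-in for the "infinite precision" of the polyhedral model.  The
function names are hypothetical; scat_1 and T_51 are taken from the loop
bound quoted above.

```c
#include <stdint.h>

/* The bound as the source computes it: unsigned 32-bit arithmetic
   that wraps after every operation.  */
static uint32_t bound_wrapping(uint32_t scat_1, uint32_t t_51)
{
    return (uint32_t)(UINT32_C(4294967295) * scat_1 + t_51 - 1u);
}

/* The same expression evaluated in wider arithmetic without any
   truncation, standing in for arbitrary precision.  */
static uint64_t bound_wide(uint32_t scat_1, uint32_t t_51)
{
    return UINT64_C(4294967295) * scat_1 + t_51 - 1u;
}
```

For scat_1 = 2 and T_51 = 10 the wrapping version gives 7, which is
T_51 - 1 - scat_1, while the wide version gives 8589934599.  Truncating
the wide result to 32 bits recovers 7, which is the truncation step that
has to be simulated after each operation.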

> That would fix only -fgraphite-identity: we also have to watch out for
> operations on the polyhedral representation that would use -1U in
> other computations, and here I'm thinking about everything we have
> implemented on the polyhedral representation: dependence test,
> counting the number of points, i.e., all the heuristics, etc.
> 
> When disabling Graphite on all unsigned niter expressions, we get
> the following fails:

I think niter expressions are unsigned simply because niter will
always be >= 0.  But the issue doesn't seem to be the unsignedness of 
niter but the fact that the symbolic expression is computed with
unsigned arithmetic?

> FAIL: gcc.dg/graphite/scop-0.c scan-tree-dump-times graphite "number
> of SCoPs: 1" 1

Where I wonder why we end up with unsigned arithmetic for this
testcase for example.  2 * N + 100 is surely all signed.

[...]

> So the only solution that I can see is to implement the niter analysis
> as the resolution of a constraint system, and that would avoid creating
> the unsigned expressions.

So maybe we can instead try to avoid using unsigned arithmetic
for symbolic niters if the source does not have it unsigned?

Richard.

Re: [Patch,AVR]: PR49687 (better widening 32-bit mul)

2011-07-28 Thread Georg-Johann Lay
Weddington, Eric wrote:
>> Subject: Re: [Patch,AVR]: PR49687 (better widening 32-bit mul)
>>
>>> I didn't review the asm code, but the rest of the patch look ok to me.
>>>
>>> r~
>> Thanks, Eric will review the asm part  :-)
> 
> LOL
> I trust you on the asm stuff. Ok by me.

Ok, I installed it.  Don't forget to rebuild *all* your libraries
including avr-libc after upgrading!

> However, how is our test coverage in this area?
> 
> Eric

As I wrote in the initial mail

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02113.html

> The patch passes without regressions, of course.
> 
> Moreover, I drove individual tests of the routines against the old 
> implementation
> before integrating them into libgcc to run regression tests.

so I think there is reasonable test coverage.

libgcc.S already contained routines for widening multiply for the case when no
multiplier is available; these parts are dead code up to now; IMHO small devices
would benefit from supporting them in the compiler; in particular the two
16 = 8 * 8 cases.

I did not yet try to run the testsuite for a target without MUL, i.e. compile
for a target without MUL but simulate on ATmega128.

Did you ever run testsuite for a target without MUL?

Johann


[patch tree-optimization]: Remove TRUTH_NOT from vrp and adjust

2011-07-28 Thread Kai Tietz
Hello,

this patch removes the handling for TRUTH_NOT_EXPR in tree-vrp.c and adjusts 
some simplifications to be compatible with BIT_NOT_EXPR.
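
As a plain C sketch (not GCC code, helper names are hypothetical) of why
a bitwise NOT can stand in for a logical NOT only on values of 1-bit
precision:

```c
/* ~v in a 1-bit unsigned type: only bit 0 exists.  */
static unsigned bit_not_1bit(unsigned v)
{
    return ~v & 1u;
}

/* Logical negation, the meaning TRUTH_NOT_EXPR had.  */
static unsigned truth_not(unsigned v)
{
    return !v;
}

/* ~v in an 8-bit unsigned type, for comparison.  */
static unsigned bit_not_8bit(unsigned v)
{
    return ~v & 0xffu;
}
```

For v in {0, 1} the 1-bit version matches logical negation, while the
8-bit version maps 1 to 0xfe.  This is the reason a precision test has
to accompany any BIT_NOT_EXPR that is treated as a truth operation.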

ChangeLog

2011-07-28  Kai Tietz  

* tree-vrp.c (simplify_stmt_using_ranges): Remove
TRUTH_NOT_EXPR case.
(simplify_truth_ops_using_ranges): Likewise.
(build_assert_expr_for): Remove TRUTH_NOT_EXPR case
and handle BIT_NOT_EXPR for truth-operation.
(build_assert_expr_for_1): Likewise.

Bootstrapped and regression-tested for all languages (+ Ada, Obj-C++) on host 
x86_64-pc-linux-gnu.  OK to apply?

Regards,
Kai

Index: gcc-head/gcc/tree-vrp.c
===
--- gcc-head.orig/gcc/tree-vrp.c
+++ gcc-head/gcc/tree-vrp.c
@@ -3972,7 +3972,8 @@ build_assert_expr_for (tree cond, tree v
   tree a = build2 (ASSERT_EXPR, TREE_TYPE (v), v, cond);
   assertion = gimple_build_assign (n, a);
 }
-  else if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
+  else if (TREE_CODE (cond) == BIT_NOT_EXPR
+  && TYPE_PRECISION (TREE_TYPE (cond)) == 1)
 {
   /* Given !V, build the assignment N = false.  */
   tree op0 = TREE_OPERAND (cond, 0);
@@ -4525,7 +4526,8 @@ register_edge_assert_for_1 (tree op, enu
   retval |= register_edge_assert_for_1 (gimple_assign_rhs2 (op_def),
code, e, bsi);
 }
-  else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR)
+  else if (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR
+  && TYPE_PRECISION (TREE_TYPE (gimple_assign_lhs (op_def))) == 1)
 {
   /* Recurse, flipping CODE.  */
   code = invert_tree_comparison (code, false);
@@ -6754,6 +6756,9 @@ simplify_truth_ops_using_ranges (gimple_
   bool sop = false;
   bool need_conversion;
 
+  /* We handle only !=/== case here.  */
+  gcc_assert (rhs_code == EQ_EXPR || rhs_code == NE_EXPR);
+
   op0 = gimple_assign_rhs1 (stmt);
   if (TYPE_PRECISION (TREE_TYPE (op0)) != 1)
 {
@@ -6770,52 +6775,40 @@ simplify_truth_ops_using_ranges (gimple_
 return false;
 }
 
-  if (rhs_code == TRUTH_NOT_EXPR)
+  op1 = gimple_assign_rhs2 (stmt);
+
+  /* Reduce number of cases to handle.  */
+  if (is_gimple_min_invariant (op1))
 {
-  rhs_code = NE_EXPR;
-  op1 = build_int_cst (TREE_TYPE (op0), 1);
+  if (!integer_zerop (op1)
+ && !integer_onep (op1)
+ && !integer_all_onesp (op1))
+   return false;
+
+  /* Limit the number of cases we have to consider.  */
+  if (rhs_code == EQ_EXPR)
+   {
+ rhs_code = NE_EXPR;
+ /* OP1 is a constant.  */
+ op1 = fold_unary (TRUTH_NOT_EXPR, TREE_TYPE (op1), op1);
+   }
 }
   else
 {
-  op1 = gimple_assign_rhs2 (stmt);
+  /* Punt on A == B as there is no BIT_XNOR_EXPR.  */
+  if (rhs_code == EQ_EXPR)
+   return false;
 
-  /* Reduce number of cases to handle.  */
-  if (is_gimple_min_invariant (op1))
+  if (TYPE_PRECISION (TREE_TYPE (op1)) != 1)
{
-  /* Exclude anything that should have been already folded.  */
- if (rhs_code != EQ_EXPR
- && rhs_code != NE_EXPR)
+ vr = get_value_range (op1);
+ val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
+ if (!val || !integer_onep (val))
return false;
 
- if (!integer_zerop (op1)
- && !integer_onep (op1)
- && !integer_all_onesp (op1))
+ val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
+ if (!val || !integer_onep (val))
return false;
-
- /* Limit the number of cases we have to consider.  */
- if (rhs_code == EQ_EXPR)
-   {
- rhs_code = NE_EXPR;
- op1 = fold_unary (TRUTH_NOT_EXPR, TREE_TYPE (op1), op1);
-   }
-   }
-  else
-   {
- /* Punt on A == B as there is no BIT_XNOR_EXPR.  */
- if (rhs_code == EQ_EXPR)
-   return false;
-
- if (TYPE_PRECISION (TREE_TYPE (op1)) != 1)
-   {
- vr = get_value_range (op1);
- val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
- if (!val || !integer_onep (val))
-   return false;
-
- val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
- if (!val || !integer_onep (val))
-   return false;
-   }
}
 }
 
@@ -7514,11 +7507,9 @@ simplify_stmt_using_ranges (gimple_stmt_
{
case EQ_EXPR:
case NE_EXPR:
-   case TRUTH_NOT_EXPR:
-  /* Transform EQ_EXPR, NE_EXPR, TRUTH_NOT_EXPR into BIT_XOR_EXPR
-or identity if the RHS is zero or one, and the LHS are known
-to be boolean values.  Transform all TRUTH_*_EXPR into
- BIT_*_EXPR if both arguments are known to be boolean values.  */
+  /* Transform EQ_EXPR, NE_EXPR into BIT_XOR_EXPR 

Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Richard Guenther
On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 05:27 PM, Richard Guenther wrote:
> > On Wed, 27 Jul 2011, Tom de Vries wrote:
> > 
> >> On 07/27/2011 02:12 PM, Richard Guenther wrote:
> >>> On Wed, 27 Jul 2011, Tom de Vries wrote:
> >>>
>  On 07/27/2011 01:50 PM, Tom de Vries wrote:
> > Hi Richard,
> >
> > I have a patch set for bug 43513 - The stack pointer is adjusted twice.
> >
> > 01_pr43513.3.patch
> > 02_pr43513.3.test.patch
> > 03_pr43513.3.mudflap.patch
> >
> > The patch set has been bootstrapped and reg-tested on x86_64.
> >
> > I will send out the patches individually.
> >
> 
>  The patch replaces a vla __builtin_alloca that has a constant argument 
>  with an
>  array declaration.
> 
>  OK for trunk?
> >>>
> >>> I don't think it is safe to try to get at the VLA type the way you do.
> >>
> >> I don't understand in what way it's not safe. Do you mean I don't manage 
> >> to find
> >> the type always, or that I find the wrong type, or something else?
> > 
> > I think you might get the wrong type,
> 
> Ok, I'll review that code one more time.
> 
> > you also do not transform code
> > like
> > 
> >   int *p = alloca(4);
> >   *p = 3;
> > 
> > as there is no array type involved here.
> > 
> 
> I was trying to stay away from non-vla allocas.  A source declared alloca has
> function lifetime, so we could have a single alloca in a loop, called 10 times,
> with all 10 instances live at the same time. This patch does not detect such
> cases, and thus stays away from non-vla allocas. A vla decl does not have such
> problems, the lifetime ends when it goes out of scope.

Yes indeed - that probably would require more detailed analysis.
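
The lifetime difference described in the quoted paragraph can be sketched in plain C. This is only an illustration, not code from the patch: the function names are made up, and __builtin_alloca assumes a GCC-compatible compiler.

```c
#include <assert.h>

/* Illustration only; the function names are hypothetical.
   Assumption: a GCC-compatible compiler, for __builtin_alloca.  */

/* Source-level alloca: each block stays allocated until the function
   returns, so after the first loop all ROUNDS blocks are live at the
   same time and can still be read.  */
int
sum_with_alloca (int rounds)
{
  int *slot[10];
  int total = 0;

  if (rounds > 10)
    rounds = 10;

  for (int r = 0; r < rounds; r++)
    {
      slot[r] = __builtin_alloca (sizeof (int));
      *slot[r] = r;
    }

  for (int r = 0; r < rounds; r++)
    total += *slot[r];
  return total;
}

/* VLA: the lifetime of BUF ends when it goes out of scope at the end
   of each iteration, so at most one instance is live at a time.
   N must be positive.  */
int
sum_with_vla (int rounds, int n)
{
  int total = 0;

  for (int r = 0; r < rounds; r++)
    {
      int buf[n];
      for (int i = 0; i < n; i++)
        buf[i] = r + i;
      for (int i = 0; i < n; i++)
        total += buf[i];
    }
  return total;
}
```

Replacing the allocation in sum_with_alloca with a single fixed array would be wrong, since ten distinct objects are needed; in sum_with_vla one fixed-size array would do whenever n is a known constant.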

> >>> In fact I would simply do sth like
> >>>
> >>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
> >>>   n_elem = size * 8 / BITS_PER_UNIT;
> >>>   array_type = build_array_type_nelts (elem_type, n_elem);
> >>>   var = create_tmp_var (array_type, NULL);
> >>>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
> >>>
> >>
> >> I tried this code on the example, and it works, but the newly declared type has
> >> an 8-bit alignment, while the vla base type has a 32 bit alignment.  This makes
> >> the memory access in the example potentially unaligned, which prohibits an
> >> ivopts optimization, so the resulting text size is 68 instead of the 64 achieved
> >> with my current patch.
> > 
> > Ok, so then set DECL_ALIGN of the variable to something reasonable
> > like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
> > alignment that the targets alloca function would guarantee.
> > 
> 
> I tried that, but that doesn't help. It's the alignment of the type that
> matters, not of the decl.

It shouldn't.  All accesses are performed with the original types and
alignment comes from that (plus the underlying decl).

> So should we try to find the base type of the vla, and use that, or use the
> nonstandard char type?

I don't think we can reliably find the base type of the vla - well,
in practice we may because we control how we lower VLAs during
gimplification, but nothing in the IL constraints say that the
resulting pointer type should be special.

Using a char[] decl shouldn't be a problem IMHO.

> >>> And obviously you lose the optimization we arrange with inserting
> >>> __builtin_stack_save/restore pairs that way - stack space will no
> >>> longer be shared for subsequent VLAs.  Which means that you'd
> >>> better limit the size you allow this promotion.
> >>>
> >>
> >> Right, I could introduce a parameter for this.
> > 
> > I would think you could use PARAM_LARGE_STACK_FRAME for now and say,
> > allow a size of PARAM_LARGE_STACK_FRAME / 10?
> > 
> 
> That unfortunately is too small for the example from the bug report. The default
> value of the param is 250, so that would be a threshold of 25, and the alloca
> size of the example is 40.  Perhaps we can try a threshold of
> PARAM_LARGE_STACK_FRAME - estimated_stack_size or some such?

Hm.  estimated_stack_size is not O(1), so no.  I think we need to
find a sensible way of allowing stack sharing.  Eventually Micha's
patch for introducing points-of-death would help here, if we'd
go for folding this during stack-save/restore optimization.

Richard.


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-28 Thread Uros Bizjak
Hello!

> convert_memory_address_addr_space has a special PLUS/MULT case for
> POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
> for all Pmode != ptr_mode cases.  OK for trunk?

> 2011-06-11  H.J. Lu
>
>        PR middle-end/47727
>        * explow.c (convert_memory_address_addr_space): Permute the
>        conversion and addition if one operand is a constant.

Do we still need this patch? With recent target changes the testcase
from PR can be compiled without problems with a gcc from an unpatched
trunk.

Uros.


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-28 Thread Paolo Bonzini

On 07/28/2011 11:30 AM, Uros Bizjak wrote:

>  convert_memory_address_addr_space has a special PLUS/MULT case for
>  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
>  for all Pmode != ptr_mode cases.  OK for trunk?
>  2011-06-11  H.J. Lu
>
>         PR middle-end/47727
>         * explow.c (convert_memory_address_addr_space): Permute the
>         conversion and addition if one operand is a constant.

Do we still need this patch? With recent target changes the testcase
from PR can be compiled without problems with a gcc from an unpatched
trunk.


Given the communication difficulties, I hope not...

Paolo


Re: [patch tree-optimization]: Remove TRUTH_NOT from vrp and adjust

2011-07-28 Thread Richard Guenther
On Thu, Jul 28, 2011 at 11:21 AM, Kai Tietz  wrote:
> Hello,
>
> this patch removes the handling for TRUTH_NOT_EXPR in tree-vrp.c and adjusts
> some simplifications to be compatible with BIT_NOT_EXPR.
>
> ChangeLog
>
> 2011-07-28  Kai Tietz  
>
>        * tree-vrp.c (simplify_stmt_using_ranges): Remove
>        TRUTH_NOT_EXPR case.
>        (simplify_truth_ops_using_ranges): Likewise.
>        (build_assert_expr_for): Remove TRUTH_NOT_EXPR case
>        and handle BIT_NOT_EXPR for truth-operation.
>        (build_assert_expr_for_1): Likewise.
>
> Bootstrapped and regression-tested for all languages (+ Ada, Obj-C++) on host 
> x86_64-pc-linux-gnu.  Ok for apply?
>
> Regards,
> Kai
>
> Index: gcc-head/gcc/tree-vrp.c
> ===
> --- gcc-head.orig/gcc/tree-vrp.c
> +++ gcc-head/gcc/tree-vrp.c
> @@ -3972,7 +3972,8 @@ build_assert_expr_for (tree cond, tree v
>       tree a = build2 (ASSERT_EXPR, TREE_TYPE (v), v, cond);
>       assertion = gimple_build_assign (n, a);
>     }
> -  else if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
> +  else if (TREE_CODE (cond) == BIT_NOT_EXPR
> +          && TYPE_PRECISION (TREE_TYPE (cond)) == 1)

As said previously at least two times (*sigh*), this is dead code.  Just
remove handling of TRUTH_NOT_EXPR here.

Ok with that change.

Thanks,
Richard.

>     {
>       /* Given !V, build the assignment N = false.  */
>       tree op0 = TREE_OPERAND (cond, 0);
> @@ -4525,7 +4526,8 @@ register_edge_assert_for_1 (tree op, enu
>       retval |= register_edge_assert_for_1 (gimple_assign_rhs2 (op_def),
>                                            code, e, bsi);
>     }
> -  else if (gimple_assign_rhs_code (op_def) == TRUTH_NOT_EXPR)
> +  else if (gimple_assign_rhs_code (op_def) == BIT_NOT_EXPR
> +          && TYPE_PRECISION (TREE_TYPE (gimple_assign_lhs (op_def))) == 1)
>     {
>       /* Recurse, flipping CODE.  */
>       code = invert_tree_comparison (code, false);
> @@ -6754,6 +6756,9 @@ simplify_truth_ops_using_ranges (gimple_
>   bool sop = false;
>   bool need_conversion;
>
> +  /* We handle only !=/== case here.  */
> +  gcc_assert (rhs_code == EQ_EXPR || rhs_code == NE_EXPR);
> +
>   op0 = gimple_assign_rhs1 (stmt);
>   if (TYPE_PRECISION (TREE_TYPE (op0)) != 1)
>     {
> @@ -6770,52 +6775,40 @@ simplify_truth_ops_using_ranges (gimple_
>         return false;
>     }
>
> -  if (rhs_code == TRUTH_NOT_EXPR)
> +  op1 = gimple_assign_rhs2 (stmt);
> +
> +  /* Reduce number of cases to handle.  */
> +  if (is_gimple_min_invariant (op1))
>     {
> -      rhs_code = NE_EXPR;
> -      op1 = build_int_cst (TREE_TYPE (op0), 1);
> +      if (!integer_zerop (op1)
> +         && !integer_onep (op1)
> +         && !integer_all_onesp (op1))
> +       return false;
> +
> +      /* Limit the number of cases we have to consider.  */
> +      if (rhs_code == EQ_EXPR)
> +       {
> +         rhs_code = NE_EXPR;
> +         /* OP1 is a constant.  */
> +         op1 = fold_unary (TRUTH_NOT_EXPR, TREE_TYPE (op1), op1);
> +       }
>     }
>   else
>     {
> -      op1 = gimple_assign_rhs2 (stmt);
> +      /* Punt on A == B as there is no BIT_XNOR_EXPR.  */
> +      if (rhs_code == EQ_EXPR)
> +       return false;
>
> -      /* Reduce number of cases to handle.  */
> -      if (is_gimple_min_invariant (op1))
> +      if (TYPE_PRECISION (TREE_TYPE (op1)) != 1)
>        {
> -          /* Exclude anything that should have been already folded.  */
> -         if (rhs_code != EQ_EXPR
> -             && rhs_code != NE_EXPR)
> +         vr = get_value_range (op1);
> +         val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
> +         if (!val || !integer_onep (val))
>            return false;
>
> -         if (!integer_zerop (op1)
> -             && !integer_onep (op1)
> -             && !integer_all_onesp (op1))
> +         val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
> +         if (!val || !integer_onep (val))
>            return false;
> -
> -         /* Limit the number of cases we have to consider.  */
> -         if (rhs_code == EQ_EXPR)
> -           {
> -             rhs_code = NE_EXPR;
> -             op1 = fold_unary (TRUTH_NOT_EXPR, TREE_TYPE (op1), op1);
> -           }
> -       }
> -      else
> -       {
> -         /* Punt on A == B as there is no BIT_XNOR_EXPR.  */
> -         if (rhs_code == EQ_EXPR)
> -           return false;
> -
> -         if (TYPE_PRECISION (TREE_TYPE (op1)) != 1)
> -           {
> -             vr = get_value_range (op1);
> -             val = compare_range_with_value (GE_EXPR, vr, integer_zero_node, &sop);
> -             if (!val || !integer_onep (val))
> -               return false;
> -
> -             val = compare_range_with_value (LE_EXPR, vr, integer_one_node, &sop);
> -             if (!val || !integer_onep (val))
> -               return false;
> -           }
>        }
>     }
>
> @@ -7514,11 +7507,9 @@ simplify_stmt_using_ranges (gimple_stmt_
> 
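
The simplifications discussed in this thread rest on a few identities that hold for values known to be 0 or 1. The C sketch below is an illustration only (the helper names are made up, and it says nothing about how tree-vrp.c is actually structured); it checks them exhaustively: a bitwise NOT truncated to one bit equals a logical NOT, a comparison against the constants 0 or 1 is either an XOR with 1 or the identity, and A != B is A ^ B. There is no single bitwise operator for A == B, which matches the "no BIT_XNOR_EXPR" comment in the patch.

```c
#include <assert.h>

/* Illustration only; helper names are hypothetical.  */

/* Bitwise NOT of a value with 1-bit precision.  */
unsigned
bit_not_1 (unsigned x)
{
  return ~x & 1u;
}

/* Return 1 if all the boolean identities hold for every 0/1 input.  */
int
boolean_identities_hold (void)
{
  for (unsigned a = 0; a <= 1; a++)
    {
      /* ~a at 1-bit precision is the same as !a.  */
      if (bit_not_1 (a) != (unsigned) !a)
        return 0;
      /* a == 0 and a != 1 are both a ^ 1.  */
      if ((unsigned) (a == 0) != (a ^ 1u))
        return 0;
      if ((unsigned) (a != 1) != (a ^ 1u))
        return 0;
      /* a != 0 and a == 1 are the identity.  */
      if ((unsigned) (a != 0) != a)
        return 0;
      if ((unsigned) (a == 1) != a)
        return 0;
      /* a != b is a ^ b for two boolean operands.  */
      for (unsigned b = 0; b <= 1; b++)
        if ((unsigned) (a != b) != (a ^ b))
          return 0;
    }
  return 1;
}
```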

[PATCH] PR rtl-optimization/49884: get_last_value checks register mode before returning value

2011-07-28 Thread Paulo J. Matos

Hello,

This is an attempt to fix 49884. get_last_value checks the register mode 
before returning any value. If the register was set using a different 
mode than the one we are currently inquiring about then we avoid making 
a wrong guess about the current value and return 0.


PMatos

2011-07-28  Paulo J. Matos  

PR rtl-optimization/49884
* combine.c (get_last_value): For registers whose
value was set using a different mode than the one
we are currently inquiring about we return 0.
diff --git a/gcc/combine.c b/gcc/combine.c
index 4dbf022..5885f2a 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -12746,6 +12746,11 @@ get_last_value (const_rtx x)
   && DF_INSN_LUID (rsp->last_set) >= subst_low_luid)
 return 0;
 
+  /* If the value was set to the register when this was using a different mode
+ then we can't use it. */
+  if(rsp->last_set_mode != GET_MODE(x))
+return 0;
+
   /* If the value has all its registers valid, return it.  */
   if (get_last_value_validate (&value, rsp->last_set, rsp->last_set_label, 0))
 return value;


Re: [v3] Library bits of c++/49813

2011-07-28 Thread Paolo Carlini

Hi Andreas,
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath: In function 'constexpr float std::fma(float, float, float)':
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath:1288:43: sorry, unimplemented: unexpected ast of kind fma_expr
/usr/local/gcc/gcc-20110728/Build/ia64-suse-linux/libstdc++-v3/include/cmath:1288:43: internal compiler error: in potential_constant_expression_1, at cp/semantics.c:8094

In the past we already encountered a few small problems of this kind,
with cases missing from the potential_constant_expression_1 switch. I
believe something quite close to what I'm attaching below should be
enough, can you give it a try?


In any case, we definitely want Jason to have a look as soon as
possible. If you want to restore the ia64 bootstrap in the meantime,
feel free to comment out any troublesome constexpr specifier in that
file (or replace it with inline).


Thanks!
Paolo.

//


Index: semantics.c
===
--- semantics.c (revision 176846)
+++ semantics.c (working copy)
@@ -8057,6 +8057,13 @@ potential_constant_expression_1 (tree t, bool want
  return false;
   return true;
 
+case FMA_EXPR:
+  for (i = 0; i < 3; ++i)
+   if (!potential_constant_expression_1 (TREE_OPERAND (t, i),
+ true, flags))
+ return false;
+  return true;
+
 case COND_EXPR:
 case VEC_COND_EXPR:
   /* If the condition is a known constant, we know which of the legs we


Re: [PATCH, RFC] PR49749 biased reassociation for accumulator patterns

2011-07-28 Thread Richard Guenther
On Wed, Jul 27, 2011 at 5:11 PM, William J. Schmidt
 wrote:
> This is a draft patch that biases the reassociation machinery so that
> each iteration of an accumulator pattern in a loop is independent of the
> other iterations.  This addresses a problem identified as an accidental
> side effect of the bug observed in PR tree-optimization/49749.  This
> patch reverses a substantial performance loss to 410.bwaves in cpu2006.
>
> I've restricted the bias to take place only for phi results that are
> identified as true accumulators within innermost loops.  Currently there
> is no restriction on the size or complexity of the loop, otherwise.
>
> I've bootstrapped and regression-tested this on powerpc64-linux with no
> new failures.  I'm still doing performance runs to assess the results,
> and may still need to tweak this.  It's close, though, and since I have
> upcoming vacation, I wanted to post this for comments now in hopes of
> wrapping this up by the end of the week.  Please let me know what you
> think.

The patch looks sensible to me, so if it shows good results in performance
testing it's ok for trunk with Micha's comments implemented.

Thanks,
Richard.
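
For readers following along, the accumulator pattern the patch is about can be sketched in C as follows. This is an illustration under simplifying assumptions, not the reassociation pass itself: the function names are made up and integer arithmetic is used so that both association orders give exactly the same result. In the first form every addition hangs off the loop-carried sum; in the second, x[i] + y[i] does not depend on the previous iteration, and only the final add sits on the loop-carried chain.

```c
#include <assert.h>
#include <stddef.h>

/* Illustration only; function names are hypothetical.  */

/* Both additions depend on the loop-carried SUM, so each iteration
   has two dependent operations on the critical path.  */
long
acc_chained (const int *x, const int *y, size_t n)
{
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    sum = (sum + x[i]) + y[i];
  return sum;
}

/* The per-iteration work T is independent of SUM; only one addition
   per iteration remains on the loop-carried dependence.  */
long
acc_biased (const int *x, const int *y, size_t n)
{
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    {
      long t = (long) x[i] + y[i];
      sum = sum + t;
    }
  return sum;
}
```

For floating-point accumulators the two orders can round differently, so the compiler may only regroup them when reassociation of floating-point operations has been enabled.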

> Thanks,
> Bill
>
>
> 2011-07-27  Bill Schmidt  
>
>        PR tree-optimization/49749
>        * tree-ssa-reassoc.c (get_rank): Add forward declaration.
>        (PHI_LOOP_BIAS): New macro.
>        (phi_rank): New function.
>        (phi_propagation_rank): Likewise.
>        (propagate_rank): Likewise.
>        (get_rank): Add calls to phi_rank and propagate_rank.
>
> Index: gcc/tree-ssa-reassoc.c
> ===
> --- gcc/tree-ssa-reassoc.c      (revision 176585)
> +++ gcc/tree-ssa-reassoc.c      (working copy)
> @@ -190,7 +190,118 @@ static long *bb_rank;
>  /* Operand->rank hashtable.  */
>  static struct pointer_map_t *operand_rank;
>
> +/* Forward decls.  */
> +static long get_rank (tree);
>
> +
> +/* Bias amount for loop-carried phis.  We want this to be larger than
> +   the depth of any reassociation tree we can see, but not larger than
> +   the rank difference between two blocks.  */
> +#define PHI_LOOP_BIAS (1 << 15)
> +
> +/* Rank assigned to a phi statement.  If STMT is a loop-carried phi of
> +   an innermost loop, and the phi has only a single use which is inside
> +   the loop, then the rank is the block rank of the loop latch plus an
> +   extra bias for the loop-carried dependence.  This causes expressions
> +   calculated into an accumulator variable to be independent for each
> +   iteration of the loop.  If STMT is some other phi, the rank is the
> +   block rank of its containing block.  */
> +static long
> +phi_rank (gimple stmt)
> +{
> +  basic_block bb = gimple_bb (stmt);
> +  struct loop *father = bb->loop_father;
> +  tree res;
> +  unsigned i;
> +  use_operand_p use;
> +  gimple use_stmt;
> +
> +  /* We only care about real loops (those with a latch).  */
> +  if (!father->latch)
> +    return bb_rank[bb->index];
> +
> +  /* Interesting phis must be in headers of innermost loops.  */
> +  if (bb != father->header
> +      || father->inner)
> +    return bb_rank[bb->index];
> +
> +  /* Ignore virtual SSA_NAMEs.  */
> +  res = gimple_phi_result (stmt);
> +  if (!is_gimple_reg (SSA_NAME_VAR (res)))
> +    return bb_rank[bb->index];
> +
> +  /* The phi definition must have a single use, and that use must be
> +     within the loop.  Otherwise this isn't an accumulator pattern.  */
> +  if (!single_imm_use (res, &use, &use_stmt)
> +      || gimple_bb (use_stmt)->loop_father != father)
> +    return bb_rank[bb->index];
> +
> +  /* Look for phi arguments from within the loop.  If found, bias this phi.  */
> +  for (i = 0; i < gimple_phi_num_args (stmt); i++)
> +    {
> +      tree arg = gimple_phi_arg_def (stmt, i);
> +      if (TREE_CODE (arg) == SSA_NAME
> +         && !SSA_NAME_IS_DEFAULT_DEF (arg))
> +       {
> +         gimple def_stmt = SSA_NAME_DEF_STMT (arg);
> +         if (gimple_bb (def_stmt)->loop_father == father)
> +           return bb_rank[father->latch->index] + PHI_LOOP_BIAS;
> +       }
> +    }
> +
> +  /* Must be an uninteresting phi.  */
> +  return bb_rank[bb->index];
> +}
> +
> +/* If EXP is an SSA_NAME defined by a PHI statement that represents a
> +   loop-carried dependence of an innermost loop, return the block rank
> +   of the defining PHI statement.  Otherwise return zero.
> +
> +   The motivation for this is that we can't propagate the biased rank
> +   of the loop-carried phi, as this defeats the purpose of the bias.
> +   However, the rank of a value that depends on the result of a loop-
> +   carried phi should still be higher than the rank of a value that
> +   depends on values from more distant blocks.  */
> +static long
> +phi_propagation_rank (tree exp)
> +{
> +  gimple phi_stmt;
> +  long block_rank;
> +
> +  if (TREE_CODE (exp) != SSA_NAME
> +      || SSA_NAME_IS_DEFAULT_DEF (exp))
> +    return 0;
> +
> +  phi_stmt = SSA_NAME_DEF_S

Re: Mention avx2 patch

2011-07-28 Thread Gerald Pfeifer
On Thu, 28 Jul 2011, Kirill Yukhin wrote:
> Ping

Oh, sure.  I had somehow thought this had been applied already.

Instead of just removing ix86/avx, would you mind moving it to
the "Inactive Development Branches" section?

Gerald


Re: [PATCH 4/6] Shrink-wrapping

2011-07-28 Thread Bernd Schmidt
On 07/21/11 11:52, Richard Sandiford wrote:
> The name "active_insn_after" seems a bit too similar to "next_active_insn"
> for the difference to be obvious.  How about something like
> "first_active_target_insn" instead?

Changed.
>> -  for (; insn; insn = next)
>> +  for (; insn && !ANY_RETURN_P (insn); insn = next)
>>  {
>>if (NONJUMP_INSN_P (insn) && GET_CODE (PATTERN (insn)) == SEQUENCE)
>>  insn = XVECEXP (PATTERN (insn), 0, 0);
> 
> Since ANY_RETURN looks for patterns, while this loop iterates over insns,
> I think it'd be more obvious to have:
> 
>   if (insn && ANY_RETURN_P (insn))
> return 1;
> 
> above the loop instead

That alone wouldn't work since we assign JUMP_LABELs to next. Left alone
for now.

>> --- gcc/jump.c   (revision 176230)
>> +++ gcc/jump.c   (working copy)
>> @@ -1217,7 +1217,7 @@ delete_related_insns (rtx insn)
> 
> Given what you said above, and given that this is a public function,
> I think we should keep the null check.

Changed.
> 
> This pattern came up in reorg.c too, so maybe it would be worth having
> a jump_to_label_p inline function somewhere, such as:

Done. Only has two uses for now though; reorg.c uses different patterns
mostly.

> It looks like the old code tried to allow returns to be redirected
> to a label -- (return) to (set (pc) (label_ref)) -- whereas the new
> code doesn't. [...]
> 
> How about:
> 
>   x = redirect_target (nlabel);
>   if (GET_CODE (x) == LABEL_REF && loc == &PATTERN (insn))
>   x = gen_rtx_SET (VOIDmode, pc_rtx, x);
>   validate_change (insn, loc, x, 1);

Changed, although this probably isn't a useful thing to allow (it will
just add one more unnecessary jump to the code?).

[ifcvt changes]
> I found the placement of this code a bit confusing as things stand.
> new_dest_label is only meaningful if other_bb != new_dest, so it seemed
> like something that should directly replace the existing new_label
> assignment.  It's OK if it makes the shrink-wrap stuff easier though.

Changed.

>> @@ -1195,6 +1195,9 @@ duplicate_insn_chain (rtx from, rtx to)
>>break;
>>  }
>>copy = emit_copy_of_insn_after (insn, get_last_insn ());
>> +  if (JUMP_P (insn) && JUMP_LABEL (insn) != NULL_RTX
>> +  && ANY_RETURN_P (JUMP_LABEL (insn)))
>> +JUMP_LABEL (copy) = JUMP_LABEL (insn);
> 
> I think this should go in emit_copy_of_insn_after instead.

Here I'd like to avoid modifying the existing code in
emit_copy_of_insn_after if possible. Not sure why it's not copying
JUMP_LABELS, but that's something I'd prefer to investigate at some
other time rather than risk breaking things.

>> @@ -2294,6 +2294,8 @@ create_cfi_notes (void)
>>dwarf2out_frame_debug (insn, false);
>>continue;
>>  }
>> +  if (GET_CODE (pat) == ADDR_VEC || GET_CODE (pat) == ADDR_DIFF_VEC)
>> +continue;
>>  
>>if (GET_CODE (pat) == SEQUENCE)
>>  {
> 
> rth better approve this bit...

It went away.

New patch below. Retested on i686-linux and mips64-elf. Ok?


Bernd
* rtlanal.c (tablejump_p): False for returns.
* reorg.c (first_active_target_insn): New static function.
(find_end_label): Set JUMP_LABEL for a new returnjump.
(optimize_skip, get_jump_flags, rare_destination,
mostly_true_jump, get_branch_condition,
steal_delay_list_from_target, own_thread_p,
fill_simple_delay_slots, follow_jumps, fill_slots_from_thread,
fill_eager_delay_slots, relax_delay_slots, make_return_insns,
dbr_schedule): Adjust to handle ret_rtx in JUMP_LABELs.
* jump.c (delete_related_insns): Likewise.
(jump_to_label_p): New function.
(redirect_target): New static function.
(redirect_exp_1): Use it.  Adjust to handle ret_rtx in JUMP_LABELS.
(redirect_jump_1): Assert that the new label is nonnull.
(redirect_jump): Likewise.
(redirect_jump_2): Check for ANY_RETURN_P rather than NULL labels.
* ifcvt.c (find_if_case_1): Take care when redirecting jumps to the
exit block.
(dead_or_predicable): Change NEW_DEST arg to DEST_EDGE.  All callers
changed.  Ensure that the right label is passed to redirect_jump.
* function.c (emit_return_into_block,
thread_prologue_and_epilogue_insns): Ensure new returnjumps have
ret_rtx in their JUMP_LABEL.
* print-rtl.c (print_rtx): Handle ret_rtx in a JUMP_LABEL.
* emit-rtl.c (skip_consecutive_labels): Allow the caller to
pass ret_rtx as label.
* cfglayout.c (fixup_reorder_chain): Use
force_nonfallthru_and_redirect rather than force_nonfallthru.
(duplicate_insn_chain): Copy JUMP_LABELs for returns.
* rtl.h (ANY_RETURN_P): New macro.
(jump_to_label_p): Declare.
* resource.c (find_dead_or_set_registers): Handle ret_rtx in
JUMP_LABELs.
(mark_target_live_regs): Likewise.
* basic-block.h (force_nonfallthru_an

[Revert,Committed,AVR]: Undo r176835

2011-07-28 Thread Georg-Johann Lay
The two patches

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02424.html (PR target/49313)
committed as r176835

and

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02391.html (PR target/49687)
committed as r176862

are incompatible.

The reason is that the first contains expanders that emit

(set (reg:SI)
 (zero_extend:SI (reg:HI 24)))

which are not allowed as of the second patch which disallows hard regs
before reload in sign- and zero-extend.

I reverted r176835 and postpone the changes until there is a proper
solution for representing implicit library calls and fiddling around
with hard regs, i.e. there are register classes/constraints to express
this.

Preferring widening multiply over rarely used built-ins,
I reverted the first change:

http://gcc.gnu.org/viewcvs?view=revision&revision=176865

Sorry for the inconvenience.

Johann
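
As a reading aid for the libgcc.S hunks in the revert below, here is a rough C model of what the restored routines compute. It is a sketch under assumptions, not a translation of the assembly: all function names are invented, and register usage and clobbers are ignored. The restored __ctzsi2 calls __ffssi2, decrements the result, and loads 32 when bit 7 of the decremented value is set (sbrc skips the ldi when the bit is clear), so ctz(0) comes out as 32. The restored __popcounthi2 adds the byte-wise population counts of the low and high bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only; all names here are hypothetical.  */

/* Find first set: 1-based index of the lowest set bit, 0 for x == 0.  */
unsigned
ffs32_model (uint32_t x)
{
  unsigned n = 1;

  if (x == 0)
    return 0;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Model of the restored __ctzsi2: ffs, decrement, then fix up the
   wrapped-around result for a zero input.  */
unsigned
ctz32_model (uint32_t x)
{
  uint8_t r = (uint8_t) (ffs32_model (x) - 1u);  /* dec  r24     */

  if (r & 0x80u)                                 /* sbrc r24, 7  */
    r = 32;                                      /* ldi  r24, 32 */
  return r;
}

/* Population count of a single byte.  */
unsigned
popcount8_model (uint8_t x)
{
  unsigned n = 0;

  while (x)
    {
      n += x & 1u;
      x >>= 1;
    }
  return n;
}

/* Model of the restored __popcounthi2: sum of the two byte counts.  */
unsigned
popcount16_model (uint16_t x)
{
  unsigned lo = popcount8_model ((uint8_t) (x & 0xFFu));
  unsigned hi = popcount8_model ((uint8_t) (x >> 8));

  return lo + hi;
}
```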

PR target/49313
Undo r176835 from trunk
2011-07-27  Georg-Johann Lay


Index: config/avr/libgcc.S
===
--- config/avr/libgcc.S (revision 176864)
+++ config/avr/libgcc.S (working copy)
@@ -1074,15 +1074,9 @@ ENDF __ffssi2
 ;; clobbers: r26
 DEFUN __ffshi2
 clr  r26
-#ifdef __AVR_HAVE_JMP_CALL__
-;; Some cores have problem skipping 2-word instruction
-tst  r24
-breq 2f
-#else
 cpse r24, __zero_reg__
-#endif /* __AVR_HAVE_JMP_CALL__ */
 1:  XJMP __loop_ffsqi2
-2:  ldi  r26, 8
+ldi  r26, 8
 or   r24, r25
 brne 1b
 ret
@@ -1112,12 +1106,12 @@ ENDF __loop_ffsqi2
 #if defined (L_ctzsi2)
 ;; count trailing zeros
 ;; r25:r24 = ctz32 (r25:r22)
-;; clobbers: r26, r22
-;; ctz(0) = 255
-;; Note that ctz(0) in undefined for GCC
+;; ctz(0) = 32
 DEFUN __ctzsi2
 XCALL __ffssi2
 dec  r24
+sbrc r24, 7
+ldi  r24, 32
 ret
 ENDF __ctzsi2
 #endif /* defined (L_ctzsi2) */
@@ -1125,12 +1119,12 @@ ENDF __ctzsi2
 #if defined (L_ctzhi2)
 ;; count trailing zeros
 ;; r25:r24 = ctz16 (r25:r24)
-;; clobbers: r26
-;; ctz(0) = 255
-;; Note that ctz(0) in undefined for GCC
+;; ctz(0) = 16
 DEFUN __ctzhi2
 XCALL __ffshi2
 dec  r24
+sbrc r24, 7
+ldi  r24, 16
 ret
 ENDF __ctzhi2
 #endif /* defined (L_ctzhi2) */
@@ -1264,50 +1258,47 @@ ENDF __parityqi2
 #if defined (L_popcounthi2)
 ;; population count
 ;; r25:r24 = popcount16 (r25:r24)
-;; clobbers: __tmp_reg__
+;; clobbers: r30, __tmp_reg__
 DEFUN __popcounthi2
 XCALL __popcountqi2
-push r24
+mov  r30, r24
 mov  r24, r25
 XCALL __popcountqi2
+add  r24, r30
 clr  r25
-;; FALLTHRU
-ENDF __popcounthi2
-
-DEFUN __popcounthi2_tail
-pop   __tmp_reg__
-add   r24, __tmp_reg__
 ret
-ENDF __popcounthi2_tail
+ENDF __popcounthi2
 #endif /* defined (L_popcounthi2) */

 #if defined (L_popcountsi2)
 ;; population count
 ;; r25:r24 = popcount32 (r25:r22)
-;; clobbers: __tmp_reg__
+;; clobbers: r26, r30, __tmp_reg__
 DEFUN __popcountsi2
 XCALL __popcounthi2
-push  r24
+mov   r26, r24
 mov_l r24, r22
 mov_h r25, r23
 XCALL __popcounthi2
-XJMP  __popcounthi2_tail
+add   r24, r26
+ret
 ENDF __popcountsi2
 #endif /* defined (L_popcountsi2) */

 #if defined (L_popcountdi2)
 ;; population count
 ;; r25:r24 = popcount64 (r25:r18)
-;; clobbers: r22, r23, __tmp_reg__
+;; clobbers: r22, r23, r26, r27, r30, __tmp_reg__
 DEFUN __popcountdi2
 XCALL __popcountsi2
-push  r24
+mov   r27, r24
 mov_l r22, r18
 mov_h r23, r19
 mov_l r24, r20
 mov_h r25, r21
 XCALL __popcountsi2
-XJMP  __popcounthi2_tail
+add   r24, r27
+ret
 ENDF __popcountdi2
 #endif /* defined (L_popcountdi2) */

Index: config/avr/avr.md
===
--- config/avr/avr.md   (revision 176864)
+++ config/avr/avr.md   (working copy)
@@ -55,7 +55,6 @@ (define_c_enum "unspec"
UNSPEC_FMUL
UNSPEC_FMULS
UNSPEC_FMULSU
-   UNSPEC_COPYSIGN
])

 (define_c_enum "unspecv"
@@ -3942,275 +3941,6 @@ (define_insn "delay_cycles_4"
   [(set_attr "length" "9")
(set_attr "cc" "clobber")])

-
-;; Parity
-
-(define_expand "parityhi2"
-  [(set (reg:HI 24)
-(match_operand:HI 1 "register_operand" ""))
-   (set (reg:HI 24)
-(parity:HI (reg:HI 24)))
-   (set (match_operand:HI 0 "register_operand" "")
-(reg:HI 24))]
-  ""
-  "")
-
-(define_expand "paritysi2"
-  [(set (reg:SI 22)
-(match_operand:SI 1 "register_operand" ""))
-   (set (reg:HI 24)
-(parity:HI (reg:SI 22)))
-   (set (match_operand:SI 0 "register_operand" "")
-(zero_extend:SI (reg:HI 24)))]
-  ""
-  "")
-
-(define_insn "*parityhi2.libgcc"
-  [(set (reg:HI 24)
-(parity:HI (reg:HI 24)))]
-  ""
-  "%~call __parityhi2"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
-
-(define_insn "*parityqihi2.libgcc"
-  [(set (reg:HI 24)
-(parity:HI (reg:QI 24)))]
-  ""
-  "%~call __parityqi2"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
-
-(define_insn "*paritysi

[MELT] split_string_* functions now take a value

2011-07-28 Thread Romain Geissler
Hi,

I changed the argument type for the cs argument in split_string functions.
Indeed, there is no way to access the boxed string from a value, and sometimes
we can't avoid working with boxed strings. As we can box a :cstring in a value,
split_string_* functions are usable in all cases.

Find the patch attached. Please note that it is a git patch, thus it should be
applied with git-apply or git-am.

Romain.


0001-split_string_-functions-now-require-a-value-string-i.Changelog
Description: Binary data
From 41290f00ee0cb6bdeaf2254b0d6a25ebddf23a65 Mon Sep 17 00:00:00 2001
From: Romain Geissler 
Date: Thu, 28 Jul 2011 12:43:59 +0200
Subject: [PATCH] split_string_* functions now require a value string instead
 of a :cstring.

---
 gcc/melt/warmelt-base.melt   |   16 
 gcc/melt/warmelt-outobj.melt |4 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/melt/warmelt-base.melt b/gcc/melt/warmelt-base.melt
index 8a6ff9d..c7eca3e 100644
--- a/gcc/melt/warmelt-base.melt
+++ b/gcc/melt/warmelt-base.melt
@@ -570,25 +570,25 @@ an integer $I if $I is lower than $N.}#
   #{(meltgc_string_hex_md5sum_file_sequence ((melt_ptr_t) $PATHSEQ))}#)
 
 
-(defprimitive split_string_space (dis :cstring cs) :value
+(defprimitive split_string_space (dis cs) :value
   :doc #{Split a cstring $CS into a list of space separated strings of
 discriminant $DIS.}#
-#{meltgc_new_split_string($cs, ' ', (melt_ptr_t) $dis)}#)
+#{meltgc_new_split_string(melt_string_str($cs), ' ', (melt_ptr_t) $dis)}#)
 
-(defprimitive split_string_comma (dis :cstring cs) :value
+(defprimitive split_string_comma (dis cs) :value
   :doc #{Split a cstring $CS into a list of comma separated strings of
 discriminant $DIS.}#
-#{meltgc_new_split_string($cs, ',', (melt_ptr_t) $dis)}#)
+#{meltgc_new_split_string(melt_string_str($cs), ',', (melt_ptr_t) $dis)}#)
 
-(defprimitive split_string_colon (dis :cstring cs) :value
+(defprimitive split_string_colon (dis cs) :value
   :doc #{Split a cstring $CS into a list of colon separated strings of
 discriminant $DIS.}#
-#{meltgc_new_split_string($cs, ':', (melt_ptr_t)$dis)}#)
+#{meltgc_new_split_string(melt_string_str($cs), ':', (melt_ptr_t)$dis)}#)
 
-(defprimitive split_string_equal (dis :cstring cs) :value
+(defprimitive split_string_equal (dis cs) :value
   :doc #{Split a cstring $CS into a list of equal separated strings of
 discriminant $DIS.}#
-#{meltgc_new_split_string($cs, '=', (melt_ptr_t)$dis)}#)
+#{meltgc_new_split_string(melt_string_str($cs), '=', (melt_ptr_t)$dis)}#)
 
 ;;; convert a strbuf into a string
 (defprimitive strbuf2string (dis sbuf) :value
diff --git a/gcc/melt/warmelt-outobj.melt b/gcc/melt/warmelt-outobj.melt
index dd1cdde..df4cf06 100644
--- a/gcc/melt/warmelt-outobj.melt
+++ b/gcc/melt/warmelt-outobj.melt
@@ -4671,7 +4671,7 @@ has basic debug support thru debug_msg, assert_msg..."
 	 (inarg (cond ( progarg 
 			(make_stringconst discr_string progarg))
 		  ( progarglist
-			 (split_string_comma discr_string progarglist)
+			 (split_string_comma discr_string (make_stringconst discr_string progarglist))
 			)
 		  (:else
 		   (errormsg_plain "invalid arg or arglist to translateinit mode")
@@ -5800,7 +5800,7 @@ has basic debug support thru debug_msg, assert_msg..."
   (let ( 
 	(parmodenv (parent_module_environment))
 	(curenv (if moduldata moduldata initial_environment))
-	(arglist (split_string_comma discr_string (melt_argument "arglist")))
+	(arglist (split_string_comma discr_string (make_stringconst discr_string (melt_argument "arglist"
 	(outarg (make_stringconst discr_string (melt_argument "output")))
 	(rlist (make_list discr_list))
 	(mdinfo 
-- 
1.7.6



Re: [PATCH 4/6] Shrink-wrapping

2011-07-28 Thread Richard Sandiford
Bernd Schmidt  writes:
>>> -  for (; insn; insn = next)
>>> +  for (; insn && !ANY_RETURN_P (insn); insn = next)
>>>  {
>>>if (NONJUMP_INSN_P (insn) && GET_CODE (PATTERN (insn)) == SEQUENCE)
>>> insn = XVECEXP (PATTERN (insn), 0, 0);
>> 
>> Since ANY_RETURN looks for patterns, while this loop iterates over insns,
>> I think it'd be more obvious to have:
>> 
>>   if (insn && ANY_RETURN_P (insn))
>> return 1;
>> 
>> above the loop instead
>
> That alone wouldn't work since we assign JUMP_LABELs to next.

Doh

> Left alone for now.

OK.

>> This pattern came up in reorg.c too, so maybe it would be worth having
>> a jump_to_label_p inline function somewhere, such as:
>
> Done. Only has two uses for now though; reorg.c uses different patterns
> mostly.

There are a few other natural uses too though (below).

>>> @@ -1195,6 +1195,9 @@ duplicate_insn_chain (rtx from, rtx to)
>>>   break;
>>> }
>>>   copy = emit_copy_of_insn_after (insn, get_last_insn ());
>>> + if (JUMP_P (insn) && JUMP_LABEL (insn) != NULL_RTX
>>> + && ANY_RETURN_P (JUMP_LABEL (insn)))
>>> +   JUMP_LABEL (copy) = JUMP_LABEL (insn);
>> 
>> I think this should go in emit_copy_of_insn_after instead.
>
> Here I'd like to avoid modifying the existing code in
> emit_copy_of_insn_after if possible. Not sure why it's not copying
> JUMP_LABELS, but that's something I'd prefer to investigate at some
> other time rather than risk breaking things.

OK.

> New patch below. Retested on i686-linux and mips64-elf. Ok?

Looks good to me, thanks.  OK with:

> @@ -2757,7 +2770,8 @@ fill_slots_from_thread (rtx insn, rtx co
> gcc_assert (REG_NOTE_KIND (note)
> == REG_LABEL_OPERAND);
> }
> -   if (JUMP_P (trial) && JUMP_LABEL (trial))
> +   if (JUMP_P (trial) && JUMP_LABEL (trial)
> +   && !ANY_RETURN_P (JUMP_LABEL (trial)))
>   LABEL_NUSES (JUMP_LABEL (trial))++;

jump_to_label_p here.

> @@ -2776,7 +2790,8 @@ fill_slots_from_thread (rtx insn, rtx co
> gcc_assert (REG_NOTE_KIND (note)
> == REG_LABEL_OPERAND);
> }
> -   if (JUMP_P (trial) && JUMP_LABEL (trial))
> +   if (JUMP_P (trial) && JUMP_LABEL (trial)
> +   && !ANY_RETURN_P (JUMP_LABEL (trial)))
>   LABEL_NUSES (JUMP_LABEL (trial))--;

and here.

> Index: gcc/config/sh/sh.c
> ===
> --- gcc/config/sh/sh.c(revision 176838)
> +++ gcc/config/sh/sh.c(working copy)
> @@ -5276,7 +5276,8 @@ barrier_align (rtx barrier_or_label)
>   }
>if (prev
> && JUMP_P (prev)
> -   && JUMP_LABEL (prev))
> +   && JUMP_LABEL (prev) != NULL_RTX
> +   && !ANY_RETURN_P (JUMP_LABEL (prev)))
>   {
> rtx x;
> if (jump_to_next

and here.

> Index: gcc/config/arm/arm.c
> ===
> --- gcc/config/arm/arm.c  (revision 176838)
> +++ gcc/config/arm/arm.c  (working copy)
> @@ -11479,6 +11479,7 @@ is_jump_table (rtx insn)
>  
>if (GET_CODE (insn) == JUMP_INSN
>&& JUMP_LABEL (insn) != NULL
> +  && !ANY_RETURN_P (JUMP_LABEL (insn))
>&& ((table = next_real_insn (JUMP_LABEL (insn)))
> == next_real_insn (insn))

and here.

Richard
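
For illustration, the jump_to_label_p helper discussed above can be modelled
outside GCC.  This is only a minimal sketch under stated assumptions, not the
code from the patch: toy_rtx, the toy_code enum and toy_any_return_p are
hypothetical stand-ins for rtx, the RTL codes and ANY_RETURN_P.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified RTL codes.  */
enum toy_code
{
  TOY_INSN, TOY_JUMP_INSN, TOY_CODE_LABEL, TOY_RETURN, TOY_SIMPLE_RETURN
};

/* Hypothetical stand-in for an rtx: a code, an optional jump target,
   and a use count that only matters for labels.  */
struct toy_rtx
{
  enum toy_code code;
  struct toy_rtx *jump_label;
  int label_nuses;
};

/* Models ANY_RETURN_P: true for the return patterns.  */
static bool
toy_any_return_p (const struct toy_rtx *x)
{
  return x->code == TOY_RETURN || x->code == TOY_SIMPLE_RETURN;
}

/* True only if INSN is a jump whose target is a real label, i.e. the
   target is neither missing nor one of the return patterns.  */
static bool
toy_jump_to_label_p (const struct toy_rtx *insn)
{
  assert (insn != NULL);
  return (insn->code == TOY_JUMP_INSN
          && insn->jump_label != NULL
          && !toy_any_return_p (insn->jump_label));
}

/* Models the LABEL_NUSES update: count a use only for real labels.  */
static void
toy_count_label_use (struct toy_rtx *insn)
{
  if (toy_jump_to_label_p (insn))
    insn->jump_label->label_nuses++;
}
```

With a single predicate, each of the call sites quoted above reduces to one
test instead of repeating the three-part condition.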


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Richard Guenther
On Wed, Jul 27, 2011 at 5:03 PM, Richard Guenther
 wrote:
> On Wed, Jul 27, 2011 at 4:56 PM, Richard Guenther
>  wrote:
>> On Wed, Jul 27, 2011 at 4:52 PM, Richard Guenther
>>  wrote:
>>> On Tue, Jul 26, 2011 at 7:38 PM, Jason Merrill  wrote:
>>>> On 07/26/2011 10:32 AM, Aldy Hernandez wrote:
>>>>>
>>>>>> I think the adjustment above is intended to match the adjustment of the
>>>>>> address by bitregion_start/BITS_PER_UNIT, but the above seems to assume
>>>>>> that bitregion_start%BITS_PER_UNIT == 0.
>>>>>
>>>>> That was intentional. bitregion_start always falls on a byte boundary,
>>>>> does it not?
>>>>
>>>> Ah, yes, of course, it's bitnum that might not.  The code changes look good,
>>>> then.
>>>
>>> Looks like this was an approval ...
>>>
>>> Anyway, I don't think a --param is appropriate to control a flag whether
>>> to allow store data-races to be created.  Why not use a regular option 
>>> instead?
>>>
>>> I believe that any after-the-fact attempt to recover bitfield boundaries is
>>> going to fail unless you preserve more information during bitfield layout.
>>>
>>> Consider
>>>
>>> struct {
>>>  char : 8;
>>>  char : 0;
>>>  char : 8;
>>> };
>>>
>>> where the : 0 isn't preserved in any way and you can't distinguish
>>> it from struct { char : 8; char : 8; }.
>>
>> Oh, and
>>
>>   INNERDECL is the actual object being referenced.
>>
>>      || (!ptr_deref_may_alias_global_p (innerdecl)
>>
>> is surely not what you want.  That asks if *innerdecl is global memory.
>> I suppose you want is_global_var (innerdecl)?  But with
>>
>>          && (DECL_THREAD_LOCAL_P (innerdecl)
>>              || !TREE_STATIC (innerdecl
>>
>> you can simply skip this test.  Or what was it supposed to do?
>
> And
>
>      t = build3 (COMPONENT_REF, TREE_TYPE (exp),
>                  unshare_expr (TREE_OPERAND (exp, 0)),
>                  fld, NULL_TREE);
>      get_inner_reference (t, &bitsize, &bitpos, &offset,
>                           &mode, &unsignedp, &volatilep, true);
>
> for each field of a struct type is of course ... gross!  In fact you already
> have the FIELD_DECL in the single caller!  Yes I know there is not
> enough information preserved by bitfield layout - see my previous reply.

Looking at the C++ memory model what you need is indeed simple enough
to recover here.  Still this loop does quadratic work for a struct with
N bitfield members and a function which stores into all of them.
And that with a big constant factor as you build a component-ref
and even unshare trees (which isn't necessary here anyway).  In fact
you could easily manually keep track of bitpos when walking adjacent
bitfield members.  An initial call to get_inner_reference on
TREE_OPERAND (exp, 0) would give you the starting position of the record.

That would still be quadratic of course.

For bitfield lowering I'd like to preserve a way to get from a field-decl to
the first field-decl of a group of bitfield members that occupy an aligned
amount of storage (as place_field assigns it).  That wouldn't necessarily
match the first bitfield field in the C++ bitfield group sense but would
probably be sensible enough for conforming accesses (and you'd only
need to search forward from that first field looking for a zero-size
field).  Now, the question is of course what to do for DECL_PACKED
fields (I suppose, simply ignore the C++ memory model as C++ doesn't
have a notion of packed or specially (mis-)aligned structs or bitfields).

Richard.
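
As a rough sketch of the "walk adjacent bitfield members" idea above: in the
C++ memory model a memory location is a maximal run of adjacent bit-fields
with non-zero width, so the bit region for a store can be found by walking
outwards from the field until a zero-width field or a non-bitfield is hit.
The toy_field struct and toy_bit_region below are hypothetical and assume the
bit positions have already been computed by layout; this is not GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a FIELD_DECL.  */
struct toy_field
{
  bool is_bitfield;   /* declared with a width */
  unsigned bitpos;    /* bit offset from the start of the record */
  unsigned bitsize;   /* width in bits; 0 models "type : 0" */
};

/* True if F can be part of a bit region: a bit-field of non-zero width.  */
static bool
toy_in_group_p (const struct toy_field *f)
{
  return f->is_bitfield && f->bitsize != 0;
}

/* Compute the half-open bit range [*START, *END) covered by the group of
   adjacent non-zero-width bit-fields that contains FIELDS[IDX].  */
static void
toy_bit_region (const struct toy_field *fields, size_t n, size_t idx,
                unsigned *start, unsigned *end)
{
  size_t first = idx, last = idx;

  assert (idx < n && toy_in_group_p (&fields[idx]));

  /* Walk backwards to the first member of the group.  */
  while (first > 0 && toy_in_group_p (&fields[first - 1]))
    first--;
  /* Walk forwards to the last member of the group.  */
  while (last + 1 < n && toy_in_group_p (&fields[last + 1]))
    last++;

  *start = fields[first].bitpos;
  *end = fields[last].bitpos + fields[last].bitsize;
}
```

Each query is linear in the size of the group, so doing this for every store
into an N-member group is still quadratic overall, as noted above; caching
the first field of the group would avoid the repeated backward walk.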


[PATCH] unbreak attribute((optimize(...))) on m68k (PR47908)

2011-07-28 Thread Mikael Pettersson
On m68k-linux, gcc ICEs on any occurrence of attribute((optimize(...)))
ever since gcc-4.4.  Default optimization flags enable scheduling, which
the backend doesn't support for non-ColdFire targets.  The backend disables
scheduling for non-ColdFire targets when processing command-line options,
but fails to do so for the attribute optimize case, causing the ICE.

The fix for 4.6/4.7 is to hook TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE and
disable scheduling for non-ColdFire CPUs.

Tested on 4.6/4.7, no regressions but several test cases (see PR47908)
that previously failed now succeed.

Ok for trunk and 4.6?

(4.5 and 4.4 need a slightly different fix which I can provide if needed.)

/Mikael


gcc/

2011-07-28  Mikael Pettersson  

PR target/47908
* config/m68k/m68k.c (m68k_override_options_after_change): New function.
Disable instruction scheduling for non-ColdFire targets.
(TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): Define.

--- gcc-4.7-20110716/gcc/config/m68k/m68k.c.~1~ 2011-06-16 15:45:47.0 +0200
+++ gcc-4.7-20110716/gcc/config/m68k/m68k.c 2011-07-28 11:26:57.0 +0200
@@ -136,6 +136,7 @@ static bool m68k_can_eliminate (const in
 static void m68k_conditional_register_usage (void);
 static bool m68k_legitimate_address_p (enum machine_mode, rtx, bool);
 static void m68k_option_override (void);
+static void m68k_override_options_after_change (void);
 static rtx find_addr_reg (rtx);
 static const char *singlemove_string (rtx *);
 static void m68k_output_mi_thunk (FILE *, tree, HOST_WIDE_INT,
@@ -235,6 +236,9 @@ static bool m68k_cannot_force_const_mem 
 #undef TARGET_OPTION_OVERRIDE
 #define TARGET_OPTION_OVERRIDE m68k_option_override
 
+#undef TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE
+#define TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE m68k_override_options_after_change
+
 #undef TARGET_RTX_COSTS
 #define TARGET_RTX_COSTS m68k_rtx_costs
 
@@ -634,6 +638,17 @@ m68k_option_override (void)
 }
 }
 
+static void
+m68k_override_options_after_change (void)
+{
+  if (m68k_sched_cpu == CPU_UNKNOWN)
+{
+  flag_schedule_insns = 0;
+  flag_schedule_insns_after_reload = 0;
+  flag_modulo_sched = 0;
+}
+}
+
 /* Generate a macro of the form __mPREFIX_cpu_NAME, where PREFIX is the
given argument and NAME is the argument passed to -mcpu.  Return NULL
if -mcpu was not passed.  */


Re: [Patch ARM] Rejig constraint order in *movdf_vfp and *thumb2_movdf_vfp patterns.

2011-07-28 Thread Ramana Radhakrishnan



 Ok to commit ?


Richard was ok with this offline and the tests showed no regressions. So 
I've committed this today.


Ramana


Re: [PATCH] PR49799: Don't generate illegal bit field extraction instruction

2011-07-28 Thread Richard Earnshaw
On 28/07/11 09:40, Carrot Wei wrote:
> Index: pr49799.c
> ===
> --- pr49799.c (revision 0)
> +++ pr49799.c (revision 0)
> @@ -0,0 +1,25 @@
> +/* PR rtl-optimization/49799 */
> +/* { dg-do assemble } */
> +/* { dg-options "-O2 -w -march=armv7-a" } */

No, don't force the architecture like this.  Just let multilib variant
handling deal with it.

Once you've done that, then this test isn't cpu specific and can be
moved to c-torture/compile.

R.



Re: PATCH: PR target/47364: [x32] internal compiler error: in emit_move_insn, at expr.c:3355

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:30 AM, Uros Bizjak  wrote:

>> We should only expand strlen to Pmode.  Otherwise, we got
>>
>> [hjl@gnu-6 ilp32-38]$ cat x.i
>> char one[50] = "ijk";
>> int
>> main (void)
>> {
>>  return __builtin_strlen (one) != 3;
>> }
>> [hjl@gnu-6 ilp32-38]$ /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc 
>> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -S -o x.s -mx32 -O2 x.i
>> x.i: In function ‘main’:
>> x.i:5:27: internal compiler error: in emit_move_insn, at expr.c:
>> Please submit a full bug report,
>> with preprocessed source if appropriate.
>> See  for instructions.
>>
>> OK for trunk?
>>
>> 2011-07-27  H.J. Lu  
>>
>>        PR target/47364
>>        * config/i386/i386.md (strlen): Replace SWI48x with P.
>
> OK.

Please also backport this fix to release branches.

Thanks,
Uros.


Re: PATCH: PR target/47364: [x32] internal compiler error: in emit_move_insn, at expr.c:3355

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 5:44 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:30 AM, Uros Bizjak  wrote:
>
>>> We should only expand strlen to Pmode.  Otherwise, we got
>>>
>>> [hjl@gnu-6 ilp32-38]$ cat x.i
>>> char one[50] = "ijk";
>>> int
>>> main (void)
>>> {
>>>  return __builtin_strlen (one) != 3;
>>> }
>>> [hjl@gnu-6 ilp32-38]$ /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc 
>>> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -S -o x.s -mx32 -O2 x.i
>>> x.i: In function ‘main’:
>>> x.i:5:27: internal compiler error: in emit_move_insn, at expr.c:
>>> Please submit a full bug report,
>>> with preprocessed source if appropriate.
>>> See  for instructions.
>>>
>>> OK for trunk?
>>>
>>> 2011-07-27  H.J. Lu  
>>>
>>>        PR target/47364
>>>        * config/i386/i386.md (strlen): Replace SWI48x with P.
>>
>> OK.
>
> Please also backport this fix to release branches.
>

I checked it into GCC 4.6. GCC 4.5 is very different and I didn't change it.

Thanks.


-- 
H.J.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 12:45 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 5:11 AM, H.J. Lu  wrote:
>
>> In x32, thread pointer is 32bit and choice of segment register for the
>> thread base ptr load should be based on TARGET_64BIT.  This patch
>> implements it.  OK for trunk?
>
> -ENOTESTCASE.
>

There is no standalone testcase.  The symptom is in glibc build, I
got

CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
 -E -x c-header'
/export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
--library-path 
/export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
../scripts -h rpcsvc/yppasswd.x -o
/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
make[5]: *** 
[/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
Segmentation fault
make[5]: *** Waiting for unfinished jobs
make[5]: *** [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
Segmentation fault
make[5]: *** 
[/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
Segmentation fault

since thread pointer is 32bit in x32.


-- 
H.J.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 6:08 AM, H.J. Lu  wrote:
> On Thu, Jul 28, 2011 at 12:45 AM, Uros Bizjak  wrote:
>> On Thu, Jul 28, 2011 at 5:11 AM, H.J. Lu  wrote:
>>
>>> In x32, thread pointer is 32bit and choice of segment register for the
>>> thread base ptr load should be based on TARGET_64BIT.  This patch
>>> implements it.  OK for trunk?
>>
>> -ENOTESTCASE.
>>
>
> There is no standalone testcase.  The symptom is in glibc build, I
> got
>
> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>  -E -x c-header'
> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
> --library-path 
> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
> ../scripts -h rpcsvc/yppasswd.x -o
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
> Segmentation fault
> make[5]: *** Waiting for unfinished jobs
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
> Segmentation fault
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
> Segmentation fault
>
> since thread pointer is 32bit in x32.
>

If we load the thread pointer (via the fs segment register) in x32 with a
64bit load, the upper 32 bits are garbage.  We must use a 32bit load.
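
The effect can be modelled in portable C.  This is only an illustrative
sketch, not the compiler change itself: the toy_* helpers are hypothetical,
and the 8-byte slot stands in for the memory at the thread pointer location,
where only the first 4 bytes hold the 32bit pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill an 8-byte slot with a 32-bit thread pointer value followed by
   4 unrelated bytes, as a 32-bit TCB field would be laid out.  */
static void
toy_fill_slot (unsigned char slot[8], uint32_t tp, uint32_t junk)
{
  memcpy (slot, &tp, 4);
  memcpy (slot + 4, &junk, 4);
}

/* Models a 64-bit load: all 8 bytes end up in the result.  */
static uint64_t
toy_load64 (const unsigned char *p)
{
  uint64_t v;
  assert (p != NULL);
  memcpy (&v, p, sizeof v);
  return v;
}

/* Models a 32-bit load into a 64-bit register: only 4 bytes are read
   and the upper half of the result is zero.  */
static uint64_t
toy_load32_zext (const unsigned char *p)
{
  uint32_t v;
  assert (p != NULL);
  memcpy (&v, p, sizeof v);
  return (uint64_t) v;
}
```

For a slot filled with toy_fill_slot (slot, tp, junk) and non-zero junk,
toy_load32_zext returns tp while toy_load64 returns a value that also
contains the junk bytes, which is the kind of bad pointer that would explain
the segmentation faults in the log above.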

-- 
H.J.


Re: Support -mcpu=native on Tru64 UNIX

2011-07-28 Thread Rainer Orth
Richard Henderson  writes:

> On 07/27/2011 04:57 AM, Rainer Orth wrote:
>> The following patch does so for -mcpu=native/-mtune=native on Tru64
>> UNIX, using getsysinfo(2).  A non-bootstrap C-only build is currently
>> running, the options above work as expected.
>
> I hadn't realized that the =native detection wasn't being done
> via __builtin_implver and __builtin_amask.  Seems to me that
> we should just use that and eliminate all the OS-specific stuff.

I wasn't aware of that, especially given that similar instructions on
MIPS and SPARC are privileged.  The following code has been tested in a
minimal program and seems to work fine; a full alpha-dec-osf5.1b
bootstrap is currently running.

Ok for mainline if that passes?

Thanks.
Rainer


2011-07-26  Rainer Orth  

* config/alpha/driver-alpha.c (IMPLVER_EV4_FAMILY,
IMPLVER_EV5_FAMILY, IMPLVER_EV6_FAMILY, IMPLVER_EV7_FAMILY): Define.
(AMASK_BWX, AMASK_FIX, AMASK_CIX, AMASK_MVI, AMASK_PRECISE,
AMASK_LOCKPFTCHOK): Define.
(host_detect_local_cpu): Remove buf, f, cpu_names.
Define cpu_types, implver, amask.
Use __builtin_alpha_implver, __builtin_alpha_amask to determine
native CPU.
* config.host: Also use driver-alpha.o, alpha/x-alpha on
alpha*-dec-osf*.
* config/alpha/osf5.h [__alpha__ || __alpha]
(host_detect_local_cpu): Declare.
(EXTRA_SPEC_FUNCTIONS, MCPU_MTUNE_NATIVE_SPECS)
(DRIVER_SELF_SPECS): Define.

diff --git a/gcc/config.host b/gcc/config.host
--- a/gcc/config.host
+++ b/gcc/config.host
@@ -100,9 +100,9 @@ case ${host} in
 esac
 
 case ${host} in
-  alpha*-*-linux*)
+  alpha*-*-linux* | alpha*-dec-osf*)
 case ${target} in
-  alpha*-*-linux*)
+  alpha*-*-linux* | alpha*-dec-osf*)
host_extra_gcc_objs="driver-alpha.o"
host_xmake_file="${host_xmake_file} alpha/x-alpha"
;;
diff --git a/gcc/config/alpha/driver-alpha.c b/gcc/config/alpha/driver-alpha.c
--- a/gcc/config/alpha/driver-alpha.c
+++ b/gcc/config/alpha/driver-alpha.c
@@ -1,5 +1,5 @@
 /* Subroutines for the gcc driver.
-   Copyright (C) 2009 Free Software Foundation, Inc.
+   Copyright (C) 2009, 2011 Free Software Foundation, Inc.
Contributed by Arthur Loiret 
 
 This file is part of GCC.
@@ -23,6 +23,22 @@ along with GCC; see the file COPYING3.  
 #include "coretypes.h"
 #include "tm.h"
 
+/* Chip family type IDs, returned by implver instruction.  */
+#define IMPLVER_EV4_FAMILY 0   /* LCA/EV4/EV45 */
+#define IMPLVER_EV5_FAMILY 1   /* EV5/EV56/PCA56 */
+#define IMPLVER_EV6_FAMILY 2   /* EV6 */
+#define IMPLVER_EV7_FAMILY 3   /* EV7 */
+
+/* Bit defines for amask instruction.  */
+#define AMASK_BWX  0x1  /* byte/word extension.  */
+#define AMASK_FIX  0x2  /* sqrt and f <-> i conversions 
+  extension.  */
+#define AMASK_CIX  0x4  /* count extension.  */
+#define AMASK_MVI  0x100/* multimedia extension.  */
+#define AMASK_PRECISE  0x200/* Precise arithmetic traps.  */
+#define AMASK_LOCKPFTCHOK  0x1000   /* Safe to prefetch lock cache
+  block.  */
+
 /* This will be called by the spec parser in gcc.c when it sees
a %:local_cpu_detect(args) construct.  Currently it will be called
with either "cpu" or "tune" as argument depending on if -mcpu=native
@@ -39,34 +55,22 @@ along with GCC; see the file COPYING3.  
 const char *
 host_detect_local_cpu (int argc, const char **argv)
 {
-  const char *cpu = NULL;
-  char buf[128];
-  FILE *f;
-
-  static const struct cpu_names {
-   const char *const name;
-   const char *const cpu;
-  } cpu_names[] = {
-{ "EV79",  "ev67" },
-{ "EV7",   "ev67" },
-{ "EV69",  "ev67" },
-{ "EV68CX","ev67" },
-{ "EV68CB","ev67" },
-{ "EV68AL","ev67" },
-{ "EV67",  "ev67" },
-{ "EV6",   "ev6" },
-{ "PCA57", "pca56" },
-{ "PCA56", "pca56" },
-{ "EV56",  "ev56" },
-{ "EV5",   "ev5" },
-{ "LCA45", "ev45" },
-{ "EV45",  "ev45" },
-{ "LCA4",  "ev4" },
-{ "EV4",   "ev4" },
-/*  { "EV3",   "ev3" },  */
-{ 0, 0 }
+  static const struct cpu_types {
+long implver;
+long amask;
+const char *const cpu;
+  } cpu_types[] = {
+{ IMPLVER_EV7_FAMILY, AMASK_BWX|AMASK_MVI|AMASK_FIX|AMASK_CIX, "ev67" },
+{ IMPLVER_EV6_FAMILY, AMASK_BWX|AMASK_MVI|AMASK_FIX|AMASK_CIX, "ev67" },
+{ IMPLVER_EV6_FAMILY, AMASK_BWX|AMASK_MVI|AMASK_FIX, "ev6" },
+{ IMPLVER_EV5_FAMILY, AMASK_BWX, "ev56" },
+{ IMPLVER_EV5_FAMILY, 0, "ev5" },
+{ IMPLVER_EV4_FAMILY, 0, "ev4" },
+{ 0, 0, NULL }
   };
-
+  long implver;
+  long amask;
+  const char *cpu;
   int i;
 
   if (argc < 1)
@@ -75,24 +79,18 @@ host_detect_local_cpu (int argc, const c
   if (strcmp (argv[0], "cpu") && strcmp (argv[0], "tune"))
 retur

Re: Allow IRIX Ada bootstrap with C++

2011-07-28 Thread Rainer Orth
Eric Botcazou  writes:

>> That's what I did in my last patch, but without SA_SIGINFO set.  This
>> doesn't work since the additional args passed in the sa_handler case are
>> not in any prototype, so g++ rightly complains (and this is an
>> implementation detail I'd not rely upon if it can be avoided).
>
> OK, I see, so there is a single prototype for the 2 variants with 3 args.

Right.  Even if I can cope with that, I haven't been able to extract all
the required info (pc, gregs, fpregs) from ucontext_t/mcontext_t with
SA_SIGINFO set.  Besides, it doesn't seem possible to distinguish
between the two cases (sa_handler/sa_sigaction).  Therefore I went back
to the following hack, adding comments to explain why it is necessary.

Bootstrapped without regressions on mips-sgi-irix6.5, all signal
handling failures introduced by my previous patch are gone again.

Ok for mainline?

Rainer


2011-07-26  Rainer Orth  

* init.c (__gnat_error_handler): Cast reason to int.
(__gnat_install_handler): Explain sa_sigaction use.

diff --git a/gcc/ada/init.c b/gcc/ada/init.c
--- a/gcc/ada/init.c
+++ b/gcc/ada/init.c
@@ -787,7 +787,11 @@ extern struct Exception_Data _abort_sign
 static void
 __gnat_error_handler (int sig, siginfo_t *reason, void *uc ATTRIBUTE_UNUSED)
 {
-  int code = reason == NULL ? 0 : reason->si_code;
+  /* This handler is installed with SA_SIGINFO cleared, but there's no
+ prototype for the resulting alternative three-argument form, so we
+ have to hack around this by casting reason to the int actually
+ passed.  */
+  int code = (int) reason;
   struct Exception_Data *exception;
   const char *msg;
 
@@ -872,7 +876,11 @@ __gnat_install_handler (void)
 
   /* Setup signal handler to map synchronous signals to appropriate
  exceptions.  Make sure that the handler isn't interrupted by another
- signal that might cause a scheduling event!  */
+ signal that might cause a scheduling event!
+
+ The handler is installed with SA_SIGINFO cleared, but there's no
+ C++ prototype for the three-argument form, so fake it by using
+ sa_sigaction and casting the arguments instead.  */
 
   act.sa_sigaction = __gnat_error_handler;
   act.sa_flags = SA_NODEFER + SA_RESTART;

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Allow IRIX Ada bootstrap with C++

2011-07-28 Thread Arnaud Charlet
> Bootstrapped without regressions on mips-sgi-irix6.5, all signal
> handling failures introduced by my previous patch are gone again.
> 
> Ok for mainline?

OK.


Re: PATCH: PR target/47715: [x32] TLS doesn't work

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 12:44 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:52 AM, Uros Bizjak  wrote:
>
>>> TLS on X32 is almost identical to TLS on x86-64.  The only difference is
>>> x32 address space is 32bit.  That means TLS symbols can be in either
>>> SImode or DImode with upper 32bit zero.  This patch updates
>>> tls_global_dynamic_64 to support x32.  OK for trunk?
>
> Please also change 64bit GNU2_TLS patterns, so -mtls-dialect=gnu2 will
> also work.  Please see attached patch.
>

Yes, it works.  Can you apply it?

Thanks.


-- 
H.J.


Re: PATCH: PR target/47715: [x32] TLS doesn't work

2011-07-28 Thread H.J. Lu
On Wed, Jul 27, 2011 at 11:52 PM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 4:55 AM, H.J. Lu  wrote:
>> TLS on X32 is almost identical to TLS on x86-64.  The only difference is
>> x32 address space is 32bit.  That means TLS symbols can be in either
>> SImode or DImode with upper 32bit zero.  This patch updates
>> tls_global_dynamic_64 to support x32.  OK for trunk?
>
>> 2011-07-27  H.J. Lu  
>>
>>        PR target/47715
>>        * config/i386/i386.md (PTR64): New.
>>        (*tls_global_dynamic_64): Rename to ...
>>        (*tls_global_dynamic_64_): This.  Put PTR64 on operand 1.
>>        (tls_global_dynamic_64): Rename to ...
>>        (tls_global_dynamic_64_): This.  Put PTR64 on operand 1.
>>        * config/i386/i386.c (legitimize_tls_address): Updated.
>
> Just remove mode check, so:
>
> (unspec:DI [(match_operand 1 "tls_symbolic_operand" "")]
>
> at both sites.
>
> -  fputs (ASM_BYTE "0x66\n", asm_out_file);
> +  if (!TARGET_X32)
> +    fputs (ASM_BYTE "0x66\n", asm_out_file);
>
> Are you sure? There are some scary comments in binutils that these
> sequences have to be written _exactly_ as shown to enable certain
> linker relaxations w.r.t. TLS relocs.

That is true.  I updated those scary comments in binutils for x32 :-).
Since x32 is 32bit, we can use 32bit move instructions instead of 64bit.
We don't need REX prefix in x32. Now those scary comments read:

 /* Check transition from GD access model.  For 64bit, only
.byte 0x66; leaq foo@tlsgd(%rip), %rdi
.word 0x; rex64; call __tls_get_addr
 can transit to different access model.  For 32bit, only
leaq foo@tlsgd(%rip), %rdi
.word 0x; rex64; call __tls_get_addr
 can transit to different access model.  */

The difference is one less 0x66 byte.


-- 
H.J.


Re: Allow IRIX Ada bootstrap with C++

2011-07-28 Thread Andreas Schwab
Rainer Orth  writes:

> diff --git a/gcc/ada/init.c b/gcc/ada/init.c
> --- a/gcc/ada/init.c
> +++ b/gcc/ada/init.c
> @@ -787,7 +787,11 @@ extern struct Exception_Data _abort_sign
>  static void
>  __gnat_error_handler (int sig, siginfo_t *reason, void *uc ATTRIBUTE_UNUSED)
>  {
> -  int code = reason == NULL ? 0 : reason->si_code;
> +  /* This handler is installed with SA_SIGINFO cleared, but there's no
> + prototype for the resulting alternative three-argument form, so we
> + have to hack around this by casting reason to the int actually
> + passed.  */
> +  int code = (int) reason;
>struct Exception_Data *exception;
>const char *msg;
>  
> @@ -872,7 +876,11 @@ __gnat_install_handler (void)
>  
>/* Setup signal handler to map synchronous signals to appropriate
>   exceptions.  Make sure that the handler isn't interrupted by another
> - signal that might cause a scheduling event!  */
> + signal that might cause a scheduling event!
> +
> + The handler is installed with SA_SIGINFO cleared, but there's no
> + C++ prototype for the three-argument form, so fake it by using
> + sa_sigaction and casting the arguments instead.  */
>  
>act.sa_sigaction = __gnat_error_handler;
>act.sa_flags = SA_NODEFER + SA_RESTART;

Wouldn't it be cleanest to adjust the prototype of __gnat_error_handler
to reality, and cast it when assigning to sa_handler (not sa_sigaction,
which is only valid if SA_SIGINFO is set)?
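
For reference, a minimal POSIX sketch of the two standard handler forms and
the member each belongs to.  It does not reproduce the IRIX-specific
three-argument sa_handler variant discussed above, and the toy_* names are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t toy_last_sig;
static volatile sig_atomic_t toy_last_code;

/* One-argument form: goes in sa_handler, with SA_SIGINFO clear.  */
static void
toy_plain_handler (int sig)
{
  toy_last_sig = sig;
}

/* Three-argument form: goes in sa_sigaction, and is only valid when
   SA_SIGINFO is set in sa_flags.  */
static void
toy_info_handler (int sig, siginfo_t *info, void *uc)
{
  (void) uc;
  toy_last_sig = sig;
  toy_last_code = info != NULL ? info->si_code : 0;
}

/* Install one of the two handlers for SIG; returns sigaction's result.  */
static int
toy_install (int sig, int want_info)
{
  struct sigaction act;

  assert (sig > 0);
  memset (&act, 0, sizeof act);
  sigemptyset (&act.sa_mask);
  if (want_info)
    {
      act.sa_sigaction = toy_info_handler;
      act.sa_flags = SA_SIGINFO;
    }
  else
    {
      act.sa_handler = toy_plain_handler;
      act.sa_flags = 0;
    }
  return sigaction (sig, &act, NULL);
}
```

Keeping the flag and the member in step like this avoids relying on
sa_handler and sa_sigaction sharing storage.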

Andreas.

-- 
Andreas Schwab, sch...@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."


Re: Support -mcpu=native on Tru64 UNIX

2011-07-28 Thread Richard Henderson
On 07/28/2011 06:29 AM, Rainer Orth wrote:
> +{ IMPLVER_EV6_FAMILY, AMASK_BWX|AMASK_MVI|AMASK_FIX, "ev6" },
> +{ IMPLVER_EV5_FAMILY, AMASK_BWX, "ev56" },

In between is the pca56 with BWX+MVI.

Otherwise ok.
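
A standalone sketch of the table lookup with the pca56 entry added may make
the ordering clearer.  This is an assumption about how the matching works,
since the rest of the patch is not shown here: the first entry whose family
matches and whose required bits are all present wins.  The toy_* names are
hypothetical, and "features" is assumed to be already converted to a set of
supported AMASK bits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IMPLVER_EV4_FAMILY 0
#define IMPLVER_EV5_FAMILY 1
#define IMPLVER_EV6_FAMILY 2
#define IMPLVER_EV7_FAMILY 3

#define AMASK_BWX 0x1
#define AMASK_FIX 0x2
#define AMASK_CIX 0x4
#define AMASK_MVI 0x100

struct toy_cpu_type
{
  long implver;       /* chip family */
  long required;      /* feature bits that must all be supported */
  const char *cpu;    /* name to use for -mcpu/-mtune */
};

/* Most specific entries first within each family.  */
static const struct toy_cpu_type toy_cpu_types[] = {
  { IMPLVER_EV7_FAMILY, AMASK_BWX|AMASK_MVI|AMASK_FIX|AMASK_CIX, "ev67" },
  { IMPLVER_EV6_FAMILY, AMASK_BWX|AMASK_MVI|AMASK_FIX|AMASK_CIX, "ev67" },
  { IMPLVER_EV6_FAMILY, AMASK_BWX|AMASK_MVI|AMASK_FIX, "ev6" },
  { IMPLVER_EV5_FAMILY, AMASK_BWX|AMASK_MVI, "pca56" },
  { IMPLVER_EV5_FAMILY, AMASK_BWX, "ev56" },
  { IMPLVER_EV5_FAMILY, 0, "ev5" },
  { IMPLVER_EV4_FAMILY, 0, "ev4" },
};

/* Return the first matching CPU name, or NULL if nothing matches.  */
static const char *
toy_detect_cpu (long implver, long features)
{
  size_t i;

  assert (features >= 0);
  for (i = 0; i < sizeof toy_cpu_types / sizeof toy_cpu_types[0]; i++)
    if (toy_cpu_types[i].implver == implver
        && (features & toy_cpu_types[i].required)
           == toy_cpu_types[i].required)
      return toy_cpu_types[i].cpu;
  return NULL;
}

/* Convenience check: does the detected name equal NAME?  */
static int
toy_is_cpu (long implver, long features, const char *name)
{
  const char *cpu = toy_detect_cpu (implver, features);
  return cpu != NULL && strcmp (cpu, name) == 0;
}
```

With this ordering an EV5-family chip reporting BWX and MVI resolves to pca56
rather than falling through to ev56.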


r~


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 6:40 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 3:24 PM, H.J. Lu  wrote:
>
>>>>> In x32, thread pointer is 32bit and choice of segment register for the
>>>>> thread base ptr load should be based on TARGET_64BIT.  This patch
>>>>> implements it.  OK for trunk?
>>>>
>>>> -ENOTESTCASE.
>>>>
>>>
>>> There is no standalone testcase.  The symptom is in glibc build, I
>>> got
>>>
>>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>>  -E -x c-header'
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>>> --library-path 
>>> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>>> ../scripts -h rpcsvc/yppasswd.x -o
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
>>> Segmentation fault
>>> make[5]: *** Waiting for unfinished jobs
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
>>> Segmentation fault
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
>>> Segmentation fault
>>>
>>> since thread pointer is 32bit in x32.
>>>
>>
>> If we load thread pointer (fs segment register) in x32 with 64bit
>> load, the upper 32bits are garbage.
>> We must load 32bit
>
> So, instead of huge complications with new mode iterator, just
> introduce two new patterns that will shadow existing ones for
> TARGET_X32.
>
> Like in attached (untested) patch.
>

I tried the following patch with typos fixed.  It almost worked,
except for this failure in glibc testsuite:

gen-locale.sh: line 27: 14755 Aborted (core dumped)
I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
-c -f $charmap -i $input ${common_objpfx}localedata/$out
Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" failed
make[4]: *** 
[/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
Error 1

I will add:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8723dc5..d32d64d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
 {
   rtx tp, reg, insn;

-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  if (ptr_mode != Pmode)
+tp = convert_to_mode (Pmode, tp, 1);
   if (!to_reg)
 return tp;

since TP must be 32bit.

Thanks.

-- 
H.J.
---
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index aaaf53a..a194ffb 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12452,6 +12452,17 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])

 ;; Load and add the thread base pointer from %:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(const_int 0)] UNSPEC_TP))]
@@ -12463,6 +12474,19 @@
(set_attr "memory" "load")
(set_attr "imm_disp" "false")])

+(define_insn "*add_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (plus:DI (unspec:DI [(const_int 0)] UNSPEC_TP)
+(match_operand:DI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(plus:P (unspec:P [(const_int 0)] UNSPEC_TP)


Re: PATCH: PR target/47715: [x32] TLS doesn't work

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 7:42 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 3:47 PM, H.J. Lu  wrote:
>
>>>>> TLS on X32 is almost identical to TLS on x86-64.  The only difference is
>>>>> x32 address space is 32bit.  That means TLS symbols can be in either
>>>>> SImode or DImode with upper 32bit zero.  This patch updates
>>>>> tls_global_dynamic_64 to support x32.  OK for trunk?
>>>
>>> Please also change 64bit GNU2_TLS patterns, so -mtls-dialect=gnu2 will
>>> also work.  Please see attached patch.
>>>
>>
>> Yes, it works.  Can you apply it?
>
> This is what I have committed:
>
> 2011-07-28  Uros Bizjak  
>
>        PR target/47715
>        * config/i386/i386.md (*tls_global_dynamic_64): Remove mode from
>        tls_symbolic_operand check.  Update code sequence for TARGET_X32.
>        (tls_global_dynamic_64): Remove mode from tls_symbolic_operand check.
>        (tls_dynamic_gnu2_64): Ditto.
>        (*tls_dynamic_gnu2_lea_64): Ditto.
>        (*tls_dynamic_gnu2_call_64): Ditto.
>        (*tls_dynamic_gnu2_combine_64): Ditto.
>

It looks good.  I will check in

@@ -12341,15 +12345,16 @@
   return "call\t%P2";
 }
   [(set_attr "type" "multi")
-   (set_attr "length" "16")])
+   (set (attr "length")
+   (symbol_ref "TARGET_X32 ? 15 : 16"))])

since x32 is one byte shorter now.
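
The arithmetic behind the two lengths, as a small sketch.  The per-instruction
byte counts are my reading of the usual x86-64 encodings and the toy_* names
are hypothetical; this is not taken from the patch.

```c
#include <assert.h>

enum
{
  TOY_LEN_DATA16_PREFIX = 1,  /* the leading .byte 0x66, 64-bit only */
  TOY_LEN_LEAQ_RIP = 7,       /* REX.W + opcode + ModRM + disp32 */
  TOY_LEN_CALL_PADDING = 3,   /* the 2-byte .word plus rex64 */
  TOY_LEN_CALL_REL32 = 5      /* opcode + rel32 */
};

/* Length in bytes of the global-dynamic TLS call sequence.  */
static int
toy_tls_gd_length (int target_x32)
{
  int len = TOY_LEN_LEAQ_RIP + TOY_LEN_CALL_PADDING + TOY_LEN_CALL_REL32;

  assert (target_x32 == 0 || target_x32 == 1);
  if (!target_x32)
    len += TOY_LEN_DATA16_PREFIX;
  return len;
}
```

Dropping the single leading prefix byte is what takes the sequence from
16 bytes to 15.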

Thanks.

-- 
H.J.


[PATCH, ARM] Fix broken testcase, vfp-1.c, for Thumb

2011-07-28 Thread Ian Bolton
This patch makes the vfp-1.c testcase work for Thumb.  It became broken when
we restricted the negative offsets allowed for Thumb to fix up a Spec2K failure
some months back.  (It was previously possible to generate illegal offsets.)

OK for trunk?


Cheers,
Ian


2011-07-28  Ian Bolton  

testsuite/
* gcc.target/arm/vfp-1.c: large negative offsets not possible on
Thumb2.



Index: gcc/testsuite/gcc.target/arm/vfp-1.c
===
--- gcc/testsuite/gcc.target/arm/vfp-1.c(revision 176838)
+++ gcc/testsuite/gcc.target/arm/vfp-1.c(working copy)
@@ -127,13 +127,13 @@ void test_convert () {
 
 void test_ldst (float f[], double d[]) {
   /* { dg-final { scan-assembler "flds.+ \\\[r0, #1020\\\]" } } */
-  /* { dg-final { scan-assembler "flds.+ \\\[r0, #-1020\\\]" } } */
+  /* { dg-final { scan-assembler "flds.+ \\\[r\[0-9\], #-1020\\\]" { target { arm32 && { ! arm_thumb2_ok } } } } } */
   /* { dg-final { scan-assembler "add.+ r0, #1024" } } */
-  /* { dg-final { scan-assembler "fsts.+ \\\[r0, #0\\\]\n" } } */
+  /* { dg-final { scan-assembler "fsts.+ \\\[r\[0-9\], #0\\\]\n" } } */
   f[256] = f[255] + f[-255];
 
   /* { dg-final { scan-assembler "fldd.+ \\\[r1, #1016\\\]" } } */
-  /* { dg-final { scan-assembler "fldd.+ \\\[r1, #-1016\\\]" } } */
+  /* { dg-final { scan-assembler "fldd.+ \\\[r\[1-9\], #-1016\\\]" { target { arm32 && { ! arm_thumb2_ok } } } } } */
   /* { dg-final { scan-assembler "fstd.+ \\\[r1, #256\\\]" } } */
   d[32] = d[127] + d[-127];
 }





Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 6:40 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 3:24 PM, H.J. Lu  wrote:
>
> In x32, thread pointer is 32bit and choice of segment register for the
> thread base ptr load should be based on TARGET_64BIT.  This patch
> implements it.  OK for trunk?

 -ENOTESTCASE.

>>>
>>> There is no standalone testcase.  The symptom is in glibc build, I
>>> got
>>>
>>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>>  -E -x c-header'
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>>> --library-path 
>>> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>>> ../scripts -h rpcsvc/yppasswd.x -o
>>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
>>> Segmentation fault
>>> make[5]: *** Waiting for unfinished jobs
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
>>> Segmentation fault
>>> make[5]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
>>> Segmentation fault
>>>
>>> since thread pointer is 32bit in x32.
>>>
>>
>> If we load thread pointer (fs segment register) in x32 with 64bit
>> load, the upper 32bits are garbage.
>> We must load 32bit
>
> So, instead of huge complications with new mode iterator, just
> introduce two new patterns that will shadow existing ones for
> TARGET_X32.
>
> Like in attached (untested) patch.
>

We can't just shadow them. They have to be disabled for x32.
I am testing this.


-- 
H.J.
---
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index aaaf53a..9191b98 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12452,10 +12452,21 @@
 (define_mode_attr tp_seg [(SI "gs") (DI "fs")])

 ;; Load and add the thread base pointer from %:0.
+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*load_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(unspec:P [(const_int 0)] UNSPEC_TP))]
-  ""
+  "!TARGET_X32"
   "mov{}\t{%%:0, %0|%0,  PTR :0}"
   [(set_attr "type" "imov")
(set_attr "modrm" "0")
@@ -12463,12 +12474,25 @@
(set_attr "memory" "load")
(set_attr "imm_disp" "false")])

+(define_insn "*add_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (plus:DI (unspec:DI [(const_int 0)] UNSPEC_TP)
+(match_operand:DI 1 "register_operand" "0")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_X32"
+  "add{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "alu")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])
+
 (define_insn "*add_tp_"
   [(set (match_operand:P 0 "register_operand" "=r")
(plus:P (unspec:P [(const_int 0)] UNSPEC_TP)
(match_operand:P 1 "register_operand" "0")))
(clobber (reg:CC FLAGS_REG))]
-  ""
+  "!TARGET_X32"
   "add{}\t{%%:0, %0|%0,  PTR :0}"
   [(set_attr "type" "alu")
(set_attr "modrm" "0")


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 4:47 PM, H.J. Lu  wrote:

>> In x32, thread pointer is 32bit and choice of segment register for the
>> thread base ptr load should be based on TARGET_64BIT.  This patch
>> implements it.  OK for trunk?
>
> -ENOTESTCASE.
>

 There is no standalone testcase.  The symptom is in glibc build, I
 got

 CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
  -E -x c-header'
 /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
 --library-path 
 /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
 /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
 ../scripts -h rpcsvc/yppasswd.x -o
 /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
 make[5]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
 Segmentation fault
 make[5]: *** Waiting for unfinished jobs
 make[5]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
 Segmentation fault
 make[5]: *** 
 [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
 Segmentation fault

 since thread pointer is 32bit in x32.

>>>
>>> If we load thread pointer (fs segment register) in x32 with 64bit
>>> load, the upper 32bits are garbage.
>>> We must load 32bit
>>
>> So, instead of huge complications with new mode iterator, just
>> introduce two new patterns that will shadow existing ones for
>> TARGET_X32.
>>
>> Like in attached (untested) patch.
>>
>
> I tried the following patch with typos fixed.  It almost worked,
> except for this failure in glibc testsuite:
>
> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
> -c -f $charmap -i $input ${common_objpfx}localedata/$out
> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" failed
> make[4]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
> Error 1
>
> I will add:
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 8723dc5..d32d64d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>  {
>   rtx tp, reg, insn;
>
> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  if (ptr_mode != Pmode)
> +    tp = convert_to_mode (Pmode, tp, 1);
>   if (!to_reg)
>     return tp;
>
> since TP must be 32bit.

No, this won't have the desired effect. It will change the UNSPEC, so
it won't match patterns in i386.md.

Can you debug the failure a bit more? With my patterns, add{l} and
mov{l} should clear top 32bits.
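
To spell out the assumption behind that remark: on x86-64 an instruction
that writes a 32-bit destination register zero-extends the result into the
full 64-bit register.  The sketch below is only a C model of that rule, not
GCC code; reg64, model_movl and model_addl are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit general register.  */
typedef uint64_t reg64;

/* Model of mov{l}: the 32-bit value replaces the register and the
   upper half becomes zero, whatever it held before.  */
static reg64
model_movl (reg64 old, uint32_t value)
{
  (void) old;			/* Previous contents do not survive.  */
  return (reg64) value;
}

/* Model of add{l}: the sum is computed modulo 2^32 and then
   zero-extended into the 64-bit register.  */
static reg64
model_addl (reg64 dst, uint32_t src)
{
  return (reg64) (uint32_t) ((uint32_t) dst + src);
}

/* Stale upper bits never reach the result in either case.  */
static void
check_zero_extension (void)
{
  reg64 stale = 0xdeadbeef00000000ULL;

  assert (model_movl (stale, 0x12345678u) == 0x12345678ULL);
  assert (model_addl (stale | 0x10u, 0x20u) == 0x30ULL);
}
```

Under this model the upper 32 bits of the destination are zero after either
instruction, regardless of what the register held before.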

I'd also argue that there is something wrong with glibc. It should
initialize %fs with a zero-extended value, so original 64bit
load_tp/add_tp patterns could be used.

Uros.


Re: [PATCH] Fix PR49876: Continue code generation with integer_zero_node on gloog_error

2011-07-28 Thread Sebastian Pop
On Thu, Jul 28, 2011 at 09:49, Richard Guenther  wrote:
> And it's always integers or pointers only?  Otherwise you'd probably
> want build_zero_cst (type) instead.

Ok, I started regstrapping again with build_zero_cst.

Thanks,
Sebastian


Re: Unreviewed libgcc patches

2011-07-28 Thread Rainer Orth
NightStrike  writes:

> On Mon, Jul 18, 2011 at 8:21 AM, Rainer Orth
>  wrote:
>> The following two libgcc patches have seen almost no comments, and
>> certainly neither testing nor review in a week:
>>
>>        CFT: [build] Move fp-bit support to toplevel libgcc
>>        http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00927.html
>>
>>        CFT: [build] Move soft-fp support to toplevel libgcc
>>        http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00931.html
>>
>> This patch will need to be updated for the recent addition of the c6x
>> port.
>>
>> Both will probably need build and libgcc maintainers and either a bunch
>> of target maintainers or a global reviewer.  I wonder how to proceed
>> here: I've got a bunch of further libgcc patches in the works or
>> planned, but if I can't get them reviewed, there's no point in
>> continuing that work.
>
> Do you still need support?

No: I've since received review comments for both and am working to
incorporate them and complete the remaining libgcc build-related
patches.

What would help is target maintainers actually reviewing or trying the
patches.  I'll probably wait until I've (re-)posted the full set and
start a new CFT then.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Richard Guenther
On Thu, 28 Jul 2011, Tom de Vries wrote:

> On 07/28/2011 12:22 PM, Richard Guenther wrote:
> > On Wed, 27 Jul 2011, Tom de Vries wrote:
> > 
> >> On 07/27/2011 05:27 PM, Richard Guenther wrote:
> >>> On Wed, 27 Jul 2011, Tom de Vries wrote:
> >>>
>  On 07/27/2011 02:12 PM, Richard Guenther wrote:
> > On Wed, 27 Jul 2011, Tom de Vries wrote:
> >
> >> On 07/27/2011 01:50 PM, Tom de Vries wrote:
> >>> Hi Richard,
> >>>
> >>> I have a patch set for bug 43513 - The stack pointer is adjusted 
> >>> twice.
> >>>
> >>> 01_pr43513.3.patch
> >>> 02_pr43513.3.test.patch
> >>> 03_pr43513.3.mudflap.patch
> >>>
> >>> The patch set has been bootstrapped and reg-tested on x86_64.
> >>>
> > >>> I will send out the patches individually.
> >>>
> >>
> >> The patch replaces a vla __builtin_alloca that has a constant argument 
> >> with an
> >> array declaration.
> >>
> >> OK for trunk?
> >
> > I don't think it is safe to try to get at the VLA type the way you do.
> 
>  I don't understand in what way it's not safe. Do you mean I don't manage 
>  to find
>  the type always, or that I find the wrong type, or something else?
> >>>
> >>> I think you might get the wrong type,
> >>
> >> Ok, I'll review that code one more time.
> >>
> >>> you also do not transform code
> >>> like
> >>>
> >>>   int *p = alloca(4);
> >>>   *p = 3;
> >>>
> >>> as there is no array type involved here.
> >>>
> >>
> >> I was trying to stay away from non-vla allocas.  A source declared alloca 
> >> has
> > >> function lifetime, so we could have a single alloca in a loop, called 10 
> >> times,
> >> with all 10 instances live at the same time. This patch does not detect 
> >> such
> >> cases, and thus stays away from non-vla allocas. A vla decl does not have 
> >> such
> >> problems, the lifetime ends when it goes out of scope.
> > 
> > Yes indeed - that probably would require more detailed analysis.
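
The lifetime difference discussed above can be seen in plain C.  This is a
minimal sketch with made-up function names, assuming a platform that
provides <alloca.h>; it is not taken from the patch.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>

enum { ROUNDS = 10 };

/* Each alloca block stays allocated until the function returns, so the
   ten blocks are all live at once and must not overlap.  Returns 1 if
   every block still holds the value written into it.  */
static int
alloca_blocks_stay_live (void)
{
  char *block[ROUNDS];

  for (int i = 0; i < ROUNDS; i++)
    {
      block[i] = alloca (40);
      block[i][0] = (char) i;
    }

  for (int i = 0; i < ROUNDS; i++)
    if (block[i][0] != (char) i)
      return 0;
  return 1;
}

/* A VLA dies at the closing brace of its scope, so every iteration may
   reuse the same stack space.  */
static int
vla_sum (size_t n)
{
  int total = 0;

  assert (n > 0);
  for (int i = 0; i < ROUNDS; i++)
    {
      int vla[n];

      for (size_t j = 0; j < n; j++)
	vla[j] = i;
      total += vla[n - 1];
    }				/* Lifetime of VLA ends here.  */
  return total;
}
```

The first function relies on all ten blocks coexisting, which is why a
fixed-size array cannot simply replace a source-level alloca in a loop; the
second needs room for only one array at a time.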
> > 
> > In fact I would simply do sth like
> >
> >   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
> >   n_elem = size * 8 / BITS_PER_UNIT;
> >   array_type = build_array_type_nelts (elem_type, n_elem);
> >   var = create_tmp_var (array_type, NULL);
> >   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
> >
> 
>  I tried this code on the example, and it works, but the newly declared 
>  type has
>  an 8-bit alignment, while the vla base type has a 32 bit alignment.  
>  This makes
>  the memory access in the example potentially unaligned, which prohibits 
>  an
>  ivopts optimization, so the resulting text size is 68 instead of the 64 
>  achieved
>  with my current patch.
> >>>
> >>> Ok, so then set DECL_ALIGN of the variable to something reasonable
> >>> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
> >>> alignment that the targets alloca function would guarantee.
> >>>
> >>
> >> I tried that, but that doesn't help. It's the alignment of the type that
> >> matters, not of the decl.
> > 
> > It shouldn't.  All accesses are performed with the original types and
> > alignment comes from that (plus the underlying decl).
> > 
> 
> I managed to get it all working by using build_aligned_type rather than 
> DECL_ALIGN.

That's really odd, DECL_ALIGN should just work - nothing refers to the
type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to 
1 maybe?
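
As a source-level analogy only (it says nothing about how the IL tracks
alignment), the distinction between the alignment of a type and the
alignment of one declaration looks like this in C11.  The names byte_buf and
aligned_decl_is_usable_as_int_array are made up for the sketch.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* A 40-byte buffer declared as a byte array: the type itself only
   promises byte alignment.  */
typedef unsigned char byte_buf[40];

static int
is_aligned_for_int (const void *p)
{
  return (uintptr_t) p % alignof (int) == 0;
}

/* Raising the alignment of the declaration, not of the type, is enough
   to make the storage suitable for int accesses.  */
static int
aligned_decl_is_usable_as_int_array (void)
{
  alignas (int) unsigned char storage[40];

  storage[0] = 0;
  return is_aligned_for_int (storage);
}

static void
check_alignment (void)
{
  assert (alignof (byte_buf) == 1);
  assert (alignof (int[10]) == alignof (int));
  assert (aligned_decl_is_usable_as_int_array ());
}
```

Here the array type keeps its one-byte alignment while the declared object
is aligned for int, which is roughly the effect expected from setting the
alignment on the decl.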

> 
> >> So should we try to find the base type of the vla, and use that, or use the
> >> nonstandard char type?
> > 
> > I don't think we can reliably find the base type of the vla - well,
> > in practice we may because we control how we lower VLAs during
> > gimplification, but nothing in the IL constraints say that the
> > resulting pointer type should be special.
> > 
> > Using a char[] decl shouldn't be a problem IMHO.
> > 
> > And obviously you lose the optimization we arrange with inserting
> > __builtin_stack_save/restore pairs that way - stack space will no
> > longer be shared for subsequent VLAs.  Which means that you'd
> > better limit the size you allow this promotion.
> >
> 
>  Right, I could introduce a parameter for this.
> >>>
> >>> I would think you could use PARAM_LARGE_STACK_FRAME for now and say,
> >>> allow a size of PARAM_LARGE_STACK_FRAME / 10?
> >>>
> >>
> >> That unfortunately is too small for the example from bug report. The 
> >> default
> >> value of the param is 250, so that would be a threshold of 25, and the 
> >> alloca
> >> size of the example is 40.  Perhaps we can try a threshold of
> >> PARAM_LARGE_STACK_FRAME - estimated_stack_size or some such?
> > 
> > Hm.  estimated_stack_size is not O(1), so no.  I think we need to
> > find a sensible way of allowing stack sharing.  Eventually Michas
> > patch for introducing points-of-death would help here, if we'd
> > go for folding this during stack-save/resto

Re: [Patch,AVR]: PR49687 (better widening 32-bit mul)

2011-07-28 Thread Richard Henderson
On 07/28/2011 07:59 AM, Georg-Johann Lay wrote:
> So it appears that IRA is not as smart as we thought and not
> prepared for this...
> 
> Or did I do something fundamentally wrong?

It sure doesn't look like you've done anything wrong.


r~


Re: [PATCH] Fix -gdwarf-3 DW_AT_data_member_location for >= 64KB offsets (PR debug/49871)

2011-07-28 Thread Jason Merrill

OK.

Jason


Re: [PATCH] Disable size optimizations of -gdwarf-2 DW_AT_data_member_location DW_OP_plus_uconst

2011-07-28 Thread Jason Merrill

I'd find the logic easier to read if it were inverted, but OK.

Jason


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 7:59 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 4:47 PM, H.J. Lu  wrote:
>
>>> In x32, thread pointer is 32bit and choice of segment register for the
>>> thread base ptr load should be based on TARGET_64BIT.  This patch
>>> implements it.  OK for trunk?
>>
>> -ENOTESTCASE.
>>
>
> There is no standalone testcase.  The symptom is in glibc build, I
> got
>
> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>  -E -x c-header'
> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
> --library-path 
> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
> ../scripts -h rpcsvc/yppasswd.x -o
> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
> Segmentation fault
> make[5]: *** Waiting for unfinished jobs
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
> Segmentation fault
> make[5]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
> Segmentation fault
>
> since thread pointer is 32bit in x32.
>

 If we load thread pointer (fs segment register) in x32 with 64bit
 load, the upper 32bits are garbage.
 We must load 32bit
>>>
>>> So, instead of huge complications with new mode iterator, just
>>> introduce two new patterns that will shadow existing ones for
>>> TARGET_X32.
>>>
>>> Like in attached (untested) patch.
>>>
>>
>> I tried the following patch with typos fixed.  It almost worked,
>> except for this failure in glibc testsuite:
>>
>> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
>> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
>> -c -f $charmap -i $input ${common_objpfx}localedata/$out
>> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" failed
>> make[4]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
>> Error 1
>>
>> I will add:
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 8723dc5..d32d64d 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>>  {
>>   rtx tp, reg, insn;
>>
>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  if (ptr_mode != Pmode)
>> +    tp = convert_to_mode (Pmode, tp, 1);
>>   if (!to_reg)
>>     return tp;
>>
>> since TP must be 32bit.
>
> No, this won't have the desired effect. It will change the UNSPEC, so
> it won't match patterns in i386.md.
>
> Can you debug the failure a bit more? With my patterns, add{l} and
> mov{l} should clear top 32bits.
>

TP is 32bit in x32.  For load_tp_x32, we load SImode value and
zero-extend to DImode. For add_tp_x32, we are adding SImode
value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
must take SImode TP.
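
One way to read this concern, sketched in C rather than RTL: a 64-bit plus
and a 32-bit plus of the same operands differ as soon as the 32-bit sum
wraps.  The values are illustrative, and the negative offset assumes a TLS
layout where data sits below the thread pointer; add_tp_as_di and
add_tp_as_si are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit add of a zero-extended thread pointer and a zero-extended
   32-bit offset: the carry lands in bit 32.  */
static uint64_t
add_tp_as_di (uint32_t tp, uint32_t offset)
{
  return (uint64_t) tp + (uint64_t) offset;
}

/* 32-bit add, then zero-extend: the sum wraps modulo 2^32.  */
static uint64_t
add_tp_as_si (uint32_t tp, uint32_t offset)
{
  return (uint64_t) (uint32_t) (tp + offset);
}

static void
check_add_tp (void)
{
  uint32_t tp = 0xf7001000u;		/* Illustrative thread pointer.  */
  uint32_t offset = (uint32_t) -16;	/* Illustrative negative offset.  */

  assert (add_tp_as_si (tp, offset) == 0xf7000ff0ULL);
  assert (add_tp_as_di (tp, offset) == 0x1f7000ff0ULL);
}
```

Only the 32-bit sum stays inside the 32-bit address space, so the mode
written in the pattern and the arithmetic actually performed need to agree.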

-- 
H.J.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 8:59 AM, H.J. Lu  wrote:
> On Thu, Jul 28, 2011 at 7:59 AM, Uros Bizjak  wrote:
>> On Thu, Jul 28, 2011 at 4:47 PM, H.J. Lu  wrote:
>>
 In x32, thread pointer is 32bit and choice of segment register for the
 thread base ptr load should be based on TARGET_64BIT.  This patch
 implements it.  OK for trunk?
>>>
>>> -ENOTESTCASE.
>>>
>>
>> There is no standalone testcase.  The symptom is in glibc build, I
>> got
>>
>> CPP='/export/build/gnu/gcc-x32/release/usr/gcc-4.7.0-x32/bin/gcc -mx32
>>  -E -x c-header'
>> /export/build/gnu/glibc-x32/build-x86_64-linux/elf/ld-linux-x32.so.2
>> --library-path 
>> /export/build/gnu/glibc-x32/build-x86_64-linux:/export/build/gnu/glibc-x32/build-x86_64-linux/math:/export/build/gnu/glibc-x32/build-x86_64-linux/elf:/export/build/gnu/glibc-x32/build-x86_64-linux/dlfcn:/export/build/gnu/glibc-x32/build-x86_64-linux/nss:/export/build/gnu/glibc-x32/build-x86_64-linux/nis:/export/build/gnu/glibc-x32/build-x86_64-linux/rt:/export/build/gnu/glibc-x32/build-x86_64-linux/resolv:/export/build/gnu/glibc-x32/build-x86_64-linux/crypt:/export/build/gnu/glibc-x32/build-x86_64-linux/nptl
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcgen -Y
>> ../scripts -h rpcsvc/yppasswd.x -o
>> /export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/rpcsvc/yppasswd.T
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xbootparam_prot.stmp]
>> Segmentation fault
>> make[5]: *** Waiting for unfinished jobs
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xrstat.stmp]
>> Segmentation fault
>> make[5]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/sunrpc/xyppasswd.stmp]
>> Segmentation fault
>>
>> since thread pointer is 32bit in x32.
>>
>
> If we load thread pointer (fs segment register) in x32 with 64bit
> load, the upper 32bits are garbage.
> We must load 32bit

 So, instead of huge complications with new mode iterator, just
 introduce two new patterns that will shadow existing ones for
 TARGET_X32.

 Like in attached (untested) patch.

>>>
>>> I tried the following patch with typos fixed.  It almost worked,
>>> except for this failure in glibc testsuite:
>>>
>>> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
>>> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
>>> -c -f $charmap -i $input ${common_objpfx}localedata/$out
>>> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" 
>>> failed
>>> make[4]: *** 
>>> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
>>> Error 1
>>>
>>> I will add:
>>>
>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>> index 8723dc5..d32d64d 100644
>>> --- a/gcc/config/i386/i386.c
>>> +++ b/gcc/config/i386/i386.c
>>> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>>>  {
>>>   rtx tp, reg, insn;
>>>
>>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>>> +  if (ptr_mode != Pmode)
>>> +    tp = convert_to_mode (Pmode, tp, 1);
>>>   if (!to_reg)
>>>     return tp;
>>>
>>> since TP must be 32bit.
>>
>> No, this won't have the desired effect. It will change the UNSPEC, so
>> it won't match patterns in i386.md.
>>
>> Can you debug the failure a bit more? With my patterns, add{l} and
>> mov{l} should clear top 32bits.
>>
>
> TP is 32bit in x32.  For load_tp_x32, we load SImode value and
> zero-extend to DImode. For add_tp_x32, we are adding SImode
> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
> must take SImode TP.
>

I will see what I can do.


-- 
H.J.


Re: [trans-mem] Beginning of refactoring

2011-07-28 Thread Richard Henderson
> New erase method and placement new for aatree.
> 
> * aatree.h (aa_tree::remove): New.
> (aa_tree::operator new): Add placement new.

Ok.

> Change pr_hasElse to the value specified in the ABI.
> 
> * libitm.h (_ITM_codeProperties): Change pr_hasElse to the ABI's 
> value.

Ok.

> Add information to dispatch about closed nesting and uninstrumented code.
> 
> * dispatch.h (GTM::abi_dispatch): Add can_run_uninstrumented_code 
> and
> closed_nesting flags, as well as a closed nesting alternative.
> * method-serial.cc: Same.

Nearly...

> +  virtual abi_dispatch* closed_nesting_alternative()
> +  {
> +// For nested transactions with an instrumented code path, we can do
> +// undo logging.
> +return GTM::dispatch_serial();

Surely you really mean dispatch_serial_ul here?
Otherwise ok.

> Use vector instead of list to store user actions.
> 
>   * useraction.cc: Use vector instead of list to store actions.
>   Also support partial rollbacks for closed nesting.
>   * libitm_i.h (GTM::gtm_transaction::user_action): Same.
>   * beginend.cc: Same.

Ok.

> Add closed nesting as restart reason.
> 
>   * libitm_i.h: Add closed nesting as restart reason.
>   * retry.cc (GTM::gtm_transaction::decide_retry_strategy): Same.

Ok, except

> +  if (r == RESTART_CLOSED_NESTING) retry_serial = true;

Coding style.  THEN statement on the next line, even for small THEN.
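
For illustration, the layout being asked for looks like this.  It is a
stand-alone C sketch: the enum and the function are hypothetical stand-ins,
not the libitm declarations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the restart reasons in libitm_i.h.  */
enum restart_reason { RESTART_OTHER, RESTART_CLOSED_NESTING };

static bool
wants_serial_retry (enum restart_reason r)
{
  bool retry_serial = false;

  /* The THEN statement goes on its own line, indented two columns,
     even when it is a single short statement.  */
  if (r == RESTART_CLOSED_NESTING)
    retry_serial = true;

  return retry_serial;
}

static void
check_style_example (void)
{
  assert (wants_serial_retry (RESTART_CLOSED_NESTING));
  assert (!wants_serial_retry (RESTART_OTHER));
}
```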

> Make flat nesting the default, use closed nesting on demand.
> 
> * local.cc (gtm_transaction::rollback_local): Support closed 
> nesting.
> * eh_cpp.cc (GTM::gtm_transaction::revert_cpp_exceptions): Same.
>   * dispatch.h: Same.
>   * method-serial.cc: Same.
> * beginend.cc (GTM::gtm_transaction::begin_transaction): Change to
> flat nesting as default, and closed nesting on demand.
> (GTM::gtm_transaction::rollback): Same.
> (_ITM_abortTransaction): Same.
> (GTM::gtm_transaction::restart): Same.
> (GTM::gtm_transaction::trycommit): Same.
> (GTM::gtm_transaction::trycommit_and_finalize): Removed.
> (choose_code_path): New.
> (GTM::gtm_transaction_cp::save): New.
> (GTM::gtm_transaction_cp::commit): New.
> * query.cc (_ITM_inTransaction): Support flat nesting.
> * libitm_i.h (GTM::gtm_transaction_cp): New helper struct for 
> nesting.
> (GTM::gtm_transaction): Support flat and closed nesting.
> * alloc.cc (commit_allocations_2): New.
> (commit_cb_data): New helper struct.
> (GTM::gtm_transaction::commit_allocations): Handle nested
> commits/rollbacks.
> * libitm.texi: Update user action section, add description of 
> nesting.

Nearly...

> +  abi_dispatch *cn_disp = disp->closed_nesting_alternative();
> +  if (cn_disp)
> +{
> +  disp = cn_disp;
> +  set_abi_disp(disp);
> +}

Don't we need to fini the old disp?  Seems there's a leak here, though
not visible until we re-instate the non-serial methods.

> +  if (!(tx->state & STATE_IRREVOCABLE)) ret |= a_saveLiveVariables;

Coding style.



r~


Re: [trans-mem] Beginning of refactoring

2011-07-28 Thread Richard Henderson
On 07/27/2011 03:35 AM, Torvald Riegel wrote:
> patch7: gtm_transaction::decide_begin_dispatch() gets the transaction
> properties from the caller instead of reading from
> gtm_transaction::prop, which might not have been updated by the caller
> yet.
> 
> patch8: Fix nesting level reset when rolling back the outermost
> transaction. Before, this was incorrectly reset to zero, which caused
> transactions to not get committed. This didn't show up in previous
> testing because previously, only serial-mode TM methods were available.
> 
> OK for branch, together with the previous patches?

Both ok.


r~


Re: [PATCH 0/3] Move Graphite to CLooG 0.16.3 with isl backend.

2011-07-28 Thread Tobias Grosser

On 07/26/2011 08:30 PM, Sebastian Pop wrote:

On Fri, Jul 22, 2011 at 07:32, Joseph S. Myers  wrote:

On Fri, 22 Jul 2011, Tobias Grosser wrote:


I propose to switch to the official cloog.org cloog version with isl backend and
at the same time to remove support for both CLooG-PPL legacy as well as
CLooG-Parma.


Where are the install.texi changes in this patch series?


Please see the attached patch.


Thanks Sebastian.

Can you take care of uploading cloog-0.16.3 to the gcc ftp site?

Cheers
Tobi


Re: [PATCH 0/3] Move Graphite to CLooG 0.16.3 with isl backend.

2011-07-28 Thread Tobias Grosser

On 07/27/2011 06:20 PM, Jack Howarth wrote:

On Fri, Jul 22, 2011 at 01:00:09AM +0200, Tobias Grosser wrote:

Hi,

I propose to switch to the official cloog.org cloog version with isl backend and
at the same time to remove support for both CLooG-PPL legacy as well as
CLooG-Parma.

We want to switch to cloog-isl as it is the only officially maintained version
of cloog. Furthermore, it provides features that will help to fix some bugs in
the graphite code generation[1].
The reason to abandon CLooG-PPL (legacy version) is that cloog-isl provides the
new CloogInput library interface. This interface is not available in the old CLooG.
I plan to move graphite to this interface. As I do not see enough benefits from
being able to use CLooG PPL, I decided to not introduce any compatibility
scheme, but just remove any code that is only needed for CLooG-PPL.
I also removed CLooG-Parma (cloog.org with PPL backend), as it is currently not
actively maintained and not well tested. I believe our time is better spent on
improving graphite or cloog isl than on putting time into this cloog version.

So here we are: Moving graphite back to the official cloog.org version!

Passes 'make check RUNTESTFLAGS=graphite.exp' as well as a bootstrap on Linux
amd64.

Cheers
Tobi


Tobi,
Are there any additional plans for gcc 4.7? In particular, wasn't the
-fgraphite-identity option supposed to be enabled at -O3 by defaulting
-ftree-loop-linear on, which is now an alias of -floop-interchange since
gcc 4.6?
   Jack


Hi Jack,

I personally do not have any fixed plans for something that needs to be 
in 4.7. Here is a list of open topics, that will be fixed if time allows 
(or someone funds my time):


- Remove dependency to PPL (as Richi mentioned)
- Integrate region based scop detection
- Use isl scheduling (Pluto like) to automatically tile the code
- Integrate Konrad's data flow patches (if he submits them)

Cheers
Tobi


Re: [PATCH 1/3] Fix PR47653: do not handle loops using wrapping semantics in graphite

2011-07-28 Thread Tobias Grosser

On 07/24/2011 08:25 AM, Sebastian Pop wrote:

2011-07-23  Sebastian Pop

PR middle-end/47653
* graphite-scop-detection.c (graphite_can_represent_loop): Discard
loops using wrapping semantics.

* gcc.dg/graphite/run-id-pr47653.c: New.
* gcc.dg/graphite/interchange-3.c: Do not use unsigned types for
induction variables.
* gcc.dg/graphite/scop-16.c: Same.
* gcc.dg/graphite/scop-17.c: Same.
* gcc.dg/graphite/scop-21.c: Same.
---
  gcc/ChangeLog  |6 ++
  gcc/graphite-scop-detection.c  |   18 +-
  gcc/testsuite/ChangeLog|   10 ++
  gcc/testsuite/gcc.dg/graphite/interchange-3.c  |2 +-
  gcc/testsuite/gcc.dg/graphite/run-id-pr47653.c |   17 +
  gcc/testsuite/gcc.dg/graphite/scop-16.c|2 +-
  gcc/testsuite/gcc.dg/graphite/scop-17.c|2 +-
  gcc/testsuite/gcc.dg/graphite/scop-21.c|2 +-
  .../testsuite/libgomp.graphite/force-parallel-1.c  |2 +-
  9 files changed, 47 insertions(+), 14 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/graphite/run-id-pr47653.c


[PATCH, PR 49886] Prevent fnsplit from changing signature when there are type attributes

2011-07-28 Thread Martin Jambor
Hi,

pass_split_functions is happy to split functions which have type
attributes but cannot update them if the new clone has in any way
different parameters than the original.  This can lead to
miscompilations in cases like the testcase.
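
To make the failure mode concrete, here is a minimal sketch of a type
attribute that names a parameter by position.  It is not the PR testcase,
and the function name pick is made up; it assumes a compiler that accepts
the GNU nonnull attribute.

```c
#include <assert.h>
#include <stddef.h>

/* nonnull (2) names the second parameter by position, so the attribute
   is only meaningful for this exact parameter list.  */
static int pick (const int *maybe, const int *fallback)
  __attribute__ ((nonnull (2)));

static int
pick (const int *maybe, const int *fallback)
{
  if (maybe != NULL)
    return *maybe;
  return *fallback;
}

/* If a clone dropped the first parameter but kept the same function
   type attributes, index 2 would no longer refer to FALLBACK.  */

static void
check_pick (void)
{
  int fallback = 7;
  int value = 3;

  assert (pick (NULL, &fallback) == 7);
  assert (pick (&value, &fallback) == 3);
}
```

Any transformation that removes or reorders parameters without rewriting
such indices leaves the attribute describing the wrong argument.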

This patch solves it by 1) making the inliner set the
can_change_signature flag to false for them because their signature
cannot be changed (this step is also necessary to make IPA-CP operate
on them and handle them correctly), and 2) making the splitting pass
keep all parameters if the flag is false.  The second step might involve
inventing some default definitions if the parameters did not really
have any.

I spoke about this with Honza and he claimed that the new function is
really an entirely different thing and that the parameters may
correspond only very loosely and thus the type attributes should be
cleared.  I'm not sure I agree, but in any case I needed this to work
to allow me continue with promised IPA-CP polishing and so I decided
to do this because it was easier.  (My own opinion is that the current
representation of parameter-describing function type attributes is
evil and will cause harm no matter hat we do.)

A very similar patch has passed bootstrap and testsuite on
x86_64-linux, the current one is undergoing both right now.  OK for
trunk if it passes?

Thanks,

Martin



2011-07-28  Martin Jambor  

PR middle-end/49886
* ipa-inline-analysis.c (compute_inline_parameters): Set
can_change_signature of nodes with type attributes.
* ipa-split.c (split_function): Do not skip any arguments if
can_change_signature is not set.

* testsuite/gcc.c-torture/execute/pr49886.c: New testcase.

Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -1658,18 +1658,24 @@ compute_inline_parameters (struct cgraph
   /* Can this function be inlined at all?  */
   info->inlinable = tree_inlinable_function_p (node->decl);
 
-  /* Inlinable functions always can change signature.  */
-  if (info->inlinable)
-node->local.can_change_signature = true;
+  /* Type attributes can use parameter indices to describe them.  */
+  if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
+node->local.can_change_signature = false;
   else
 {
-  /* Functions calling builtin_apply can not change signature.  */
-  for (e = node->callees; e; e = e->next_callee)
-   if (DECL_BUILT_IN (e->callee->decl)
-   && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
-   && DECL_FUNCTION_CODE (e->callee->decl) == BUILT_IN_APPLY_ARGS)
- break;
-  node->local.can_change_signature = !e;
+  /* Otherwise, inlinable functions always can change signature.  */
+  if (info->inlinable)
+   node->local.can_change_signature = true;
+  else
+   {
+ /* Functions calling builtin_apply can not change signature.  */
+ for (e = node->callees; e; e = e->next_callee)
+   if (DECL_BUILT_IN (e->callee->decl)
+   && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
+   && DECL_FUNCTION_CODE (e->callee->decl) == BUILT_IN_APPLY_ARGS)
+ break;
+ node->local.can_change_signature = !e;
+   }
 }
   estimate_function_body_sizes (node, early);
 
Index: src/gcc/ipa-split.c
===
--- src.orig/gcc/ipa-split.c
+++ src/gcc/ipa-split.c
@@ -945,10 +945,10 @@ static void
 split_function (struct split_point *split_point)
 {
   VEC (tree, heap) *args_to_pass = NULL;
-  bitmap args_to_skip = BITMAP_ALLOC (NULL);
+  bitmap args_to_skip;
   tree parm;
   int num = 0;
-  struct cgraph_node *node;
+  struct cgraph_node *node, *cur_node = cgraph_get_node (current_function_decl);
   basic_block return_bb = find_return_bb ();
   basic_block call_bb;
   gimple_stmt_iterator gsi;
@@ -968,17 +968,30 @@ split_function (struct split_point *spli
   dump_split_point (dump_file, split_point);
 }
 
+  if (cur_node->local.can_change_signature)
+args_to_skip = BITMAP_ALLOC (NULL);
+  else
+args_to_skip = NULL;
+
   /* Collect the parameters of new function and args_to_skip bitmap.  */
   for (parm = DECL_ARGUMENTS (current_function_decl);
parm; parm = DECL_CHAIN (parm), num++)
-if (!is_gimple_reg (parm)
-   || !gimple_default_def (cfun, parm)
-   || !bitmap_bit_p (split_point->ssa_names_to_pass,
- SSA_NAME_VERSION (gimple_default_def (cfun, parm))))
+if (args_to_skip
+   && (!is_gimple_reg (parm)
+   || !gimple_default_def (cfun, parm)
+   || !bitmap_bit_p (split_point->ssa_names_to_pass,
+ SSA_NAME_VERSION (gimple_default_def (cfun,
+   parm)))))
   bitmap_set_bit (args_to_skip, num);
 else
   {
arg 

Re: [PATCH] Fix PR48648: Handle CLAST assignments.

2011-07-28 Thread Tobias Grosser

On 07/23/2011 12:01 AM, Sebastian Pop wrote:

The CLAST produced by CLooG-ISL contains an assignment and GCC chokes
on it.  The exact CLAST contains an assignment followed by an if:

scat_1 = max(0,ceild(T_4-7,8));
if (scat_1<= min(1,floord(T_4-1,8))) {
   S7(scat_1);
}

This is equivalent to a loop that iterates only once, and so CLooG
generates an assignment followed by an if instead of a loop.  This is
an important optimization that was improved in ISL, that allows
if-conversion: imagine GCC having to figure out that a loop like the
following actually iterates only once, and can be converted to an if:

for (scat_1 = max(0,ceild(T_4-7,8)); scat_1<= min(1,floord(T_4-1,8)); scat_1++)
   S7(scat_1);

This patch implements the translation of CLAST assignments.
Bootstrapped and tested on amd64-linux.
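
The equivalence claimed above can be checked with a small stand-alone sketch.  This is plain C, not graphite or CLooG code: floord and ceild are written here as ordinary floor and ceiling division for a positive divisor, and the run_* and forms_agree helpers are hypothetical names for the example.  Because T_4-7 and T_4-1 are only 6 apart, at most one multiple of 8 lies between them, so the loop body runs at most once.

```c
#include <assert.h>

/* Floor and ceiling division for a positive divisor D.  */
static int floord (int n, int d)
{
  assert (d > 0);
  return n / d - ((n % d != 0) && (n < 0));
}

static int ceild (int n, int d)
{
  assert (d > 0);
  return n / d + ((n % d != 0) && (n > 0));
}

static int max_int (int a, int b) { return a > b ? a : b; }
static int min_int (int a, int b) { return a < b ? a : b; }

/* Loop form.  Returns how often the body runs; *LAST receives the
   last value of scat_1 seen by the body.  */
int run_loop_form (int t_4, int *last)
{
  int count = 0;
  for (int scat_1 = max_int (0, ceild (t_4 - 7, 8));
       scat_1 <= min_int (1, floord (t_4 - 1, 8)); scat_1++)
    {
      *last = scat_1;
      count++;
    }
  return count;
}

/* Assignment followed by an if, as in the CLAST shown above.  */
int run_assign_if_form (int t_4, int *last)
{
  int count = 0;
  int scat_1 = max_int (0, ceild (t_4 - 7, 8));
  if (scat_1 <= min_int (1, floord (t_4 - 1, 8)))
    {
      *last = scat_1;
      count++;
    }
  return count;
}

/* Returns 1 when both forms agree for every T_4 in [LO, HI].  */
int forms_agree (int lo, int hi)
{
  for (int t = lo; t <= hi; t++)
    {
      int a = -1, b = -1;
      if (run_loop_form (t, &a) != run_assign_if_form (t, &b) || a != b)
        return 0;
    }
  return 1;
}
```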


Hi Sebastian,

thanks for adding this to graphite. One comment inline.



Sebastian

2011-07-22  Sebastian Pop

PR middle-end/48648
* graphite-clast-to-gimple.c (clast_get_body_of_loop): Handle
CLAST assignments.
(translate_clast): Same.
(translate_clast_assignment): New.

* gcc.dg/graphite/id-pr48648.c: New.
---
  gcc/ChangeLog  |8 
  gcc/graphite-clast-to-gimple.c |   49 
  gcc/testsuite/ChangeLog|5 +++
  gcc/testsuite/gcc.dg/graphite/id-pr48648.c |   21 
  4 files changed, 83 insertions(+), 0 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/graphite/id-pr48648.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9cfa21b..303c9c9 100644
+/* Translates a clast assignment STMT to gimple.
+
+   - NEXT_E is the edge where new generated code should be attached.
+   - BB_PBB_MAPPING is is a basic_block and it's related poly_bb_p mapping.  */
+
+static edge
+translate_clast_assignment (struct clast_assignment *stmt, edge next_e,
+   int level, ivs_params_p ip)
+{
+  gimple_seq stmts;
+  mpz_t v1, v2;
+  tree type, new_name, var;
+  edge res = single_succ_edge (split_edge (next_e));
+  struct clast_expr *expr = (struct clast_expr *) stmt->RHS;
+  struct clast_user_stmt *body
+= clast_get_body_of_loop ((struct clast_stmt *) stmt);


I am not a big fan of using clast_get_body_of_loop as it is buggy.
Introducing new uses of it is not something I would support. Do we
really need this?



+  poly_bb_p pbb = (poly_bb_p) cloog_statement_usr (body->statement);
+
+  mpz_init (v1);
+  mpz_init (v2);


What about some more meaningful names like bound_one, bound_two?


+  type = type_for_clast_expr (expr, ip, v1, v2);
+  var = create_tmp_var (type, "graphite_var");
+  new_name = force_gimple_operand (clast_to_gcc_expression (type, expr, ip),
+   &stmts, true, var);
+  add_referenced_var (var);
+  if (stmts)
+{
+  gsi_insert_seq_on_edge (next_e, stmts);
+  gsi_commit_edge_inserts ();
+}
+
+  compute_bounds_for_level (pbb, level, v1, v2);


Hmm. I do not completely understand all the code. But can't we get v1 and
v2 set without the need for the compute_bounds_for_level function? Is
type_for_clast_expr not setting them?


Cheers
Tobi


Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Tom de Vries
On 07/28/2011 06:25 PM, Richard Guenther wrote:
> On Thu, 28 Jul 2011, Tom de Vries wrote:
> 
>> On 07/28/2011 12:22 PM, Richard Guenther wrote:
>>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>>
 On 07/27/2011 05:27 PM, Richard Guenther wrote:
> On Wed, 27 Jul 2011, Tom de Vries wrote:
>
>> On 07/27/2011 02:12 PM, Richard Guenther wrote:
>>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>>
 On 07/27/2011 01:50 PM, Tom de Vries wrote:
> Hi Richard,
>
> I have a patch set for bug 43513 - The stack pointer is adjusted 
> twice.
>
> 01_pr43513.3.patch
> 02_pr43513.3.test.patch
> 03_pr43513.3.mudflap.patch
>
> The patch set has been bootstrapped and reg-tested on x86_64.
>
> I will send out the patches individually.
>

 The patch replaces a vla __builtin_alloca that has a constant argument 
 with an
 array declaration.

 OK for trunk?
>>>
>>> I don't think it is safe to try to get at the VLA type the way you do.
>>
>> I don't understand in what way it's not safe. Do you mean I don't manage 
>> to find
>> the type always, or that I find the wrong type, or something else?
>
> I think you might get the wrong type,

 Ok, I'll review that code one more time.

> you also do not transform code
> like
>
>   int *p = alloca(4);
>   *p = 3;
>
> as there is no array type involved here.
>

 I was trying to stay away from non-vla allocas.  A source declared alloca 
 has
 function lifetime, so we could have a single alloca in a loop, called 10
 times,
 with all 10 instances live at the same time. This patch does not detect 
 such
 cases, and thus stays away from non-vla allocas. A vla decl does not have 
 such
 problems, the lifetime ends when it goes out of scope.
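
The lifetime difference described above can be seen in a short sketch.  It assumes a platform that provides <alloca.h> (for example glibc); the function names and N_BLOCKS are hypothetical.  Every alloca block obtained in the loop stays live until the function returns, whereas the VLA's storage is released at the end of each iteration's scope.

```c
#include <alloca.h>
#include <assert.h>

enum { N_BLOCKS = 10 };

/* One alloca call site executed N_BLOCKS times: all blocks are live at
   the same time.  Returns how many still hold their value afterwards.  */
int count_live_alloca_blocks (void)
{
  int *blocks[N_BLOCKS];
  int live = 0;

  for (int i = 0; i < N_BLOCKS; i++)
    {
      blocks[i] = alloca (sizeof (int));
      *blocks[i] = i;
    }

  for (int i = 0; i < N_BLOCKS; i++)
    if (*blocks[i] == i)
      live++;
  return live;
}

/* A VLA declared in the loop body: its lifetime ends at the closing
   brace, so only one instance exists at any time.  */
int sum_with_vla (int n)
{
  int total = 0;

  assert (n > 0);
  for (int i = 0; i < N_BLOCKS; i++)
    {
      int scratch[n];
      for (int j = 0; j < n; j++)
        scratch[j] = i;
      total += scratch[n - 1];
    }
  return total;
}
```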
>>>
>>> Yes indeed - that probably would require more detailed analysis.
>>>
>>> In fact I would simply do sth like
>>>
>>>   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>>>   n_elem = size * 8 / BITS_PER_UNIT;
>>>   array_type = build_array_type_nelts (elem_type, n_elem);
>>>   var = create_tmp_var (array_type, NULL);
>>>   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
>>>
>>
>> I tried this code on the example, and it works, but the newly declared 
>> type has
>> an 8-bit alignment, while the vla base type has a 32 bit alignment.  
>> This makes
>> the memory access in the example potentially unaligned, which prohibits 
>> an
>> ivopts optimization, so the resulting text size is 68 instead of the 64 
>> achieved
>> with my current patch.
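
A C-level analogue of the alignment concern above, as a rough sketch only: an array of bytes has the alignment of its element type, so accessing it as int is not known to be aligned unless extra alignment is requested.  This uses C11 _Alignof and _Alignas and does not model the GCC-internal distinction between TYPE_ALIGN and DECL_ALIGN discussed in this thread; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Alignment of a 40-byte array type: that of its element, i.e. 1.  */
size_t byte_array_alignment (void)
{
  return _Alignof (unsigned char[40]);
}

/* Requesting int alignment on the object makes int-sized accesses
   through it aligned.  Returns 1 if the buffer address is a multiple
   of the alignment of int.  */
int aligned_buffer_ok (void)
{
  _Alignas (int) unsigned char buf[40];

  buf[0] = 0;
  return ((uintptr_t) buf % _Alignof (int)) == 0;
}
```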
>
> Ok, so then set DECL_ALIGN of the variable to something reasonable
> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
> alignment that the targets alloca function would guarantee.
>

 I tried that, but that doesn't help. It's the alignment of the type that
 matters, not of the decl.
>>>
>>> It shouldn't.  All accesses are performed with the original types and
>>> alignment comes from that (plus the underlying decl).
>>>
>>
>> I managed to get it all working by using build_aligned_type rather than
>> DECL_ALIGN.
> 
> That's really odd, DECL_ALIGN should just work - nothing refers to the
> type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to 
> 1 maybe?
> 

This doesn't work either.

  /* Declare array.  */
  elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
  n_elem = size * 8 / BITS_PER_UNIT;
  align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
  array_type = build_array_type_nelts (elem_type, n_elem);
  var = create_tmp_var (array_type, NULL);
  DECL_ALIGN (var) = align;
  DECL_USER_ALIGN (var) = 1;

Maybe this clarifies it:

Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
/home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
(gdb) call debug_generic_expr (ref)
MEM[(int[0:D.2579] *)&D.2595][0]
(gdb) call debug_generic_expr (step)
4

1627  base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
(gdb) call debug_generic_expr (base)
D.2595

1629  base_type = TREE_TYPE (base);
(gdb) call debug_generic_expr (base_type)
[40]

1630  base_align = TYPE_ALIGN (base_type);
(gdb) p base_align
$1 = 8

So the align is 8-bits, and we return true here:

(gdb) n
1632  if (mode != BLKmode)
(gdb) n
1634  unsigned mode_align = GET_MODE_ALIGNMENT (mode);
(gdb)
1636  if (base_align < mode_align
(gdb)
1639return true;


Here we can see that the base actually has the (user) align on it:

(gdb) call debug_tree (base)
 
unit size 
align 8 symtab 0 alias set -1 canonical type 0xf7e1b2a0 precision 8
min  max 
pointer_to

[pph] Free buffers used during tree encoding/decoding

2011-07-28 Thread Diego Novillo
Noticed this while debugging the new tree encoding cache.  No
functional changes.  This frees the memory used by the buffers
used during tree streaming.  It also moves the reader and writer
data into a union to better distinguish them.
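
A minimal sketch of the layout idea, assuming nothing about the real pph_stream beyond the encoder.r.file_data and encoder.r.file_size accesses visible in the diff below: reader-only and writer-only fields sit in separate structures inside a union, and closing the stream frees whichever side is in use.  The demo_* names and the writer fields are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a stream that is either a reader or a
   writer, never both.  */
struct demo_stream
{
  bool write_p;
  union
  {
    struct { char *buf; size_t len; } w;               /* writer side */
    struct { char *file_data; size_t file_size; } r;   /* reader side */
  } encoder;
};

/* Open a reader whose buffer holds a copy of DATA.  */
struct demo_stream *demo_open_reader (const char *data, size_t len)
{
  struct demo_stream *s = calloc (1, sizeof *s);

  assert (s != NULL);
  s->write_p = false;
  s->encoder.r.file_size = len;
  s->encoder.r.file_data = malloc (len ? len : 1);
  assert (s->encoder.r.file_data != NULL);
  memcpy (s->encoder.r.file_data, data, len);
  return s;
}

/* Free the buffer belonging to the active side, then the stream.  */
void demo_close (struct demo_stream *s)
{
  if (s->write_p)
    free (s->encoder.w.buf);
  else
    free (s->encoder.r.file_data);
  free (s);
}
```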

Tested on x86_64.


Diego.


* pph-streamer.h (pph_stream): Move fields OB, OUT_STATE,
DECL_STATE_STREAM, IB, DATA_IN, PPH_SECTIONS, FILE_DATA and
FILE_SIZE into a union of structures.
Update all users.
* pph-streamer.c (pph_stream_close): Free memory used by tree
encoding routines.

Index: cp/pph-streamer-in.c
===
--- cp/pph-streamer-in.c(revision 176879)
+++ cp/pph-streamer-in.c(working copy)
@@ -109,8 +109,8 @@ pph_get_section_data (struct lto_file_de
 {
   /* FIXME pph - Stop abusing lto_file_decl_data fields.  */
   const pph_stream *stream = (const pph_stream *) file_data->file_name;
-  *len = stream->file_size - sizeof (pph_file_header);
-  return (const char *) stream->file_data + sizeof (pph_file_header);
+  *len = stream->encoder.r.file_size - sizeof (pph_file_header);
+  return (const char *) stream->encoder.r.file_data + sizeof (pph_file_header);
 }
 
 
@@ -119,14 +119,14 @@ pph_get_section_data (struct lto_file_de
 
 static void
 pph_free_section_data (struct lto_file_decl_data *file_data,
-  enum lto_section_type section_type ATTRIBUTE_UNUSED,
-  const char *name ATTRIBUTE_UNUSED,
-  const char *offset ATTRIBUTE_UNUSED,
-  size_t len ATTRIBUTE_UNUSED)
+  enum lto_section_type section_type ATTRIBUTE_UNUSED,
+  const char *name ATTRIBUTE_UNUSED,
+  const char *offset ATTRIBUTE_UNUSED,
+  size_t len ATTRIBUTE_UNUSED)
 {
   /* FIXME pph - Stop abusing lto_file_decl_data fields.  */
   const pph_stream *stream = (const pph_stream *) file_data->file_name;
-  free (stream->file_data);
+  free (stream->encoder.r.file_data);
 }
 
 
@@ -145,46 +145,48 @@ pph_init_read (pph_stream *stream)
 
   lto_reader_init ();
 
-  /* Read STREAM->NAME into the memory buffer STREAM->FILE_DATA.
- FIXME pph, we are reading the whole file at once.  This seems
- wasteful.  */
+  /* Read STREAM->NAME into the memory buffer stream->encoder.r.file_data.  */
   retcode = fstat (fileno (stream->file), &st);
   gcc_assert (retcode == 0);
-  stream->file_size = (size_t) st.st_size;
-  stream->file_data = XCNEWVEC (char, stream->file_size);
-  bytes_read = fread (stream->file_data, 1, stream->file_size, stream->file);
-  gcc_assert (bytes_read == stream->file_size);
+  stream->encoder.r.file_size = (size_t) st.st_size;
+  stream->encoder.r.file_data = XCNEWVEC (char, stream->encoder.r.file_size);
+  bytes_read = fread (stream->encoder.r.file_data, 1,
+ stream->encoder.r.file_size, stream->file);
+  gcc_assert (bytes_read == stream->encoder.r.file_size);
 
   /* Set LTO callbacks to read the PPH file.  */
-  stream->pph_sections = XCNEWVEC (struct lto_file_decl_data *,
-  PPH_NUM_SECTIONS);
+  stream->encoder.r.pph_sections = XCNEWVEC (struct lto_file_decl_data *,
+PPH_NUM_SECTIONS);
   for (i = 0; i < PPH_NUM_SECTIONS; i++)
 {
-  stream->pph_sections[i] = XCNEW (struct lto_file_decl_data);
+  stream->encoder.r.pph_sections[i] = XCNEW (struct lto_file_decl_data);
   /* FIXME pph - Stop abusing fields in lto_file_decl_data.  */
-  stream->pph_sections[i]->file_name = (const char *) stream;
+  stream->encoder.r.pph_sections[i]->file_name = (const char *) stream;
 }
 
-  lto_set_in_hooks (stream->pph_sections, pph_get_section_data,
+  lto_set_in_hooks (stream->encoder.r.pph_sections, pph_get_section_data,
pph_free_section_data);
 
-  header = (pph_file_header *) stream->file_data;
+  header = (pph_file_header *) stream->encoder.r.file_data;
   strtab = (const char *) header + sizeof (pph_file_header);
   strtab_size = header->strtab_size;
   body = strtab + strtab_size;
-  gcc_assert (stream->file_size >= strtab_size + sizeof (pph_file_header));
-  body_size = stream->file_size - strtab_size - sizeof (pph_file_header);
+  gcc_assert (stream->encoder.r.file_size
+ >= strtab_size + sizeof (pph_file_header));
+  body_size = stream->encoder.r.file_size
+ - strtab_size - sizeof (pph_file_header);
 
   /* Create an input block structure pointing right after the string
  table.  */
-  stream->ib = XCNEW (struct lto_input_block);
-  LTO_INIT_INPUT_BLOCK_PTR (stream->ib, body, 0, body_size);
-  stream->data_in = lto_data_in_create (stream->pph_sections[0], strtab,
-strtab_size, NULL);
+  stream->encoder.r.ib = XCNEW (struct lto_input_block);
+  LTO_INIT_INPUT_BLOCK_PTR (stream->encoder.r.ib, body, 0, body_size);
+  stream->en

Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread Gabriel Charette
The last_listed field in struct line_maps was never used, so remove it.

Gab

2011-07-28  Gabriel Charette  

* libcpp/include/line-map.h (struct line_maps):
  Remove unused field last_listed.

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 3234423..f1d5bee 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -76,11 +76,6 @@ struct GTY(()) line_maps {
 
   unsigned int cache;
 
-  /* The most recently listed include stack, if any, starts with
- LAST_LISTED as the topmost including file.  -1 indicates nothing
- has been listed yet.  */
-  int last_listed;
-
   /* Depth of the include stack, including the current file.  */
   unsigned int depth;
 
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 86e2484..dd3f11c 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -34,7 +34,6 @@ linemap_init (struct line_maps *set)
   set->maps = NULL;
   set->allocated = 0;
   set->used = 0;
-  set->last_listed = -1;
   set->trace_includes = false;
   set->depth = 0;
   set->cache = 0;

--
This patch is available for review at http://codereview.appspot.com/4810058


Re: Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread gchare

Forgot to mention:

Tested with bootstrap build and full regression testing.

http://codereview.appspot.com/4810058/


Re: PING: PATCH [4/n]: Prepare x32: Permute the conversion and addition if one operand is a constant

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 7:59 PM, H.J. Lu  wrote:

>>> >  convert_memory_address_addr_space has a special PLUS/MULT case for
>>> >  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
>>> >  for all Pmode != ptr_mode cases.  OK for trunk?
>>> >  2011-06-11  H.J. Lu
>>> >
>>> >         PR middle-end/47727
>>> >         * explow.c (convert_memory_address_addr_space): Permute the
>>> >         conversion and addition if one operand is a constant.
>>>
>>> Do we still need this patch? With recent target changes the testcase
>>> from PR can be compiled without problems with a gcc from an unpatched
>>> trunk.
>>
>> Given the communication difficulties, I hope not...
>>
>> Paolo
>>
>
> Here is the updated patch.  OK for trunk?

Did you see the question two levels up the thread you are replying to?

Uros.


PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:05 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 7:59 PM, H.J. Lu  wrote:
>
 >  convert_memory_address_addr_space has a special PLUS/MULT case for
 >  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
 >  for all Pmode != ptr_mode cases.  OK for trunk?
 >  2011-06-11  H.J. Lu
 >
 >         PR middle-end/47727
 >         * explow.c (convert_memory_address_addr_space): Permute the
 >         conversion and addition if one operand is a constant.

 Do we still need this patch? With recent target changes the testcase
 from PR can be compiled without problems with a gcc from an unpatched
 trunk.
>>>
>>> Given the communication difficulties, I hope not...
>>>
>>> Paolo
>>>
>>
>> Here is the updated patch.  OK for trunk?
>
> Did you see the question two levels up the thread you are replying to?
>

The patch is for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721

I changed the thread subject.

-- 
H.J.


PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-28 Thread H.J. Lu
Hi,

This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?

Thanks.


H.J.
---
2011-07-28  H.J. Lu  

PR target/47766
* config/i386/i386.md (PTR): New.
(stack_protect_set): Check TARGET_LP64 instead of TARGET_64BIT.
(stack_protect_test): Likewise.
(stack_protect_set_): Replace ":P" with ":PTR".
(stack_tls_protect_set_): Likewise.
(stack_tls_protect_test_): Likewise.
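
To illustrate why the expanders test TARGET_LP64 rather than TARGET_64BIT, here is a toy model in C.  It is not GCC code: the demo_* names are hypothetical, and the only facts it relies on are that x32 is a 64-bit target whose pointers (ptr_mode) are 32 bits wide, while LP64 has 64-bit pointers.  Selecting by "64-bit" picks the DI variant on x32, whereas selecting by "LP64" picks the SI variant that matches ptr_mode.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_mode { DEMO_SI, DEMO_DI };

/* Toy target description.  */
struct demo_target
{
  bool is_64bit;   /* stands in for TARGET_64BIT */
  bool is_lp64;    /* stands in for TARGET_LP64 */
};

static const struct demo_target demo_ia32 = { false, false };
static const struct demo_target demo_x32  = { true,  false };
static const struct demo_target demo_lp64 = { true,  true  };

/* Old selection: keyed on the target being 64-bit.  */
enum demo_mode pick_by_64bit (const struct demo_target *t)
{
  return t->is_64bit ? DEMO_DI : DEMO_SI;
}

/* New selection: keyed on pointers being 64 bits wide.  */
enum demo_mode pick_by_lp64 (const struct demo_target *t)
{
  return t->is_lp64 ? DEMO_DI : DEMO_SI;
}
```

The two selections only disagree for the x32 entry, which is the case this patch is about.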

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f33b8a0..f4717b5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -951,6 +951,11 @@
 ;; This mode iterator allows :P to be used for patterns that operate on
 ;; pointer-sized quantities.  Exactly one of the two alternatives will match.
 (define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
+
+;; This mode iterator allows :PTR to be used for patterns that operate on
+;; ptr_mode sized quantities.
+(define_mode_iterator PTR
+  [(SI "ptr_mode == SImode") (DI "ptr_mode == DImode")])
 
 ;; Scheduling descriptions
 
@@ -17347,11 +17379,11 @@
 
 #ifdef TARGET_THREAD_SSP_OFFSET
   operands[1] = GEN_INT (TARGET_THREAD_SSP_OFFSET);
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_tls_protect_set_di
  : gen_stack_tls_protect_set_si);
 #else
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_protect_set_di
  : gen_stack_protect_set_si);
 #endif
@@ -17361,19 +17393,20 @@
 })
 
 (define_insn "stack_protect_set_"
-  [(set (match_operand:P 0 "memory_operand" "=m")
-   (unspec:P [(match_operand:P 1 "memory_operand" "m")] UNSPEC_SP_SET))
-   (set (match_scratch:P 2 "=&r") (const_int 0))
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")]
+   UNSPEC_SP_SET))
+   (set (match_scratch:PTR 2 "=&r") (const_int 0))
(clobber (reg:CC FLAGS_REG))]
   ""
   "mov{}\t{%1, %2|%2, %1}\;mov{}\t{%2, %0|%0, 
%2}\;xor{l}\t%k2, %k2"
   [(set_attr "type" "multi")])
 
 (define_insn "stack_tls_protect_set_"
-  [(set (match_operand:P 0 "memory_operand" "=m")
-   (unspec:P [(match_operand:P 1 "const_int_operand" "i")]
- UNSPEC_SP_TLS_SET))
-   (set (match_scratch:P 2 "=&r") (const_int 0))
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 1 "const_int_operand" "i")]
+   UNSPEC_SP_TLS_SET))
+   (set (match_scratch:PTR 2 "=&r") (const_int 0))
(clobber (reg:CC FLAGS_REG))]
   ""
   "mov{}\t{%@:%P1, %2|%2,  PTR 
%@:%P1}\;mov{}\t{%2, %0|%0, %2}\;xor{l}\t%k2, %k2"
@@ -17391,11 +17424,11 @@
 
 #ifdef TARGET_THREAD_SSP_OFFSET
   operands[1] = GEN_INT (TARGET_THREAD_SSP_OFFSET);
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_tls_protect_test_di
  : gen_stack_tls_protect_test_si);
 #else
-  insn = (TARGET_64BIT
+  insn = (TARGET_LP64
  ? gen_stack_protect_test_di
  : gen_stack_protect_test_si);
 #endif
@@ -17409,20 +17442,20 @@
 
 (define_insn "stack_protect_test_"
   [(set (match_operand:CCZ 0 "flags_reg_operand" "")
-   (unspec:CCZ [(match_operand:P 1 "memory_operand" "m")
-(match_operand:P 2 "memory_operand" "m")]
+   (unspec:CCZ [(match_operand:PTR 1 "memory_operand" "m")
+(match_operand:PTR 2 "memory_operand" "m")]
UNSPEC_SP_TEST))
-   (clobber (match_scratch:P 3 "=&r"))]
+   (clobber (match_scratch:PTR 3 "=&r"))]
   ""
   "mov{}\t{%1, %3|%3, %1}\;xor{}\t{%2, %3|%3, %2}"
   [(set_attr "type" "multi")])
 
 (define_insn "stack_tls_protect_test_"
   [(set (match_operand:CCZ 0 "flags_reg_operand" "")
-   (unspec:CCZ [(match_operand:P 1 "memory_operand" "m")
-(match_operand:P 2 "const_int_operand" "i")]
+   (unspec:CCZ [(match_operand:PTR 1 "memory_operand" "m")
+(match_operand:PTR 2 "const_int_operand" "i")]
UNSPEC_SP_TLS_TEST))
-   (clobber (match_scratch:P 3 "=r"))]
+   (clobber (match_scratch:PTR 3 "=r"))]
   ""
   "mov{}\t{%1, %3|%3, %1}\;xor{}\t{%@:%P2, %3|%3, 
 PTR %@:%P2}"
   [(set_attr "type" "multi")])


Re: [patch] attribute to reverse bitfield allocations

2011-07-28 Thread DJ Delorie

> Seeing little opposition, I plod further...  now with documentation
> and a test case.  OK yet?

Ping?

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:03 PM, H.J. Lu  wrote:

>> So, instead of huge complications with new mode iterator, just
>> introduce two new patterns that will shadow existing ones for
>> TARGET_X32.
>>
>> Like in attached (untested) patch.
>>
>
> I tried the following patch with typos fixed.  It almost worked,
> except for this failure in glibc testsuite:
>
> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
> -c -f $charmap -i $input ${common_objpfx}localedata/$out
> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" 
> failed
> make[4]: *** 
> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
> Error 1
>
> I will add:
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 8723dc5..d32d64d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>  {
>   rtx tp, reg, insn;
>
> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  if (ptr_mode != Pmode)
> +    tp = convert_to_mode (Pmode, tp, 1);
>   if (!to_reg)
>     return tp;
>
> since TP must be 32bit.

 No, this won't have the desired effect. It will change the UNSPEC, so
 it won't match patterns in i386.md.

 Can you debug the failure a bit more? With my patterns, add{l} and
 mov{l} should clear top 32bits.

>>>
>>> TP is 32bit in x32.  For load_tp_x32, we load SImode value and
>>> zero-extend to DImode. For add_tp_x32, we are adding SImode
>>> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
>>> must take SImode TP.
>>>
>>
>> I will see what I can do.
>>
>
> Here is the updated patch to use 32bit TP for x32.

Why??

This part makes no sense:

-  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
+  if (ptr_mode != Pmode)
+tp = convert_to_mode (Pmode, tp, 1);

You will create zero_extend (unspec ...), that won't be matched by any pattern.

Can you please explain how this pattern is different from the DImode
pattern proposed in my patch?

+(define_insn "*load_tp_x32"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])

vs:

+(define_insn "*load_tp_x32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec:DI [(const_int 0)] UNSPEC_TP))]
+  "TARGET_X32"
+  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
+  [(set_attr "type" "imov")
+   (set_attr "modrm" "0")
+   (set_attr "length" "7")
+   (set_attr "memory" "load")
+   (set_attr "imm_disp" "false")])

Uros.


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:09 PM, H.J. Lu  wrote:

> >  convert_memory_address_addr_space has a special PLUS/MULT case for
> >  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
> >  for all Pmode != ptr_mode cases.  OK for trunk?
> >  2011-06-11  H.J. Lu
> >
> >         PR middle-end/47727
> >         * explow.c (convert_memory_address_addr_space): Permute the
> >         conversion and addition if one operand is a constant.
>
> Do we still need this patch? With recent target changes the testcase
> from PR can be compiled without problems with a gcc from an unpatched
> trunk.

 Given the communication difficulties, I hope not...

 Paolo

>>>
>>> Here is the updated patch.  OK for trunk?
>>
>> Did you see the question two levels up the thread you are replying to?
>>
>
> The patch is for
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>
> I changed the thread subject.

Please add testcase to see the patch in action.

Uros.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:21 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:03 PM, H.J. Lu  wrote:
>
>>> So, instead of huge complications with new mode iterator, just
>>> introduce two new patterns that will shadow existing ones for
>>> TARGET_X32.
>>>
>>> Like in attached (untested) patch.
>>>
>>
>> I tried the following patch with typos fixed.  It almost worked,
>> except for this failure in glibc testsuite:
>>
>> gen-locale.sh: line 27: 14755 Aborted                 (core dumped)
>> I18NPATH=. GCONV_PATH=${common_objpfx}iconvdata ${localedef} --quiet
>> -c -f $charmap -i $input ${common_objpfx}localedata/$out
>> Charmap: "ISO-8859-1" Inputfile: "nb_NO" Outputdir: "nb_NO.ISO-8859-1" 
>> failed
>> make[4]: *** 
>> [/export/build/gnu/glibc-x32/build-x86_64-linux/localedata/nb_NO.ISO-8859-1/LC_CTYPE]
>> Error 1
>>
>> I will add:
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 8723dc5..d32d64d 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -12120,7 +12120,9 @@ get_thread_pointer (bool to_reg)
>>  {
>>   rtx tp, reg, insn;
>>
>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  if (ptr_mode != Pmode)
>> +    tp = convert_to_mode (Pmode, tp, 1);
>>   if (!to_reg)
>>     return tp;
>>
>> since TP must be 32bit.
>
> No, this won't have the desired effect. It will change the UNSPEC, so
> it won't match patterns in i386.md.
>
> Can you debug the failure a bit more? With my patterns, add{l} and
> mov{l} should clear top 32bits.
>

 TP is 32bit in x32.  For load_tp_x32, we load SImode value and
 zero-extend to DImode. For add_tp_x32, we are adding SImode
 value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
 must take SImode TP.

>>>
>>> I will see what I can do.
>>>
>>
>> Here is the updated patch to use 32bit TP for x32.
>
> Why??
>
> This part makes no sense:
>
> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
> +  if (ptr_mode != Pmode)
> +    tp = convert_to_mode (Pmode, tp, 1);
>
> You will create zero_extend (unspec ...), that won't be matched by any 
> pattern.

No.  I created zero_extend from (reg:SI) to (reg:DI).

> Can you please explain how this pattern is different from the DImode
> pattern proposed in my patch?
>
> +(define_insn "*load_tp_x32"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +       (unspec:SI [(const_int 0)] UNSPEC_TP))]
> +  "TARGET_X32"
> +  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
> +  [(set_attr "type" "imov")
> +   (set_attr "modrm" "0")
> +   (set_attr "length" "7")
> +   (set_attr "memory" "load")
> +   (set_attr "imm_disp" "false")])
>
> vs:
>
> +(define_insn "*load_tp_x32"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +       (unspec:DI [(const_int 0)] UNSPEC_TP))]

That is wrong since the source (TP) is 32bit.  This pattern tells the
compiler the source is 64bit.

> +  "TARGET_X32"
> +  "mov{l}\t{%%fs:0, %k0|%k0, DWORD PTR fs:0}"
> +  [(set_attr "type" "imov")
> +   (set_attr "modrm" "0")
> +   (set_attr "length" "7")
> +   (set_attr "memory" "load")
> +   (set_attr "imm_disp" "false")])
>

I will try zero_extend to see if it works.
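
For readers following the SImode versus DImode argument, the sketch below shows in plain C what zero-extending a 32-bit value to 64 bits means, with sign extension alongside for contrast.  It only illustrates the arithmetic; it says nothing about how the RTL patterns under discussion are matched, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extension: the upper 32 bits of the result are cleared.  */
uint64_t zero_extend_si_to_di (uint32_t value)
{
  return (uint64_t) value;
}

/* Sign-extension: the upper 32 bits copy bit 31 of the input.  */
uint64_t sign_extend_si_to_di (uint32_t value)
{
  return (uint64_t) (int64_t) (int32_t) value;
}
```

For a 32-bit thread pointer value such as 0xFFFFF000, only the zero-extended form yields an address whose upper half is clear.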

Thanks.


-- 
H.J.


Re: PATCH: PR target/47715: [x32] Use SImode for thread pointer

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:30 PM, H.J. Lu  wrote:

> TP is 32bit in x32.  For load_tp_x32, we load SImode value and
> zero-extend to DImode. For add_tp_x32, we are adding SImode
> value.  We can't pretend TP is 64bit.  load_tp_x32 and add_tp_x32
> must take SImode TP.
>

 I will see what I can do.

>>>
>>> Here is the updated patch to use 32bit TP for x32.
>>
>> Why??
>>
>> This part makes no sense:
>>
>> -  tp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  tp = gen_rtx_UNSPEC (ptr_mode, gen_rtvec (1, const0_rtx), UNSPEC_TP);
>> +  if (ptr_mode != Pmode)
>> +    tp = convert_to_mode (Pmode, tp, 1);
>>
>> You will create zero_extend (unspec ...), that won't be matched by any 
>> pattern.
>
> No.  I created zero_extend from (reg:SI) to (reg:DI).
>
>> Can you please explain, how is this pattern different than DImode
>> pattern, proposed in my patch?
>>
>> +(define_insn "*load_tp_x32"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +       (unspec:SI [(const_int 0)] UNSPEC_TP))]
>> +  "TARGET_X32"
>> +  "mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}"
>> +  [(set_attr "type" "imov")
>> +   (set_attr "modrm" "0")
>> +   (set_attr "length" "7")
>> +   (set_attr "memory" "load")
>> +   (set_attr "imm_disp" "false")])
>>
>> vs:
>>
>> +(define_insn "*load_tp_x32"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +       (unspec:DI [(const_int 0)] UNSPEC_TP))]
>
> That is wrong since the source (TP) is 32bit.  This pattern tells the
> compiler that the source is 64bit.

Where?

Uros.


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:23 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:09 PM, H.J. Lu  wrote:
>
>> >  convert_memory_address_addr_space has a special PLUS/MULT case for
>> >  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
>> >  for all Pmode != ptr_mode cases.  OK for trunk?
>> >  2011-06-11  H.J. Lu
>> >
>> >          PR middle-end/47727
>> >          * explow.c (convert_memory_address_addr_space): Permute the
>> >          conversion and addition if one operand is a constant.
>>
>> Do we still need this patch? With recent target changes the testcase
>> from PR can be compiled without problems with a gcc from an unpatched
>> trunk.
>
> Given the communication difficulties, I hope not...
>
> Paolo
>

 Here is the updated patch.  OK for trunk?
>>>
>>> Did you see the question two levels up the thread you are replying to?
>>>
>>
>> The patch is for
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>>
>> I changed the thread subject.
>
> Please add testcase to see the patch in action.
>

I haven't found a testcase yet.  The problem was discovered in
this thread:

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01065.html


-- 
H.J.


Re: [PATCH] Fix PR48648: Handle CLAST assignments.

2011-07-28 Thread Tobias Grosser

On 07/28/2011 06:56 PM, Sebastian Pop wrote:

Hi Tobi,

On Thu, Jul 28, 2011 at 12:13, Tobias Grosser  wrote:

+  struct clast_user_stmt *body
+    = clast_get_body_of_loop ((struct clast_stmt *) stmt);


I am not a big fan of using clast_get_body_of_loop as it is buggy.
Introducing new uses of it is not something I would support. Do we really
need this?


No, because of ...




+  poly_bb_p pbb = (poly_bb_p) cloog_statement_usr (body->statement);


What about some more meaningful names like bound_one, bound_two?


Ok, see the second patch attached.


+
+  compute_bounds_for_level (pbb, level, v1, v2);


Mh. I do not completely understand all the code. But can't we get v1 and v2
set without the need for the compute_bounds_for_level function? Is the
type_for_clast_expression not setting them?



... this.
You are right.  type_for_clast_expr would provide the bounds for the
RHS of the assign and so we don't need to compute the bounds on
the loop level, as we would have done on a real loop.  Attached the
amended patch.  I'm regstrapping these patches on amd64-linux.
Ok for trunk after?


Looks good to me. Please commit if it passes regstrapping.

Cheers
Tobi


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:27 AM, Alan Modra wrote:
> On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
>> Ideally what I'd like to
>> do is have ld and gcc emit accurate r2 tracking unwind info and
>> dispense with hacks like frob_update_context.  If ld did emit accurate
>> unwind info for .glink, then the justification for frob_update_context
>> disappears.
> 
> For the record, this statement of mine doesn't make sense.  A .glink
> stub doesn't make a frame, so a backtrace won't normally pass through a
> stub, thus having accurate unwind info for .glink doesn't help at all.

It does, for the duration of the stub.

The whole problem is that toc pointer copy in 40(1) is only valid
during indirect call sequences, and iff ld inserted a stub?  I.e.
direct calls between functions that share toc pointers never save
the copy?

Would it make sense, if a function has any indirect call, to move
the toc pointer save into the prologue?  You'd get to avoid that
store all the time.  Of course you'd not be able to sink the load
after the call, but it might still be a win.  And in that special
case you can annotate the r2 save slot just once, correctly.

For functions that do not contain an indirect function call, I
don't believe that there's any way to use DW_CFA_offset that
is always correct.

One could, however, move the code in frob_update_context into a
(series of) DW_CFA_val_expression's.

  DW_CFA_val_expression
    DW_OP_reg2                // Default to the value currently in R2
    DW_OP_regx LR             // Test the insn following the call, as per frob_update_context
    DW_OP_deref_size 4
    DW_OP_const4u 0xE8410028
    DW_OP_ne
    DW_OP_bra L1
    DW_OP_drop                // Could be omitted, given that we only examine top-of-stack at the end
    DW_OP_breg1 40            // Pull the value from *(R1+40)
    DW_OP_deref
  L1:

This version could appear in the CIE.  You'd have to adjust it
once LR gets saved to the stack, and R2 isn't itself being saved
as per above.
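
To make the control flow of that expression easier to follow, here is a small C model of what it computes. This is only a sketch: struct frame_view and caller_r2 are invented for the illustration and are not part of the unwinder, and the model assumes the insn word and the save slot have already been read from memory.

```c
#include <stdint.h>

/* The constant the expression compares against (DW_OP_const4u).  */
#define TOC_RESTORE_INSN 0xE8410028u

/* Hypothetical snapshot of the state the expression reads.  */
struct frame_view
{
  uint64_t r2;          /* Value currently in R2.  */
  uint32_t insn_at_lr;  /* 4-byte insn found at the address in LR.  */
  uint64_t saved_toc;   /* Contents of *(R1+40).  */
};

/* Mirror of the expression: keep R2 unless the insn following the
   call is the tested one, in which case use the saved copy.  */
uint64_t
caller_r2 (const struct frame_view *f)
{
  if (f->insn_at_lr != TOC_RESTORE_INSN)
    return f->r2;        /* DW_OP_bra taken, R2 stays on top.  */
  return f->saved_toc;   /* DW_OP_breg1 40; DW_OP_deref.  */
}
```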

There isn't currently a hook in dwarf2cfi to add extra stuff to
the CIE program, but that wouldn't be hard to add.  The version
that gets emitted after LR is saved would need a new note as well.
But it all seems fairly tractable to actually implement, if we
think it'll actually solve the problem.


r~


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:32 PM, H.J. Lu  wrote:

>>> >  convert_memory_address_addr_space has a special PLUS/MULT case for
>>> >  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
>>> >  for all Pmode != ptr_mode cases.  OK for trunk?
>>> >  2011-06-11  H.J. Lu
>>> >
>>> >          PR middle-end/47727
>>> >          * explow.c (convert_memory_address_addr_space): Permute the
>>> >          conversion and addition if one operand is a constant.
>>>
>>> Do we still need this patch? With recent target changes the testcase
>>> from PR can be compiled without problems with a gcc from an unpatched
>>> trunk.
>>
>> Given the communication difficulties, I hope not...
>>
>> Paolo
>>
>
> Here is the updated patch.  OK for trunk?

 Did you see the question two levels up the thread you are replying to?

>>>
>>> The patch is for
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>>>
>>> I changed the thread subject.
>>
>> Please add testcase to see the patch in action.
>>
>
> I haven't found a testcase yet.  The problem was discovered in
> this thread:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01065.html

This was before x32 could handle SImode addresses. With recent x86
target work, this is no longer true, and SImode and DImode addresses are
first-class citizens as far as x32 backend is concerned. Please note
that original testcase (that this whole patch is all about) now
compiles without problems. Also, middle end is shared with at least
two ptr_mode != Pmode targets, and they all work well. So, to see what
makes x32 special, we need a testcase that breaks _WITHOUT_ your
proposed patch. Without testcase, nobody can analyze your approach and
tell if the approach is the right one, if this is in fact target
problem, or indeed a middle-end problem.

And there is no point in flooding the mailing list with patches.

Uros.


Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-28 Thread Uros Bizjak
On Thu, Jul 28, 2011 at 8:13 PM, H.J. Lu  wrote:

> This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?

http://gcc.gnu.org/contribute.html#patches

Uros.


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:49 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:32 PM, H.J. Lu  wrote:
>
 >  convert_memory_address_addr_space has a special PLUS/MULT case for
 >  POINTERS_EXTEND_UNSIGNED < 0.  It turns out that it is also needed
 >  for all Pmode != ptr_mode cases.  OK for trunk?
 >  2011-06-11  H.J. Lu
 >
 >          PR middle-end/47727
 >          * explow.c (convert_memory_address_addr_space): Permute the
 >          conversion and addition if one operand is a constant.

 Do we still need this patch? With recent target changes the testcase
 from PR can be compiled without problems with a gcc from an unpatched
 trunk.
>>>
>>> Given the communication difficulties, I hope not...
>>>
>>> Paolo
>>>
>>
>> Here is the updated patch.  OK for trunk?
>
> Did you see the question two levels up the thread you are replying to?
>

 The patch is for

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721

 I changed the thread subject.
>>>
>>> Please add testcase to see the patch in action.
>>>
>>
>> I haven't found a testcase yet.  The problem was discovered in
>> this thread:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01065.html
>
> This was before x32 could handle SImode addresses. With recent x86
> target work, this is no longer true, and SImode and DImode addresses are
> first-class citizens as far as x32 backend is concerned. Please note
> that original testcase (that this whole patch is all about) now
> compiles without problems. Also, middle end is shared with at least
> two ptr_mode != Pmode targets, and they all work well. So, to see what
> makes x32 special, we need a testcase that breaks _WITHOUT_ your
> proposed patch. Without testcase, nobody can analyze your approach and
> tell if the approach is the right one, if this is in fact target
> problem, or indeed a middle-end problem.
>

There are 2 issues

1. rtl_hooks.gen_lowpart_no_emit vs gen_lowpart. simplify-rtx.c shouldn't
generate any new insns.
2. convert_memory_address_addr_space shouldn't permute conversion and
addition.
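
For readers following issue 2, a minimal C sketch of why the order of conversion and addition can matter when the narrow mode is 32 bits and the wide mode is 64 bits.  The function names are invented for the example, and the real code in explow.c works on RTL, not on C integers.

```c
#include <stdint.h>

/* Add in 32 bits first, then zero-extend the (possibly wrapped)
   result to 64 bits.  */
uint64_t
add_then_extend (uint32_t base, uint32_t offset)
{
  return (uint64_t) (uint32_t) (base + offset);
}

/* Zero-extend the base first, then add in 64 bits.  No wrap-around
   at 2^32 happens here.  */
uint64_t
extend_then_add (uint32_t base, uint32_t offset)
{
  return (uint64_t) base + (uint64_t) offset;
}
```

The two agree as long as the 32bit addition does not wrap, and differ by 2^32 when it does.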


-- 
H.J.


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread David Edelsohn
On Thu, Jul 28, 2011 at 2:49 PM, Richard Henderson  wrote:

> The whole problem is that toc pointer copy in 40(1) is only valid
> during indirect call sequences, and iff ld inserted a stub?  I.e.
> direct calls between functions that share toc pointers never save
> the copy?
>
> Would it make sense, if a function has any indirect call, to move
> the toc pointer save into the prologue?  You'd get to avoid that
> store all the time.  Of course you'd not be able to sink the load
> after the call, but it might still be a win.  And in that special
> case you can annotate the r2 save slot just once, correctly.

Michael Meissner recently did move R2 save into the prologue, under
certain circumstances.  See TARGET_SAVE_TOC_INDIRECT.  Limitations
include alloca (unless one re-copies the R2).  Mike also encountered
some problems with EH, which may be related to this discussion.

The other problem is hoisting the store into the prologue is not
always profitable for performance.  It should be better once shrink
wrapping is implemented.  Currently the PPC ABI may perform a lot of
stores in the prologue if the function *may* make a call.  R2 adds yet
another store to the common path.

- David


Re: PATCH: PR target/47766: [x32] -fstack-protector doesn't work

2011-07-28 Thread H.J. Lu
On Thu, Jul 28, 2011 at 11:55 AM, Uros Bizjak  wrote:
> On Thu, Jul 28, 2011 at 8:13 PM, H.J. Lu  wrote:
>
>> This patch adds x32 support to UNSPEC_SP_XXX patterns.  OK for trunk?
>
> http://gcc.gnu.org/contribute.html#patches
>

Sorry. I should have mentioned testcase in:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47766

Actually, they are in gcc testsuite.  I noticed them when
I run gcc testsuite on x32.

-- 
H.J.


Re: [RS6000] asynch exceptions and unwind info

2011-07-28 Thread Richard Henderson
On 07/28/2011 12:02 PM, David Edelsohn wrote:
> The other problem is hoisting the store into the prologue is not
> always profitable for performance.  It should be better once shrink
> wrapping is implemented.  Currently the PPC ABI may perform a lot of
> stores in the prologue if the function *may* make a call.  R2 adds yet
> another store to the common path.

Well, even if we're not able to hoist the R2 store, we may be able
to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns
in the stream.


r~


Re: [PATCH] [google] [annotalysis] Fix remove operation from pointer_set in case of hash collisions

2011-07-28 Thread Richard Henderson
On 07/26/2011 04:13 PM, Delesley Hutchins wrote:
> This patch fixes a bug in pointer_set.c, where removing a pointer from
> a pointer set would corrupt the hash table if the pointer was involved
> in any hash collisions.
> 
> Bootstrapped and passed gcc regression testsuite on x86_64-unknown-linux-gnu.
> 
> Okay for google/gcc-4_6?
> 
>   -DeLesley
> 
> gcc/Changelog.annotalysis:
> 2011-7-26  DeLesley Hutchins  
> 
> * gcc/pointer-set.c (pointer_set_delete)  bugfix for case of
> hash collisions

The logic of the patch looks good.  I certainly agree it's a real bug.
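
For anyone reading along without the patch at hand, here is a toy open-addressing set showing the kind of bug described and one common way to delete safely.  It is a sketch under stated assumptions, not the code from pointer-set.c: all toy_* names are invented, the table is fixed-size, NULL marks an empty slot, and the caller must always leave at least one slot empty.

```c
#include <stddef.h>
#include <stdint.h>

/* Small power-of-two table size, chosen only for the example.  */
#define N_SLOTS 8

struct toy_set
{
  /* NULL marks an empty slot.  */
  const void *slots[N_SLOTS];
};

static size_t
toy_hash (const void *p)
{
  return (size_t) ((uintptr_t) p >> 3) & (N_SLOTS - 1);
}

/* Return the slot holding P, or the empty slot where P would go.  */
static size_t
toy_lookup (const struct toy_set *set, const void *p)
{
  size_t n = toy_hash (p);
  while (set->slots[n] != NULL && set->slots[n] != p)
    n = (n + 1) & (N_SLOTS - 1);
  return n;
}

int
toy_contains (const struct toy_set *set, const void *p)
{
  return set->slots[toy_lookup (set, p)] == p;
}

void
toy_insert (struct toy_set *set, const void *p)
{
  set->slots[toy_lookup (set, p)] = p;
}

void
toy_delete (struct toy_set *set, const void *p)
{
  size_t n = toy_lookup (set, p);
  if (set->slots[n] != p)
    return;
  set->slots[n] = NULL;
  /* Simply clearing the slot would cut the probe chain.  Re-insert
     every entry that follows in the same cluster instead.  */
  for (n = (n + 1) & (N_SLOTS - 1); set->slots[n] != NULL;
       n = (n + 1) & (N_SLOTS - 1))
    {
      const void *q = set->slots[n];
      set->slots[n] = NULL;
      toy_insert (set, q);
    }
}
```

Without the re-insertion loop, an entry that had collided with the deleted pointer would sit behind an empty slot and lookups for it would stop too early.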

> +  /* find location of p */

But please fix up all the comments to be properly punctuated sentences.

> +  pset->slots[n] = 0;  /* remove ptr from set. */

And avoid end-of-line comments.  Put it on the previous line.


r~


Re: [PATCH] Improve call site argument debug info for floating point stack arguments (PR debug/49846)

2011-07-28 Thread Richard Henderson
On 07/26/2011 01:19 PM, Jakub Jelinek wrote:
>   PR debug/49846
>   * var-tracking.c (prepare_call_arguments): For non-MODE_INT stack
>   arguments also check if they aren't initialized with a MODE_INT
>   mode of the same size.

Ok.


r~


Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread Richard Henderson
On 07/26/2011 12:52 PM, DJ Delorie wrote:
> This patch tests for at least one user-caused reason for this
> assertion failing - requiring a local frame in a naked function.  For
> this case at least, it would be better to trigger an error than to
> ICE.  OK?
> 
> static int bar;
> void __attribute__((naked)) function(void) {
>int foo, result;
>result = subFunction(&foo, &bar);   // ICE here
> }
> 
>   * expr.c (expand_expr_addr_expr_1): Detect a user request for
>   a local frame in a naked function, and produce a suitable
>   error for that specific case.

Err... why would naked affect anything at all this early in the
compilation path?

Without an answer to that, this seems like an odd place to patch.



r~


Re: [PATCH, PR 49886] Prevent fnsplit from changing signature when there are type attributes

2011-07-28 Thread Richard Guenther
On Thu, Jul 28, 2011 at 6:52 PM, Martin Jambor  wrote:
> Hi,
>
> pass_split_functions is happy to split functions which have type
> attributes but cannot update them if the new clone has in any way
> different parameters than the original.  This can lead to
> miscompilations in cases like the testcase.
>
> This patch solves it by 1) making the inliner set the
> can_change_signature flag to false for them because their signature
> cannot be changed (this step is also necessary to make IPA-CP operate
> on them and handle them correctly), and 2) make the splitting pass
> keep all parameters if the flag is set.  The second step might involve
> inventing some default definitions if the parameters did not really
> have any.
>
> I spoke about this with Honza and he claimed that the new function is
> really an entirely different thing and that the parameters may
> correspond only very loosely and thus the type attributes should be
> cleared.  I'm not sure I agree, but in any case I needed this to work

For sure some attributes may be worth preserving/changing for optimization
purposes.  Now, if a function is clonable then clearing all attributes from
the clone should be ok - there may be attributes that prevent cloning though
(or rather, need to be preserved).
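
To illustrate the kind of attribute at stake (a sketch assuming GCC's nonnull attribute syntax; tagged_length is a made-up function): the attribute below names its parameter by position, so a clone that dropped the first parameter without updating the attribute would end up describing the wrong argument.

```c
#include <stddef.h>
#include <string.h>

/* The attribute refers to parameter 2 by index.  If a clone removed
   TAG, index 2 would no longer mean TEXT.  */
__attribute__ ((nonnull (2)))
size_t
tagged_length (int tag, const char *text)
{
  return (size_t) tag + strlen (text);
}
```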

Richard.

> to allow me continue with promised IPA-CP polishing and so I decided
> to do this because it was easier.  (My own opinion is that the current
> representation of parameter-describing function type attributes is
> evil and will cause harm no matter what we do.)
>
> A very similar patch has passed bootstrap and testsuite on
> x86_64-linux, the current one is undergoing both right now.  OK for
> trunk if it passes?
>
> Thanks,
>
> Martin
>
>
>
> 2011-07-28  Martin Jambor  
>
>        PR middle-end/49886
>        * ipa-inline-analysis.c (compute_inline_parameters): Set
>        can_change_signature of nodes with type attributes.
>        * ipa-split.c (split_function): Do not skip any arguments if
>        can_change_signature is set.
>
>        * testsuite/gcc.c-torture/execute/pr49886.c: New testcase.
>
> Index: src/gcc/ipa-inline-analysis.c
> ===
> --- src.orig/gcc/ipa-inline-analysis.c
> +++ src/gcc/ipa-inline-analysis.c
> @@ -1658,18 +1658,24 @@ compute_inline_parameters (struct cgraph
>   /* Can this function be inlined at all?  */
>   info->inlinable = tree_inlinable_function_p (node->decl);
>
> -  /* Inlinable functions always can change signature.  */
> -  if (info->inlinable)
> -    node->local.can_change_signature = true;
> +  /* Type attributes can use parameter indices to describe them.  */
> +  if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
> +    node->local.can_change_signature = false;
>   else
>     {
> -      /* Functions calling builtin_apply can not change signature.  */
> -      for (e = node->callees; e; e = e->next_callee)
> -       if (DECL_BUILT_IN (e->callee->decl)
> -           && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
> -           && DECL_FUNCTION_CODE (e->callee->decl) == BUILT_IN_APPLY_ARGS)
> -         break;
> -      node->local.can_change_signature = !e;
> +      /* Otherwise, inlinable functions always can change signature.  */
> +      if (info->inlinable)
> +       node->local.can_change_signature = true;
> +      else
> +       {
> +         /* Functions calling builtin_apply can not change signature.  */
> +         for (e = node->callees; e; e = e->next_callee)
> +           if (DECL_BUILT_IN (e->callee->decl)
> +               && DECL_BUILT_IN_CLASS (e->callee->decl) == BUILT_IN_NORMAL
> +               && DECL_FUNCTION_CODE (e->callee->decl) == 
> BUILT_IN_APPLY_ARGS)
> +             break;
> +         node->local.can_change_signature = !e;
> +       }
>     }
>   estimate_function_body_sizes (node, early);
>
> Index: src/gcc/ipa-split.c
> ===
> --- src.orig/gcc/ipa-split.c
> +++ src/gcc/ipa-split.c
> @@ -945,10 +945,10 @@ static void
>  split_function (struct split_point *split_point)
>  {
>   VEC (tree, heap) *args_to_pass = NULL;
> -  bitmap args_to_skip = BITMAP_ALLOC (NULL);
> +  bitmap args_to_skip;
>   tree parm;
>   int num = 0;
> -  struct cgraph_node *node;
> +  struct cgraph_node *node, *cur_node = cgraph_get_node 
> (current_function_decl);
>   basic_block return_bb = find_return_bb ();
>   basic_block call_bb;
>   gimple_stmt_iterator gsi;
> @@ -968,17 +968,30 @@ split_function (struct split_point *spli
>       dump_split_point (dump_file, split_point);
>     }
>
> +  if (cur_node->local.can_change_signature)
> +    args_to_skip = BITMAP_ALLOC (NULL);
> +  else
> +    args_to_skip = NULL;
> +
>   /* Collect the parameters of new function and args_to_skip bitmap.  */
>   for (parm = DECL_ARGUMENTS (current_function_decl);
>        parm; parm = DECL_CHAIN (parm), num++)
> -    if (!is_gimple_reg (parm)
> -       || !gimple_default_def (cfun, parm)
> -     

Re: [PATCH PR43513, 1/3] Replace vla with array - Implementation.

2011-07-28 Thread Richard Guenther
On Thu, Jul 28, 2011 at 7:20 PM, Tom de Vries  wrote:
> On 07/28/2011 06:25 PM, Richard Guenther wrote:
>> On Thu, 28 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/28/2011 12:22 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 05:27 PM, Richard Guenther wrote:
>> On Wed, 27 Jul 2011, Tom de Vries wrote:
>>
>>> On 07/27/2011 02:12 PM, Richard Guenther wrote:
 On Wed, 27 Jul 2011, Tom de Vries wrote:

> On 07/27/2011 01:50 PM, Tom de Vries wrote:
>> Hi Richard,
>>
>> I have a patch set for bug 43513 - The stack pointer is adjusted 
>> twice.
>>
>> 01_pr43513.3.patch
>> 02_pr43513.3.test.patch
>> 03_pr43513.3.mudflap.patch
>>
>> The patch set has been bootstrapped and reg-tested on x86_64.
>>
>> I will send out the patches individually.
>>
>
> The patch replaces a vla __builtin_alloca that has a constant 
> argument with an
> array declaration.
>
> OK for trunk?

 I don't think it is safe to try to get at the VLA type the way you do.
>>>
>>> I don't understand in what way it's not safe. Do you mean I don't 
>>> manage to find
>>> the type always, or that I find the wrong type, or something else?
>>
>> I think you might get the wrong type,
>
> Ok, I'll review that code one more time.
>
>> you also do not transform code
>> like
>>
>>   int *p = alloca(4);
>>   *p = 3;
>>
>> as there is no array type involved here.
>>
>
> I was trying to stay away from non-vla allocas.  A source declared alloca has
> function lifetime, so we could have a single alloca in a loop, called 10 times,
> with all 10 instances live at the same time. This patch does not detect such
> cases, and thus stays away from non-vla allocas. A vla decl does not have such
> problems, the lifetime ends when it goes out of scope.

 Yes indeed - that probably would require more detailed analysis.
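
A small C sketch of the lifetime difference being discussed, assuming a platform that provides alloca via <alloca.h> (function names are invented for the example).  In the first function all ten alloca blocks are live at once; in the second the array goes away at the end of each iteration.

```c
#include <alloca.h>

/* Ten separate alloca blocks, all still live after the first loop.  */
int
sum_alloca_blocks (void)
{
  int *block[10];
  int sum = 0;

  for (int i = 0; i < 10; i++)
    {
      block[i] = alloca (sizeof (int));
      *block[i] = i;
    }
  for (int i = 0; i < 10; i++)
    sum += *block[i];
  return sum;
}

/* A VLA declared in the loop body.  Its lifetime ends with each
   iteration, so it must not be referenced afterwards.  N must be
   greater than zero.  */
int
sum_vla_blocks (int n)
{
  int sum = 0;

  for (int i = 0; i < 10; i++)
    {
      int vla[n];
      vla[0] = i;
      sum += vla[0];
    }
  return sum;
}
```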

 In fact I would simply do sth like

   elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
   n_elem = size * 8 / BITS_PER_UNIT;
   array_type = build_array_type_nelts (elem_type, n_elem);
   var = create_tmp_var (array_type, NULL);
   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));

>>>
>>> I tried this code on the example, and it works, but the newly declared type has
>>> an 8-bit alignment, while the vla base type has a 32 bit alignment.  This makes
>>> the memory access in the example potentially unaligned, which prohibits an
>>> ivopts optimization, so the resulting text size is 68 instead of the 64 achieved
>>> with my current patch.
>>
>> Ok, so then set DECL_ALIGN of the variable to something reasonable
>> like MIN (size * 8, GET_MODE_PRECISION (word_mode)).  Basically the
>> alignment that the targets alloca function would guarantee.
>>
>
> I tried that, but that doesn't help. It's the alignment of the type that
> matters, not of the decl.

 It shouldn't.  All accesses are performed with the original types and
 alignment comes from that (plus the underlying decl).

>>>
>>> I managed to get it all working by using build_aligned_type rather than
>>> DECL_ALIGN.
>>
>> That's really odd, DECL_ALIGN should just work - nothing refers to the
>> type of the decl in the IL.  Can you try also setting DECL_USER_ALIGN to
>> 1 maybe?
>>
>
> This doesn't work either.
>
>  /* Declare array.  */
>  elem_type = build_nonstandard_integer_type (BITS_PER_UNIT, 1);
>  n_elem = size * 8 / BITS_PER_UNIT;
>  align = MIN (size * 8, GET_MODE_PRECISION (word_mode));
>  array_type = build_array_type_nelts (elem_type, n_elem);
>  var = create_tmp_var (array_type, NULL);
>  DECL_ALIGN (var) = align;
>  DECL_USER_ALIGN (var) = 1;
>
> Maybe this clarifies it:
>
> Breakpoint 1, may_be_unaligned_p (ref=0xf7d9d410, step=0xf7d3d578) at
> /home/vries/local/google/src/gcc-mainline/gcc/tree-ssa-loop-ivopts.c:1621
> (gdb) call debug_generic_expr (ref)
> MEM[(int[0:D.2579] *)&D.2595][0]
> (gdb) call debug_generic_expr (step)
> 4
>
> 1627      base = get_inner_reference (ref, &bitsize, &bitpos, &toffset, &mode,
> (gdb) call debug_generic_expr (base)
> D.2595
>
> 1629      base_type = TREE_TYPE (base);
> (gdb) call debug_generic_expr (base_type)
> [40]
>
> 1630      base_align = TYPE_ALIGN (base_type);
> (gdb) p base_align
> $1 = 8
>
> So the align is 8-bits, and we return true here:

Ah, but this code should use get_object_alignment, not solely look
at the type.
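
The distinction can be seen at the C level too.  This is only an analogy using C11 _Alignof and _Alignas, not the GCC internals involved; the function names are made up.  The type of a byte array promises 1-byte alignment, while a particular object of that type may be declared with more.

```c
#include <stdint.h>

/* Alignment implied by the type alone: a 40-element array of bytes.  */
unsigned
byte_array_type_align (void)
{
  return (unsigned) _Alignof (unsigned char[40]);
}

/* Whether one particular object of that type, declared with stricter
   alignment, really starts on a 4-byte boundary.  */
int
byte_array_object_is_4_aligned (void)
{
  static _Alignas (4) unsigned char buf[40];
  return ((uintptr_t) buf % 4) == 0;
}
```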

Richard.

> (gdb) n
> 1632      if (mode != BLKmode)
> (gdb) n
> 1634          unsigned mode_align = GET_MODE_ALIGNMENT (mode);
> (gdb)
> 1636          if 

PATCH: Add x32 support to libgomp

2011-07-28 Thread H.J. Lu
Hi,

This patch fixes 2 issues in libgomp for x32:

1. x32 should use the same futex functions as x86-64.
2. IA32 tests should check ia32 instead of ilp32.
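
For illustration, a small C sketch of the macro distinction behind both points.  It assumes a compiler that defines __ILP32__ for x32, as current GCC does; the function names are invented.  __x86_64__ is defined for both x86-64 and x32, while __LP64__ is only defined when long and pointers are 64bit.

```c
#include <stddef.h>

/* Classify the ABI from predefined macros.  */
const char *
x86_abi_name (void)
{
#if defined (__x86_64__) && defined (__ILP32__)
  return "x32";
#elif defined (__x86_64__)
  return "x86-64";
#elif defined (__i386__)
  return "ia32";
#else
  return "other";
#endif
}

/* The test the patch switches to: true on x86-64 and on x32.  */
int
uses_x86_64_futex_path (void)
{
#ifdef __x86_64__
  return 1;
#else
  return 0;
#endif
}

/* The test the patch replaces: false on x32.  */
int
has_lp64 (void)
{
#ifdef __LP64__
  return 1;
#else
  return 0;
#endif
}
```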

OK for trunk?

Thanks.


H.J.
---
2011-07-28  H.J. Lu  

* config/linux/x86/futex.h: Check __x86_64__ instead of
__LP64__.

* testsuite/lib/libgomp.exp (libgomp_init): Add -march=i486
for ia32 instead of ilp32.

* testsuite/libgomp.c/atomic-1.c: Require ia32 instead of ilp32.
* testsuite/libgomp.c/atomic-6.c: Likewise.

diff --git a/libgomp/config/linux/x86/futex.h b/libgomp/config/linux/x86/futex.h
index cb7461d..419f4d9 100644
--- a/libgomp/config/linux/x86/futex.h
+++ b/libgomp/config/linux/x86/futex.h
@@ -24,7 +24,7 @@
 
 /* Provide target-specific access to the futex system call.  */
 
-#ifdef __LP64__
+#ifdef __x86_64__
 # ifndef SYS_futex
 #  define SYS_futex 202
 # endif
@@ -138,7 +138,7 @@ futex_wake (int *addr, int count)
 }
 }
 
-#endif /* __LP64__ */
+#endif /* __x86_64__ */
 
 static inline void
 cpu_relax (void)
diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 976543d..a75e22f 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -140,7 +140,7 @@ proc libgomp_init { args } {
 
 # We use atomic operations in the testcases to validate results.
 if { ([istarget i?86-*-*] || [istarget x86_64-*-*])
-&& [check_effective_target_ilp32] } {
+&& [check_effective_target_ia32] } {
lappend ALWAYS_CFLAGS "additional_flags=-march=i486"
 }
 
diff --git a/libgomp/testsuite/libgomp.c/atomic-1.c b/libgomp/testsuite/libgomp.c/atomic-1.c
index b2be8f0..4725b7d 100644
--- a/libgomp/testsuite/libgomp.c/atomic-1.c
+++ b/libgomp/testsuite/libgomp.c/atomic-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -march=pentium" { target { { i?86-*-* x86_64-*-* } && ilp32 } } } */
+/* { dg-options "-O2 -march=pentium" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
 
 #ifdef __i386__
 #include "cpuid.h"
diff --git a/libgomp/testsuite/libgomp.c/atomic-6.c b/libgomp/testsuite/libgomp.c/atomic-6.c
index 59baf7d..f8ab75e 100644
--- a/libgomp/testsuite/libgomp.c/atomic-6.c
+++ b/libgomp/testsuite/libgomp.c/atomic-6.c
@@ -1,7 +1,7 @@
 /* PR middle-end/36106 */
 /* { dg-options "-O2" } */
 /* { dg-options "-O2 -mieee" { target alpha*-*-* } } */
-/* { dg-options "-O2 -march=i586" { target { { i?86-*-* x86_64-*-* } && ilp32 } } } */
+/* { dg-options "-O2 -march=i586" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
 
 #ifdef __i386__
 # include "cpuid.h"


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Aldy Hernandez



   if (TREE_CODE (to) == COMPONENT_REF
       && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
     get_bit_range (&bitregion_start, &bitregion_end,
                    to, tem, bitpos, bitsize);

and shouldn't this test DECL_BIT_FIELD instead of DECL_BIT_FIELD_TYPE?


As I mentioned here:

http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01416.html

I am using DECL_BIT_FIELD_TYPE instead of DECL_BIT_FIELD to determine if 
a DECL is a bit field because DECL_BIT_FIELD is not set for bit fields 
with mode sized number of bits (32-bits, 16-bits, etc).
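
As a source-level illustration of the kind of declaration in question (a sketch; struct flags and store_and_sum are made up, and unsigned int is assumed to be 32 bits): the member full below is declared as a bit field but is exactly as wide as its type, while low, mid and high are narrower bit fields packed next to each other.

```c
/* LOW, MID and HIGH are ordinary narrow bit fields.  FULL is declared
   as a bit field too, but is exactly as wide as unsigned int.  */
struct flags
{
  unsigned int low : 8;
  unsigned int mid : 8;
  unsigned int high : 16;
  unsigned int full : 32;
};

/* Store to every field, overwrite one of them, and check that the
   neighbours kept their values.  */
unsigned int
store_and_sum (void)
{
  struct flags f = { 0, 0, 0, 0 };

  f.low = 0xAB;
  f.mid = 0xCD;
  f.high = 0x1234;
  f.full = 0xDEADBEEFu;
  f.mid = 0x01;
  return f.low + f.mid + f.high + (f.full == 0xDEADBEEFu);
}
```

The code only shows the declarations and that stores to one field leave the others intact in a single thread; which tree flags GCC sets for full is as described above, not something the example demonstrates.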


Re: [trans-mem] Beginning of refactoring

2011-07-28 Thread Torvald Riegel
On Thu, 2011-07-28 at 09:02 -0700, Richard Henderson wrote:
> > Add information to dispatch about closed nesting and uninstrumented 
> > code.
> > 
> > * dispatch.h (GTM::abi_dispatch): Add 
> > can_run_uninstrumented_code and
> > closed_nesting flags, as well as a closed nesting alternative.
> > * method-serial.cc: Same.
> 
> Nearly...
> 
> > +  virtual abi_dispatch* closed_nesting_alternative()
> > +  {
> > +// For nested transactions with an instrumented code path, we can do
> > +// undo logging.
> > +return GTM::dispatch_serial();
> 
> Surely you really mean dispatch_serial_ul here?
> Otherwise ok.

No, this is correct because it calls the factory function in libitm_i.h.
However, the classes in method-serial.cc were named differently than
those factory functions, so I renamed them like this:

-class serial_dispatch : public abi_dispatch
+class serialirr_dispatch : public abi_dispatch

-class serial_dispatch_ul : public abi_dispatch
+class serial_dispatch : public abi_dispatch

This should avoid confusion in the future.


> > Add closed nesting as restart reason.
> > 
> > * libitm_i.h: Add closed nesting as restart reason.
> > * retry.cc (GTM::gtm_transaction::decide_retry_strategy): Same.
> 
> Ok, except
> 
> > +  if (r == RESTART_CLOSED_NESTING) retry_serial = true;
> 
> Coding style.  THEN statement on the next line, even for small THEN.

Will try to keep that in mind ...

> > Make flat nesting the default, use closed nesting on demand.
> > 
> > * local.cc (gtm_transaction::rollback_local): Support closed 
> > nesting.
> > * eh_cpp.cc (GTM::gtm_transaction::revert_cpp_exceptions): Same.
> > * dispatch.h: Same.
> > * method-serial.cc: Same.
> > * beginend.cc (GTM::gtm_transaction::begin_transaction): Change 
> > to
> > flat nesting as default, and closed nesting on demand.
> > (GTM::gtm_transaction::rollback): Same.
> > (_ITM_abortTransaction): Same.
> > (GTM::gtm_transaction::restart): Same.
> > (GTM::gtm_transaction::trycommit): Same.
> > (GTM::gtm_transaction::trycommit_and_finalize): Removed.
> > (choose_code_path): New.
> > (GTM::gtm_transaction_cp::save): New.
> > (GTM::gtm_transaction_cp::commit): New.
> > * query.cc (_ITM_inTransaction): Support flat nesting.
> > * libitm_i.h (GTM::gtm_transaction_cp): New helper struct for 
> > nesting.
> > (GTM::gtm_transaction): Support flat and closed nesting.
> > * alloc.cc (commit_allocations_2): New.
> > (commit_cb_data): New helper struct.
> > (GTM::gtm_transaction::commit_allocations): Handle nested
> > commits/rollbacks.
> > * libitm.texi: Update user action section, add description of 
> > nesting.
> 
> Nearly...
> 
> > +  abi_dispatch *cn_disp = disp->closed_nesting_alternative();
> > +  if (cn_disp)
> > +{
> > +  disp = cn_disp;
> > +  set_abi_disp(disp);
> > +}
> 
> Don't we need to fini the old disp?  Seems there's a leak here, though
> not visible until we re-instate the non-serial methods.

Yes, probably. However, one of the next steps on my refactoring list is
to document and change the TM method lifecycle callbacks. This will
include grouping several compatible methods (ie, those that can run
together) into method sets (e.g., global lock, multiple locks).
Switching a method within the current method set would then require no
fini(), whereas switching the method set would require a more
heavy-weight callback.

I have put the case you raised on my to-do list, and will revisit it
when working on these lifecycle management changes.

> 
> > +  if (!(tx->state & STATE_IRREVOCABLE)) ret |= a_saveLiveVariables;

Fixed.

Committing to branch, together with the two more recent changes.


Torvald



Re: Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread Dodji Seketeli
Hello Gabriel,

gch...@google.com (Gabriel Charette) wrote:

> 2011-07-28  Gabriel Charette  
>
>   * libcpp/include/line-map.h (struct line_maps):
>   Remove unused field last_listed.
>

I cannot approve or reject this patch, but FWIW, it looks good (and
obvious) to me.  I am CC-ing Tom.

Thanks.

> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index 3234423..f1d5bee 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -76,11 +76,6 @@ struct GTY(()) line_maps {
>  
>    unsigned int cache;
>  
> -  /* The most recently listed include stack, if any, starts with
> -     LAST_LISTED as the topmost including file.  -1 indicates nothing
> -     has been listed yet.  */
> -  int last_listed;
> -
>    /* Depth of the include stack, including the current file.  */
>    unsigned int depth;
>  
> diff --git a/libcpp/line-map.c b/libcpp/line-map.c
> index 86e2484..dd3f11c 100644
> --- a/libcpp/line-map.c
> +++ b/libcpp/line-map.c
> @@ -34,7 +34,6 @@ linemap_init (struct line_maps *set)
>    set->maps = NULL;
>    set->allocated = 0;
>    set->used = 0;
> -  set->last_listed = -1;
>    set->trace_includes = false;
>    set->depth = 0;
>    set->cache = 0;
>
> --
> This patch is available for review at http://codereview.appspot.com/4810058

-- 
Dodji


Re: [patch] arm,rx: don't ICE on naked functions with local vars

2011-07-28 Thread DJ Delorie

Naked is related to TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS - ARM and RX
set the latter based on the former, and nobody else uses that target
hook.  So, naked functions don't have stack slots for args.  Without
stack slots, args can't be assigned to memory locations - even if
they're TREE_ADDRESSABLE.


Re: Remove unused line_maps field last_listed (issue4810058)

2011-07-28 Thread Tom Tromey
> "Gabriel" == Gabriel Charette  writes:

Gabriel> 2011-07-28  Gabriel Charette  
Gabriel>* libcpp/include/line-map.h (struct line_maps):
Gabriel>   Remove unused field last_listed.

Ok.

Tom


Re: [C++0x] contiguous bitfields race implementation

2011-07-28 Thread Richard Guenther
On Thu, Jul 28, 2011 at 9:12 PM, Aldy Hernandez  wrote:
>
>> Yes.  Together with the above it looks then optimal.
>
> Attached patch tested on x86-64 Linux.
>
> OK for mainline?

Ok with the || moved to the next line as per coding-standards.

Thanks,
Richard.
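
The formatting request above refers to the GNU coding standards, which ask
that a long expression be split before a binary operator rather than after
it. A minimal C sketch of that layout follows; the function
starts_new_region and its parameters are hypothetical and are not taken
from the patch under review.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate; the names are not taken from the patch under
   review.  */
static bool
starts_new_region (unsigned int bitpos, unsigned int region_end,
                   bool is_volatile)
{
  /* The operator leads the continuation line instead of trailing the
     previous one.  */
  return (bitpos >= region_end
          || is_volatile);
}
```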


PATCH: Fix config/i386/morestack.S for x32

2011-07-28 Thread H.J. Lu
Hi Ian,

x32 is similar to x86-64 with 32bit pointer size.  This patch adds x32
support to config/i386/morestack.S.  Tested on x32. OK for trunk?

Thanks.


H.J.
---
2011-07-28  H.J. Lu  

* config/i386/morestack.S: Properly save the x32 new stack
boundary.  Properly check __x86_64__ and __LP64__.

diff --git a/libgcc/config/i386/morestack.S b/libgcc/config/i386/morestack.S
index 16279c7..a745230 100644
--- a/libgcc/config/i386/morestack.S
+++ b/libgcc/config/i386/morestack.S
@@ -353,7 +353,11 @@ __morestack:
        # FIXME: The offset must match
        # TARGET_THREAD_SPLIT_STACK_OFFSET in
        # gcc/config/i386/linux64.h.
+#ifdef __LP64__
        movq    %rax,%fs:0x70           # Save the new stack boundary.
+#else
+       movl    %eax,%fs:0x40           # Save the new stack boundary.
+#endif
 
        call    __morestack_unblock_signals
 
@@ -391,7 +395,11 @@ __morestack:
        subq    0(%rsp),%rax            # Subtract available space.
        addq    $BACKOFF,%rax           # Back off 1024 bytes.
 .LEHE0:
+#ifdef __LP64__
        movq    %rax,%fs:0x70           # Save the new stack boundary.
+#else
+       movl    %eax,%fs:0x40           # Save the new stack boundary.
+#endif
 
        addq    $16,%rsp                # Remove values from stack.
 
@@ -433,7 +441,11 @@ __morestack:
        movq    %rbp,%rcx               # Get the stack pointer.
        subq    %rax,%rcx               # Subtract available space.
        addq    $BACKOFF,%rcx           # Back off 1024 bytes.
+#ifdef __LP64__
        movq    %rcx,%fs:0x70           # Save new stack boundary.
+#else
+       movl    %ecx,%fs:0x40           # Save new stack boundary.
+#endif
        movq    (%rsp),%rdi             # Restore exception data for call.
 #ifdef __PIC__
        call    _Unwind_Resume@PLT      # Resume unwinding.
@@ -493,7 +505,7 @@ __x86.get_pc_thunk.bx:
        .section        .data.DW.ref.__gcc_personality_v0,"awG",@progbits,DW.ref.__gcc_personality_v0,comdat
        .type   DW.ref.__gcc_personality_v0, @object
 DW.ref.__gcc_personality_v0:
-#ifndef __x86_64
+#ifndef __LP64__
        .align 4
        .size   DW.ref.__gcc_personality_v0, 4
        .long   __gcc_personality_v0
@@ -504,7 +516,7 @@ DW.ref.__gcc_personality_v0:
 #endif
 #endif
 
-#ifdef __x86_64__
+#if defined __x86_64__ && defined __LP64__
 
 # This entry point is used for the large model.  With this entry point
 # the upper 32 bits of %r10 hold the argument size and the lower 32
@@ -537,7 +549,7 @@ __morestack_large_model:
        .size   __morestack_large_model, . - __morestack_large_model
 #endif
 
-#endif /* __x86_64__ */
+#endif /* __x86_64__ && __LP64__ */
 
 # Initialize the stack test value when the program starts or when a
 # new thread starts.  We don't know how large the main stack is, so we
@@ -570,7 +582,11 @@ __stack_split_initialize:
 #else /* defined(__x86_64__) */
 
        leaq    -16000(%rsp),%rax       # We should have at least 16K.
+#ifdef __LP64__
        movq    %rax,%fs:0x70
+#else
+       movl    %eax,%fs:0x40
+#endif
        movq    %rsp,%rdi
        movq    $16000,%rsi
 #ifdef __PIC__
@@ -592,7 +608,7 @@ __stack_split_initialize:
 
        .section        .ctors.65535,"aw",@progbits
 
-#ifndef __x86_64__
+#ifndef __LP64__
        .align  4
        .long   __stack_split_initialize
        .long   __morestack_load_mmap
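
The preprocessor checks used in the patch above can be mirrored in a small
C sketch. It assumes a GCC-compatible compiler that predefines __x86_64__
and __LP64__; the enum x86_abi and the functions detect_abi and
split_stack_offset are hypothetical names introduced only for this
illustration. The two offsets are the ones that appear in the patch
(%fs:0x70 for LP64, %fs:0x40 for x32).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification, for illustration only.  */
enum x86_abi { ABI_OTHER, ABI_X32, ABI_X86_64_LP64 };

/* Classify the target with the same macros the patch tests.  Win64 is
   excluded by hand because it is also x86-64 without __LP64__.  */
static enum x86_abi
detect_abi (void)
{
#if defined __x86_64__ && defined __LP64__
  return ABI_X86_64_LP64;
#elif defined __x86_64__ && !defined _WIN64
  return ABI_X32;
#else
  return ABI_OTHER;
#endif
}

/* Offset from %fs at which the patch stores the stack boundary.
   Returns 0 for targets the patch does not cover.  */
static unsigned int
split_stack_offset (enum x86_abi abi)
{
  switch (abi)
    {
    case ABI_X86_64_LP64:
      return 0x70;   /* written with movq, 8 bytes */
    case ABI_X32:
      return 0x40;   /* written with movl, 4 bytes */
    default:
      return 0;
    }
}
```

The store width follows the pointer size: an 8-byte movq on LP64 and a
4-byte movl on x32, which is why the patch keys on __LP64__ rather than
on __x86_64__ alone.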


PATCH: Use long long for 64bit int in config/i386/64/sfp-machine.h

2011-07-28 Thread H.J. Lu
Hi Ian,

For 64bit x86 targets, long is 32bit for x32 and win64.  But long long
is always 64bit.  This patch removes _WIN64 check.  OK for trunk?

Thanks.


H.J.
---
2011-07-28  H.J. Lu  

* config/i386/64/sfp-machine.h (_FP_W_TYPE): Always use _WIN64
version.
(_FP_WS_TYPE): Likewise.
(_FP_I_TYPE): Likewise.

diff --git a/libgcc/config/i386/64/sfp-machine.h b/libgcc/config/i386/64/sfp-machine.h
index 5adf6db..5debf5a 100644
--- a/libgcc/config/i386/64/sfp-machine.h
+++ b/libgcc/config/i386/64/sfp-machine.h
@@ -1,14 +1,8 @@
 #define _FP_W_TYPE_SIZE        64
 
-#ifdef _WIN64
- #define _FP_W_TYPE            unsigned long long
- #define _FP_WS_TYPE           signed long long
- #define _FP_I_TYPE            long long
-#else
- #define _FP_W_TYPE            unsigned long
- #define _FP_WS_TYPE           signed long
- #define _FP_I_TYPE            long
-#endif
+#define _FP_W_TYPE     unsigned long long
+#define _FP_WS_TYPE    signed long long
+#define _FP_I_TYPE     long long
 
 typedef int TItype __attribute__ ((mode (TI)));
 typedef unsigned int UTItype __attribute__ ((mode (TI)));
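
The reasoning behind the patch above can be checked with a short C sketch.
It assumes a hosted C compiler; the typedef names fp_word, fp_sword and
fp_int are hypothetical stand-ins for _FP_W_TYPE, _FP_WS_TYPE and
_FP_I_TYPE, and the helper functions exist only for this illustration.
The width of long depends on the ABI (32 bits on x32 and win64, 64 bits on
LP64), while long long is 64 bits on all of these targets.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for _FP_W_TYPE, _FP_WS_TYPE and _FP_I_TYPE.  */
typedef unsigned long long fp_word;
typedef signed long long fp_sword;
typedef long long fp_int;

/* Compile-time check: the array size is negative, and compilation
   fails, if the word type is not exactly 64 bits wide.  */
typedef char fp_word_is_64_bits[(sizeof (fp_word) * CHAR_BIT == 64) ? 1 : -1];

/* Width of plain long, which varies between ABIs.  */
static unsigned int
bits_of_long (void)
{
  return (unsigned int) (sizeof (long) * CHAR_BIT);
}

/* Width of the word type selected by the patch.  */
static unsigned int
bits_of_fp_word (void)
{
  return (unsigned int) (sizeof (fp_word) * CHAR_BIT);
}
```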

