Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld

2013-01-28 Thread Michael Haubenwallner

On 01/27/2013 03:16 AM, David Edelsohn wrote:
> On Fri, Jan 25, 2013 at 8:55 AM, Michael Haubenwallner
>  wrote:
> 
>> Same here, building everything out-of-source. The prerequisites used are:
>> * CONFIG_SHELL=/usr/local/bin/bash 4.1.7 from bullfreeware (symlinks to 
>> /opt/freeware/bin/)
>> * /usr/bin/{gcc,g++} 4.6.1 from bullfreeware (symlinks to /opt/freeware/bin/)
>> * /usr/bin/gmake 3.82 from bullfreeware (symlinks to /opt/freeware/bin/)
>> * gmp-5.0.4: as shared library, configured with --prefix=/prereq ABI=32
>> * mpfr-3.1.1: as shared library, configured with --prefix=/prereq 
>> --with-gmp=/prereq
>> * mpfr-3.1.1: as shared library, configured with --prefix=/prereq 
>> --with-{gmp,mpfr}=/prereq
>> * gawk-3.1.7, flex-2.5.35, m4-1.4.13 from some Gentoo Prefix instance, 
>> nowhere in PATH,
>>   thus: export {AWK,FLEX}=/gentoo/prefix/usr/bin/{awk,flex} and this patch:
>>   http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00960.html
>>
>> For gcc:
>> * $CONFIG_SHELL configure --prefix=/does/not/exist/yet 
>> --with-{gmp,mpfr,mpc}=/prereq \
>> --enable--languages=c,c++ --disable-werror --disable-nls
>> * gmake bootstrap
> 
> I committed your patch.

Thank you!

But still curious if you've been able to reproduce the problem,
and why you didn't encounter this problem beforehand.

> By the way, NLS works if you build and install GNU libiconv (1.14) and
> add --with-libiconv-prefix=/prereq to force GCC bootstrap to use GNU
> libiconv instead of AIX libiconv.

Yes, but (since you asked) here is the situation in which I don't want to
configure extra deplib prefixes (remember that bullfreeware is listed as a
provider of gcc binaries):

* bullfreeware's libiconv-1.13.1 and gettext-0.17 are installed in /opt/freeware,
* /usr/lib/libintl.a is symlinked to /opt/freeware/lib (by bullfreeware's RPM),
* /usr/lib/libiconv.a is the original AIX one.

Now, /usr/lib/libintl.a needs /opt/freeware/lib/libiconv.a[libiconv.so.2], and
it does contain the correct RUNPATH. But binaries subsequently linked against
/usr/lib/libintl.a don't (necessarily) know that they need /opt/freeware/lib
in their RUNPATH, so they break because libiconv.so.2 is not found as a member
of /usr/lib/libiconv.a: unfortunately, AIX stops its shared-library search at
the first archive file name it finds.
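
To make the failure mode concrete, here is a toy model of that search rule. It is only a sketch: toy_find_member and struct toy_archive are hypothetical names, shr4.o is a placeholder member name, and the real AIX loader is certainly more involved than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MEMBERS 4

/* One archive file on disk: its directory, file name and members.  */
struct toy_archive
{
  const char *dir;
  const char *name;
  const char *members[TOY_MAX_MEMBERS];  /* NULL-terminated if shorter */
};

/* Return the directory that satisfies NAME(MEMBER), or NULL.
   The search ends at the first directory holding an archive called
   NAME, even when that archive lacks MEMBER.  */
static const char *
toy_find_member (const struct toy_archive *archives, size_t n_archives,
                 const char *const *search_path, size_t n_dirs,
                 const char *name, const char *member)
{
  assert (name != NULL && member != NULL);
  for (size_t d = 0; d < n_dirs; d++)
    for (size_t a = 0; a < n_archives; a++)
      {
        if (strcmp (archives[a].dir, search_path[d]) != 0
            || strcmp (archives[a].name, name) != 0)
          continue;
        /* First archive with this file name: the search stops here.  */
        for (size_t m = 0; m < TOY_MAX_MEMBERS && archives[a].members[m]; m++)
          if (strcmp (archives[a].members[m], member) == 0)
            return archives[a].dir;
        return NULL;
      }
  return NULL;
}
```

With a search path of /usr/lib followed by /opt/freeware/lib, a request for libiconv.a(libiconv.so.2) fails in this model because the AIX libiconv.a is found first; with /opt/freeware/lib first, it succeeds.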

This is also the main reason for my filename-based shared-library versioning
work.

The following topic is related; its reasoning is different, but the result
does work:
[1] http://www.perzl.org/aix/index.php?n=FAQs.FAQs#toolbox-compatibility-issue

/haubi/


RFA: RL78: Allow SP to be used as a base register

2013-01-28 Thread Nick Clifton
Hi DJ,

  Please may I apply the patch below.  It fixes the RL78 backend so that
  the stack register can be used as a base address register.

  Tested with no regressions on an rl78-elf toolchain.

Cheers
  Nick

PS.  I am currently investigating allowing r8-r15 to be used as base
registers.

gcc/ChangeLog
2013-01-28  Nick Clifton  

* config/rl78/rl78.c (rl78_regno_mode_code_ok_for_base_p): Allow
SP_REG. 

Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c  (revision 195461)
+++ gcc/config/rl78/rl78.c  (working copy)
@@ -769,7 +769,7 @@
addr_space_t address_space ATTRIBUTE_UNUSED,
int outer_code ATTRIBUTE_UNUSED, int index_code)
 {
-  if (regno < 24 && regno >= 16)
+  if (regno <= SP_REG && regno >= 16)
 return true;
   if (index_code == REG)
 return (regno == HL_REG);


Re: FW: [PATCH] [MIPS] microMIPS gcc support

2013-01-28 Thread Richard Sandiford
"Maciej W. Rozycki"  writes:
> On Sat, 26 Jan 2013, Richard Sandiford wrote:
>
>> >  How about instead of complicating this we simply add support for 
>> > microMIPS encoding in the PLT?  I think I should be able to squeeze out 
>> > some time next week to dust off and retest the binutils patch I've had 
>> > pending far too long now.  This way we won't have to maintain separate 
>> > cases where tail calls may or may not be made via the PLT.
>> >
>> >  Note that we need that support sooner or later anyway due to the prospect 
>> > of pure-microMIPS processors.
>> 
>> Just so I know: what does the PLT patch do for external functions
>> that are jumped to by both microMIPS and non-microMIPS code?
>
>  Two PLT entries are produced in that case.
>
>  PLT entries are created based on the relocation type referring: R_MIPS_26 
> relocations trigger a standard MIPS PLT entry, R_MICROMIPS_26_S1 
> relocations trigger a microMIPS PLT entry.  Other relocations reuse a PLT 
> entry already produced for one of the jump relocations, or if none 
> present, then they make an own PLT entry according to ELF file header 
> flags: if EF_MIPS_ARCH_ASE_MICROMIPS is set, then a microMIPS entry is 
> produced, otherwise a standard MIPS one.  Therefore depending on 
> relocations seen up to two entries can be produced, encoded differently so 
> that there is no need to switch modes with direct jumps.
>
>  If all the individual PLT entries ultimately produced are microMIPS code, 
> then the PLT header is built as microMIPS code as well, otherwise it's 
> standard MIPS code.  This guarantees no standard MIPS code is produced in 
> the PLT if there's none already in the executable (and vice versa).

Thanks, sounds good!  In that case, yeah, let's leave the TARGET_ABICALLS_PIC0
part out (but keep the rest of mips_call_may_need_jalx_p).
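
For reference, here is a rough sketch of the PLT selection rule quoted above, as I read it.  The names (plt_entries_needed, plt_header_is_micromips, the PLT_* masks) are made up for illustration and are not taken from the binutils patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PLT_MIPS      1u  /* standard MIPS PLT entry */
#define PLT_MICROMIPS 2u  /* microMIPS PLT entry */

/* Return a bitmask of the PLT entry encodings needed for one symbol.
   The jump relocations decide first; other relocations only create an
   entry when no jump relocation did, following the ELF header flag.  */
static unsigned
plt_entries_needed (bool has_r_mips_26, bool has_r_micromips_26_s1,
                    bool has_other_reloc, bool ef_micromips)
{
  unsigned mask = 0;

  if (has_r_mips_26)
    mask |= PLT_MIPS;
  if (has_r_micromips_26_s1)
    mask |= PLT_MICROMIPS;
  if (mask == 0 && has_other_reloc)
    mask = ef_micromips ? PLT_MICROMIPS : PLT_MIPS;
  return mask;
}

/* The PLT header is microMIPS only if every entry produced is.  */
static bool
plt_header_is_micromips (const unsigned *masks, size_t n)
{
  bool any_micromips = false;

  assert (masks != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (masks[i] & PLT_MIPS)
        return false;
      if (masks[i] & PLT_MICROMIPS)
        any_micromips = true;
    }
  return any_micromips;
}
```

A symbol referenced through both R_MIPS_26 and R_MICROMIPS_26_S1 gets both bits set, which corresponds to the "two PLT entries" case.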

For avoidance of doubt: I don't think there's any need to wait for the
linker patches before sending the updated GCC patch.  The GCC patch can
only go in 4.9 anyway, and the new PLT code won't be available until 2.24,
so there's plenty of time on both sides.  Testing the GCC patch against
Mentor's linker is fine with me.

Richard


Re: [Patch] Fix PR54814

2013-01-28 Thread Richard Biener
On Sun, Jan 27, 2013 at 11:26 PM, Steven Bosscher  wrote:
> On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote:
> The patch was originally worked out by Bernd Schmidt and fixed a problem
> introduced in
>
> http://gcc.gnu.org/r190252
>
> Ironically, this revision fixes a reload problem on x86/x86_64 --
> which doesn't use reload anymore now...
>
>
>> Does this mean the fix is rejected for 4.8?
>
> No, just that it probably helps to add a RM to the CC list.
>
> FWIW, it seems to me that this patch should go into 4.8, because the
> bug is probably not limited to AVR.

Indeed, the fix also looks quite obvious though I know nothing about the
code at all.

Thus, ok from a RM perspective if a reload-affine person approves it.

Thanks,
Richard.

> Ciao!
> Steven


Re: Cortex-A15 vfnma/vfnms test patch

2013-01-28 Thread Ramana Radhakrishnan



[Taking gcc-help off this thread.]

Amol,


> I have tested these instructions with GCC and these instructions are generated.
> Please review and merge this test support patch in gcc main trunk.


Thanks for this patch and sorry about the delay in getting around to this.

This is ok and I'll take this under the 10 line rule this time.

If you intend to continue to submit patches to gcc, can I ask that you 
start the process for copyright assignments or confirm that you have a 
copyright assignment on file?


http://gcc.gnu.org/contribute.html#legal

If you don't, send an email to g...@gcc.gnu.org with a request for 
copyright assignment papers and a maintainer will send you these.


http://gcc.gnu.org/contribute.html is a good summary of the process for 
contributing patches to GCC in general. Please do read it and follow up 
on g...@gcc.gnu.org if you have any more questions.


And finally, don't forget to add a ChangeLog entry to your patches, as 
documented in links from the above-mentioned page. Since this is your 
first time, I've added the following ChangeLog entry for your patch and 
applied it.


regards
Ramana



2013-01-27  Amol Pise  

* gcc.target/arm/neon-vfnms-1.c: New test.
* gcc.target/arm/neon-vfnma-1.c: New test.






Re: [avr,committed] Fix fixed-point conversion

2013-01-28 Thread Georg-Johann Lay
Gerald Pfeifer wrote:
> On Thu, 24 Jan 2013, Georg-Johann Lay wrote:
>> Committed the following change:
>>
>> http://gcc.gnu.org/r195424
>>
>>  * config/avr/avr.c (avr_out_fract): Make register numbers that
>>  might be outside of source operand signed.
> 
> Can you still post patches to the list, and not just the reference?
> 
> Thanks,
> Gerald


Thanks for pointing this out.  I will follow the guideline in the future.

Here is the change:

Index: config/avr/avr.c
===
--- config/avr/avr.c    (revision 195423)
+++ config/avr/avr.c    (revision 195424)
@@ -7114,13 +7114,13 @@ avr_out_fract (rtx insn, rtx operands[],
   unsigned d1 = d0 + step;

   // Current and next regno of source
-  unsigned s0 = d0 - offset;
-  unsigned s1 = s0 + step;
+  signed s0 = d0 - offset;
+  signed s1 = s0 + step;

   // Must current resp. next regno be CLRed?  This applies to the low
   // bytes of the destination that have no associated source bytes.
-  bool clr0 = s0 < src.regno;
-  bool clr1 = s1 < src.regno && d1 >= dest.regno;
+  bool clr0 = s0 < (signed) src.regno;
+  bool clr1 = s1 < (signed) src.regno && d1 >= dest.regno;

   // First gather what code to emit (if any) and additional step to
   // apply if a MOVW is in use.  xop[2] is destination rtx and xop[3]
@@ -7150,12 +7150,12 @@ avr_out_fract (rtx insn, rtx operands[],
 }
 }
 }
-  else if (offset && s0 <= src.regno_msb)
+  else if (offset && s0 <= (signed) src.regno_msb)
 {
   int movw = AVR_HAVE_MOVW && offset % 2 == 0
 && d0 % 2 == (offset > 0)
 && d1 <= dest.regno_msb && d1 >= dest.regno
-&& s1 <= src.regno_msb  && s1 >= src.regno;
+&& s1 <= (signed) src.regno_msb  && s1 >= (signed) src.regno;

   xop[2] = all_regs_rtx[d0 & ~movw];
   xop[3] = all_regs_rtx[s0 & ~movw];
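
To illustrate why the signedness matters, here is a small standalone sketch of the clr0 test before and after the change.  The function names and the sample register numbers are hypothetical; only the arithmetic mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Old form: the source regno is computed in unsigned arithmetic, so a
   value below zero wraps to a huge number and the "<" test fails.  */
static bool
clr0_unsigned (unsigned d0, int offset, unsigned src_regno)
{
  assert (d0 < 32 && src_regno < 32);   /* AVR has 32 general registers */
  unsigned s0 = d0 - offset;
  return s0 < src_regno;
}

/* New form: the source regno may legitimately be negative, meaning
   "outside of the source operand", and compares as such.  */
static bool
clr0_signed (unsigned d0, int offset, unsigned src_regno)
{
  assert (d0 < 32 && src_regno < 32);
  int s0 = (int) d0 - offset;
  return s0 < (int) src_regno;
}
```

For example, with d0 = 2, offset = 4 and a source starting at register 24, the unsigned variant reports that the byte has an associated source byte, while the signed variant correctly asks for it to be cleared.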



Re: [Patch] Fix PR54814

2013-01-28 Thread Ulrich Weigand
Richard Biener wrote:
> On Sun, Jan 27, 2013 at 11:26 PM, Steven Bosscher  
> wrote:
> > On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote:
> > The patch was originally worked out by Bernd Schmidt and fixed a problem
> > introduced in
> >
> > http://gcc.gnu.org/r190252
> >
> > Ironically, this revision fixes a reload problem on x86/x86_64 --
> > which doesn't use reload anymore now...
> >
> >
> >> Does this mean the fix is rejected for 4.8?
> >
> > No, just that it probably helps to add a RM to the CC list.
> >
> > FWIW, it seems to me that this patch should go into 4.8, because the
> > bug is probably not limited to AVR.
> 
> Indeed, the fix also looks quite obvious though I know nothing about the
> code at all.
> 
> Thus, ok from a RM perspective if a reload-affine person approves it.

The patch was originally by Bernd, but FWIW it looks good to me as well.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Re: Cortex-A15 vfnma/vfnms test patch

2013-01-28 Thread amol pise
Dear Ramana,

Thank You very much for the changelog and commit of my patch in gcc.
I will follow the steps mentioned by you.

Thank You,
Amol Pise


On Mon, Jan 28, 2013 at 4:18 PM, Ramana Radhakrishnan  wrote:
>
>
> [Taking gcc-help off this thread.]
>
> Amol,
>
>
>> I have tested these instruction with GCC and these instructions are
>> generated.
>> Please review and marge this test support patch in gcc main trunk.
>
>
> Thanks for this patch and sorry about the delay in getting around to this.
>
> This is ok and I'll take this under the 10 line rule this time .
>
> If you intend to continue to submit patches to gcc can I ask that you start
> the process for copyright assignments or confirm that you have a copyright
> assignment on file ?
>
> http://gcc.gnu.org/contribute.html#legal
>
> If you don't, send an email to g...@gcc.gnu.org with a request for copyright
> assignment papers and a maintainer will send you these.
>
> http://gcc.gnu.org/contribute.html in general is a good summary of the
> process related to contributing patches to GCC in general . Please do read
> that and follow up on g...@gcc.gnu.org if you have any more questions.
>
> And finally don't forget to add a changelog to your patches as documented in
> links from the above mentioned page. Since this is your first time I've
> added the following Changelog entry for your patch and applied it.
>
> regards
> Ramana
>
>
>
> 2013-01-27  Amol Pise  
>
> * gcc.target/arm/neon-vfnms-1.c: New test.
> * gcc.target/arm/neon-vfnma-1.c: New test.
>
>
>
>


[PATCH] Fix sched ICE with prefetch (PR rtl-optimization/56117)

2013-01-28 Thread Jakub Jelinek
Hi!

We ICE on the following testcase when using cselib, because
cselib_lookup* is never called on the PREFETCH argument, and
add_insn_mem_dependence calls cselib_subst_to_values on it, which
assumes cselib_lookup* already happened on it earlier.
For MEMs sched_analyze_2 calls cselib_lookup_from_insn, but for PREFETCHes
it didn't.
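
As a toy model of the ordering problem (not cselib itself; every name below is invented for the illustration), think of a table in which an address must be registered by a lookup before it can be substituted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 8

/* Toy value table: addresses that a lookup has already registered.  */
struct toy_table
{
  const void *keys[TOY_MAX];
  int n;
};

/* Register ADDR if it is not yet known and return its value number.  */
static int
toy_lookup (struct toy_table *t, const void *addr)
{
  for (int i = 0; i < t->n; i++)
    if (t->keys[i] == addr)
      return i;
  assert (t->n < TOY_MAX);
  t->keys[t->n] = addr;
  return t->n++;
}

/* Substitute ADDR by its value number.  This assumes a lookup already
   happened; the toy returns -1 where the real code would ICE.  */
static int
toy_subst (const struct toy_table *t, const void *addr)
{
  for (int i = 0; i < t->n; i++)
    if (t->keys[i] == addr)
      return i;
  return -1;
}

/* Handle a prefetch address.  DO_LOOKUP models the fix: look the
   address up before the dependence code substitutes it.  */
static int
toy_analyze_prefetch (struct toy_table *t, const void *addr,
                      bool use_table, bool do_lookup)
{
  if (use_table && do_lookup)
    toy_lookup (t, addr);
  return use_table ? toy_subst (t, addr) : 0;
}
```

Without the lookup the substitution step has nothing to work with; with it, the address is known by the time the dependence is recorded.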

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2013-01-28  Jakub Jelinek  

PR rtl-optimization/56117
* sched-deps.c (sched_analyze_2) <case PREFETCH>: For use_cselib
call cselib_lookup_from_insn on the MEM before calling
add_insn_mem_dependence.

* gcc.dg/pr56117.c: New test.

--- gcc/sched-deps.c.jj 2013-01-16 19:58:42.0 +0100
+++ gcc/sched-deps.c    2013-01-28 09:43:33.248657691 +0100
@@ -2720,8 +2720,12 @@ sched_analyze_2 (struct deps_desc *deps,
 prefetch has only the start address but it is better to have
 something than nothing.  */
   if (!deps->readonly)
-   add_insn_mem_dependence (deps, true, insn,
-gen_rtx_MEM (Pmode, XEXP (PATTERN (insn), 0)));
+   {
+ rtx x = gen_rtx_MEM (Pmode, XEXP (PATTERN (insn), 0));
+ if (sched_deps_info->use_cselib)
+   cselib_lookup_from_insn (x, Pmode, true, VOIDmode, insn);
+ add_insn_mem_dependence (deps, true, insn, x);
+   }
   break;
 
 case UNSPEC_VOLATILE:
--- gcc/testsuite/gcc.dg/pr56117.c.jj   2013-01-28 09:47:21.244381559 +0100
+++ gcc/testsuite/gcc.dg/pr56117.c  2013-01-28 09:46:31.0 +0100
@@ -0,0 +1,9 @@
+/* PR rtl-optimization/56117 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsched2-use-superblocks" } */
+
+void
+foo (void *p)
+{
+  __builtin_prefetch (p);
+}

Jakub


[PATCH][RFC] Avoid excessive BLOCK associations for locations

2013-01-28 Thread Richard Biener

This avoids assigning BLOCKs to things that didn't have one before
(originally I observed that the code snippets below happily generate
an UNKNOWN_LOCATION, id->block association).  A patch from last year
changed expansion so that it no longer jumps back to the outermost
block when it sees a NULL LOCATION_BLOCK in the IL but, as with
UNKNOWN_LOCATION locus handling, just inherits the currently active
BLOCK.  Thus the patch below, instead of only avoiding the nonsensical
UNKNOWN_LOCATION, id->block association, goes one step further and
never puts a thing in the outermost inline BLOCK if it didn't have
a BLOCK assigned before.  This fixes the original nonsensical
association and avoids excessive BLOCK associations where they are
not of much use.
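
In toy form (hypothetical names, not the tree-inline.c code), the behavioural change for a copied statement is roughly:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a lexical BLOCK.  */
struct toy_block
{
  int id;
};

/* Before: a statement without a BLOCK lands in the BLOCK of the
   inlined call (the outermost inline BLOCK).  */
static const struct toy_block *
toy_remap_block_old (const struct toy_block *stmt_block,
                     const struct toy_block *mapped,
                     const struct toy_block *call_block)
{
  if (stmt_block == NULL)
    return call_block;
  assert (mapped != NULL);   /* a BLOCK that exists must have a mapping */
  return mapped;
}

/* After: a statement without a BLOCK stays without one and inherits
   whatever scope is active at expansion time.  */
static const struct toy_block *
toy_remap_block_new (const struct toy_block *stmt_block,
                     const struct toy_block *mapped)
{
  if (stmt_block == NULL)
    return NULL;
  assert (mapped != NULL);
  return mapped;
}
```

Statements that already had a BLOCK are remapped identically in both variants; only the NULL case differs.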

What's the point of switching to the outermost scope for unknown-BLOCK
locations?  Isn't inheriting the currently active scope much more
useful (it definitely is for UNKNOWN_LOCATIONs)?  If we have a
non-UNKNOWN_LOCATION, would a NULL BLOCK not be an error anyway?
An error we "hide" in the current scheme?

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Does this make sense?

Thanks,
Richard.

2013-01-28  Richard Biener  

* tree-inline.c (remap_gimple_stmt): Do not assign a BLOCK
to a stmt that didn't have one.
(copy_phis_for_bb): Likewise for PHI arguments.
(copy_debug_stmt): Likewise for debug stmts.

Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c   (revision 195502)
+++ gcc/tree-inline.c   (working copy)
@@ -1198,7 +1198,6 @@ remap_gimple_stmt (gimple stmt, copy_bod
 {
   gimple copy = NULL;
   struct walk_stmt_info wi;
-  tree new_block;
   bool skip_first = false;
 
   /* Begin by recognizing trees that we'll completely rewrite for the
@@ -1458,19 +1457,15 @@ remap_gimple_stmt (gimple stmt, copy_bod
 }
 
   /* If STMT has a block defined, map it to the newly constructed
- block.  When inlining we want statements without a block to
- appear in the block of the function call.  */
-  new_block = id->block;
+ block.  */
   if (gimple_block (copy))
 {
   tree *n;
   n = (tree *) pointer_map_contains (id->decl_map, gimple_block (copy));
   gcc_assert (n);
-  new_block = *n;
+  gimple_set_block (copy, *n);
 }
 
-  gimple_set_block (copy, new_block);
-
   if (gimple_debug_bind_p (copy) || gimple_debug_source_bind_p (copy))
 return copy;
 
@@ -1987,7 +1982,6 @@ copy_phis_for_bb (basic_block bb, copy_b
  edge old_edge = find_edge ((basic_block) new_edge->src->aux, bb);
  tree arg;
  tree new_arg;
- tree block = id->block;
  edge_iterator ei2;
  location_t locus;
 
@@ -2015,19 +2009,18 @@ copy_phis_for_bb (basic_block bb, copy_b
  inserted = true;
}
  locus = gimple_phi_arg_location_from_edge (phi, old_edge);
- block = id->block;
  if (LOCATION_BLOCK (locus))
{
  tree *n;
  n = (tree *) pointer_map_contains (id->decl_map,
LOCATION_BLOCK (locus));
  gcc_assert (n);
- block = *n;
+ locus = COMBINE_LOCATION_DATA (line_table, locus, *n);
}
+ else
+   locus = LOCATION_LOCUS (locus);
 
- add_phi_arg (new_phi, new_arg, new_edge, block ?
- COMBINE_LOCATION_DATA (line_table, locus, block) :
- LOCATION_LOCUS (locus));
+ add_phi_arg (new_phi, new_arg, new_edge, locus);
}
}
 }
@@ -2324,14 +2317,11 @@ copy_debug_stmt (gimple stmt, copy_body_
   tree t, *n;
   struct walk_stmt_info wi;
 
-  t = id->block;
   if (gimple_block (stmt))
 {
   n = (tree *) pointer_map_contains (id->decl_map, gimple_block (stmt));
-  if (n)
-   t = *n;
+  gimple_set_block (stmt, n ? *n : id->block);
 }
-  gimple_set_block (stmt, t);
 
   /* Remap all the operands in COPY.  */
   memset (&wi, 0, sizeof (wi));


[committed] Avoid setting gimple_location of force_gimple_operand* created stmts to DECL_SOURCE_LOCATION of current fn (PR tree-optimization/56094)

2013-01-28 Thread Jakub Jelinek
Hi!

As discussed in the PR, this is a safer variant of a fix for 4.8, where
input_location during most optimization passes is set to
DECL_SOURCE_LOCATION (current_function_decl) and various parts of the
gimplifier e.g. during force_gimple_operand* may end up setting
gimple_location to that.  For 4.9, we should revert this and set
input_location to UNKNOWN_LOCATION for the optimizers.
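
The shape of the fix is a plain save/clear/restore of a global around the gimplification.  A minimal standalone sketch, with toy names standing in for the real types and functions:

```c
#include <assert.h>

typedef unsigned toy_location;
#define TOY_UNKNOWN_LOCATION 0u

/* Stand-in for the global input_location.  */
static toy_location toy_input_location = TOY_UNKNOWN_LOCATION;

/* Stand-in for gimplification: a newly created statement picks up
   whatever the global location currently is.  */
static toy_location
toy_gimplify_stmt (void)
{
  return toy_input_location;
}

/* Without the guard the new statement inherits the function's own
   declaration location.  */
static toy_location
toy_force_operand_unguarded (void)
{
  return toy_gimplify_stmt ();
}

/* With the guard the global is cleared for the duration of the call
   and restored afterwards.  */
static toy_location
toy_force_operand_guarded (void)
{
  toy_location saved = toy_input_location;
  toy_input_location = TOY_UNKNOWN_LOCATION;
  toy_location stmt_loc = toy_gimplify_stmt ();
  toy_input_location = saved;
  assert (stmt_loc == TOY_UNKNOWN_LOCATION);
  return stmt_loc;
}
```

If the global is set to, say, line 65 before the call, the guarded variant yields a statement with no location and leaves the global at 65 afterwards.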

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2013-01-28  Jakub Jelinek  

PR tree-optimization/56094
* gimplify.c (force_gimple_operand_1): Temporarily set input_location
to UNKNOWN_LOCATION while gimplifying expr.

* gcc.dg/pr56094.c: New test.

--- gcc/gimplify.c.jj   2013-01-25 21:02:45.0 +0100
+++ gcc/gimplify.c  2013-01-28 11:34:15.671374132 +0100
@@ -8600,6 +8600,7 @@ force_gimple_operand_1 (tree expr, gimpl
 {
   enum gimplify_status ret;
   struct gimplify_ctx gctx;
+  location_t saved_location;
 
   *stmts = NULL;
 
@@ -8613,6 +8614,8 @@ force_gimple_operand_1 (tree expr, gimpl
   push_gimplify_context (&gctx);
   gimplify_ctxp->into_ssa = gimple_in_ssa_p (cfun);
   gimplify_ctxp->allow_rhs_cond_expr = true;
+  saved_location = input_location;
+  input_location = UNKNOWN_LOCATION;
 
   if (var)
 {
@@ -8634,6 +8637,7 @@ force_gimple_operand_1 (tree expr, gimpl
   gcc_assert (ret != GS_ERROR);
 }
 
+  input_location = saved_location;
   pop_gimplify_context (NULL);
 
   return expr;
--- gcc/testsuite/gcc.dg/pr56094.c.jj   2013-01-28 11:46:09.045221238 +0100
+++ gcc/testsuite/gcc.dg/pr56094.c  2013-01-28 11:47:54.052611852 +0100
@@ -0,0 +1,81 @@
+/* PR tree-optimization/56094 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g -fdump-tree-optimized-lineno" } */
+
+_Bool cond;
+
+int
+fn0 (unsigned char, unsigned long long, unsigned char,
+ unsigned char, signed short, unsigned int,
+ unsigned char *);
+
+extern void fn3 (unsigned char, unsigned char, unsigned char, unsigned char,
+unsigned char, unsigned char, unsigned char, unsigned short);
+extern void fn7 (int);
+extern void fn8 (int);
+
+static __inline__ __attribute__ ((always_inline)) void
+fn1 (unsigned char arg0, unsigned char arg1, unsigned char arg2,
+ unsigned char arg3, unsigned char arg4, unsigned char arg5,
+ unsigned short arg6)
+{
+  asm volatile ("" :: "g" ((unsigned long long) arg0), "g" (arg1),
+ "g" (arg2), "g" (arg3), "g" (arg4), "g" (arg5),
+ "g" (arg6));
+  if (cond)
+{
+  unsigned char loc0 = 0;
+  fn3 (loc0, arg0, arg1, arg2, arg3, arg4, arg5, arg6);
+}
+}
+
+static __inline__ __attribute__ ((always_inline)) void
+fn4 (unsigned int arg0, unsigned long long arg1)
+{
+  asm volatile ("" :: "g" (arg0), "g" (arg1));
+}
+
+static __inline__ __attribute__ ((always_inline)) void
+fn5 (unsigned int arg0, unsigned char arg1, unsigned int arg2,
+ unsigned char arg3)
+{
+  asm volatile ("" :: "g" (arg0), "g" (arg1),
+ "g" ((unsigned long long) arg2), "g" (arg3));
+}
+
+static __inline__ __attribute__ ((always_inline)) void
+fn6 (unsigned long long arg0, unsigned char arg1,
+ unsigned char arg2, signed short arg3,
+ unsigned int arg4, unsigned char * arg5)
+{
+  asm volatile ("" :: "g" (arg0), "g" ((unsigned long long) arg1),
+ "g" ((unsigned long long) arg2), "g" (arg3),
+ "g" (arg4), "g" (arg5));
+  if (cond)
+{
+  unsigned char loc0 = 0;
+  fn0 (loc0, arg0, arg1, arg2, arg3, arg4, arg5);
+}
+}
+
+unsigned char b[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0xa };
+unsigned int q = sizeof (b) / sizeof (b[0]);
+
+void
+foo ()
+{
+  int i;
+  for (i = 1; i <= 50; i++)
+{
+  fn6 (i + 0x1234, i + 1, i + 0xa, i + 0x1234, q, b);
+  fn5 (i + 0xabcd, i << 1, i + 0x1234, i << 2);
+  fn7 (i + 0xdead);
+  fn8 (i + 0xdead);
+  fn1 (i, i + 1, i + 2, i + 3, i + 4, i + 5, i << 10);
+  fn4 (i + 0xfeed, i);
+}
+}
+
+/* Verify no statements get the location of the foo () decl.  */
+/* { dg-final { scan-tree-dump-not " : 65:1\\\]" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Jakub


[PATCH] Fix up pow folding (PR tree-optimization/56125)

2013-01-28 Thread Jakub Jelinek
Hi!

The last two optimizations in gimple_expand_builtin_pow rely on earlier
optimizations in the same function having been performed, e.g.
folding pow (x, c) for n = 2c into sqrt(x) * powi(x, n / 2) is only
correct for c which isn't an integer (otherwise the sqrt(x) factor would
need to be skipped), but they do not actually check this.
E.g. pow (x, n) where n is an integer is optimized only if:
  && ((n >= -1 && n <= 2)
  || (flag_unsafe_math_optimizations
  && optimize_insn_for_speed_p ()
  && powi_cost (n) <= POWI_MAX_MULTS)))
and since in the testcase the function is cold, it isn't optimized and
we fall through to the above mentioned optimization, which blindly assumes
that c isn't an integer.

Fixed by both checking that c isn't an integer (and for the last
optimization also that 2c isn't an integer), and also not doing the
-> sqrt(x) * powi(x, n / 2) resp. 1.0 / (sqrt(x) * powi(x, abs(n) / 2))
optimization for -Os or cold functions; at least
__attribute__((cold)) double
foo (double x, double n)
{
  return __builtin_pow (x, -1.5);
}
is smaller when expanded as a pow call both on x86_64 and on powerpc (with
-Os -ffast-math).  Even the c*_is_int tests alone could be enough
to fix the bug, so if you want, say, to enable it for -Os even with c = 1.5,
but not for negative values, which add another operation, it can be adjusted.
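
A small numeric sketch of why the missing check matters.  The helpers below are toy stand-ins (repeated multiplication for powi, a Newton iteration for sqrt), not the GCC code, and all names are invented:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy powi: X raised to a non-negative integer power.  */
static double
toy_powi (double x, long n)
{
  assert (n >= 0);
  double r = 1.0;
  while (n-- > 0)
    r *= x;
  return r;
}

/* Toy square root by Newton iteration, for X > 0.  */
static double
toy_sqrt (double x)
{
  assert (x > 0.0);
  double g = x;
  for (int i = 0; i < 60; i++)
    g = 0.5 * (g + x / g);
  return g;
}

static bool
toy_is_integer (double c)
{
  return (double) (long) c == c;
}

/* The sqrt/powi expansion for n = 2c, applied without checking c.  */
static double
toy_expand_unchecked (double x, double c)
{
  long n = (long) (2.0 * c);
  long half = (n < 0 ? -n : n) / 2;
  double r = toy_sqrt (x) * toy_powi (x, half);
  return n < 0 ? 1.0 / r : r;
}

/* The guard: 2c must be an integer and c itself must not be.  */
static bool
toy_expansion_is_valid (double c)
{
  return toy_is_integer (2.0 * c) && !toy_is_integer (c);
}
```

For x = 4 the expansion gives 8 for c = 1.5, which is correct, but 1/32 for c = -2, where pow (4, -2) is 1/16: the stray sqrt(x) factor is exactly the error described above.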

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-01-28  Jakub Jelinek  

PR tree-optimization/56125
* tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
pow(x,c) into sqrt(x) * powi(x, n/2) or
1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
optimizing for size.
Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
integer.

* gcc.dg/pr56125.c: New test.

--- gcc/tree-ssa-math-opts.c.jj 2013-01-11 09:02:48.0 +0100
+++ gcc/tree-ssa-math-opts.c    2013-01-28 10:56:40.105950483 +0100
@@ -1110,7 +1110,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
   HOST_WIDE_INT n;
   tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, result, cbrt_x, powi_cbrt_x;
   enum machine_mode mode;
-  bool hw_sqrt_exists;
+  bool hw_sqrt_exists, c_is_int, c2_is_int;
 
   /* If the exponent isn't a constant, there's nothing of interest
  to be done.  */
@@ -1122,8 +1122,9 @@ gimple_expand_builtin_pow (gimple_stmt_i
   c = TREE_REAL_CST (arg1);
   n = real_to_integer (&c);
   real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
+  c_is_int = real_identical (&c, &cint);
 
-  if (real_identical (&c, &cint)
+  if (c_is_int
   && ((n >= -1 && n <= 2)
  || (flag_unsafe_math_optimizations
  && optimize_insn_for_speed_p ()
@@ -1221,7 +1222,8 @@ gimple_expand_builtin_pow (gimple_stmt_i
   return build_and_insert_call (gsi, loc, cbrtfn, sqrt_arg0);
 }
 
-  /* Optimize pow(x,c), where n = 2c for some nonzero integer n, into
+  /* Optimize pow(x,c), where n = 2c for some nonzero integer n
+ and c not an integer, into
 
sqrt(x) * powi(x, n/2),n > 0;
1.0 / (sqrt(x) * powi(x, abs(n/2))),   n < 0.
@@ -1230,10 +1232,13 @@ gimple_expand_builtin_pow (gimple_stmt_i
   real_arithmetic (&c2, MULT_EXPR, &c, &dconst2);
   n = real_to_integer (&c2);
   real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
+  c2_is_int = real_identical (&c2, &cint);
 
   if (flag_unsafe_math_optimizations
   && sqrtfn
-  && real_identical (&c2, &cint))
+  && c2_is_int
+  && !c_is_int
+  && optimize_function_for_speed_p (cfun))
 {
   tree powi_x_ndiv2 = NULL_TREE;
 
@@ -1286,6 +1291,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
   && cbrtfn
   && (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode))
   && real_identical (&c2, &c)
+  && !c2_is_int
   && optimize_function_for_speed_p (cfun)
   && powi_cost (n / 3) <= POWI_MAX_MULTS)
 {
--- gcc/testsuite/gcc.dg/pr56125.c.jj   2013-01-28 11:00:04.359814742 +0100
+++ gcc/testsuite/gcc.dg/pr56125.c  2013-01-28 11:00:55.048532118 +0100
@@ -0,0 +1,21 @@
+/* PR tree-optimization/56125 */
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math" } */
+
+extern void abort (void);
+extern double fabs (double);
+
+__attribute__((cold)) double
+foo (double x, double n)
+{
+  double u = x / (n * n);
+  return u;
+}
+
+int
+main ()
+{
+  if (fabs (foo (29, 2) - 7.25) > 0.001)
+abort ();
+  return 0;
+}
 
Jakub


[committed] Avoid string.h includes in -fno-builtin-memset testcases (PR testsuite/56053)

2013-01-28 Thread Jakub Jelinek
Hi!

Some targets apparently force fortification unconditionally, or at least by
default.  When string.h is then included, the memset etc. inlines might call
__builtin_memset or __builtin___memset_chk directly, and -fno-builtin* has no
effect on explicit builtin uses.  Fixed by avoiding those includes and
adding the needed prototypes by hand instead; tested on x86_64-linux, committed
as obvious to trunk.

2013-01-28  Jakub Jelinek  

PR testsuite/56053
* c-c++-common/asan/heap-overflow-1.c: Don't include stdlib.h and
string.h.  Provide memset, malloc and free prototypes, adjust line
numbers in dg-output.
* c-c++-common/asan/stack-overflow-1.c: Don't include string.h.
Provide memset prototype and adjust line numbers in dg-output.
* c-c++-common/asan/global-overflow-1.c: Likewise.

--- gcc/testsuite/c-c++-common/asan/heap-overflow-1.c.jj        2012-12-13 00:02:50.0 +0100
+++ gcc/testsuite/c-c++-common/asan/heap-overflow-1.c   2013-01-28 13:47:58.682416114 +0100
@@ -2,8 +2,18 @@
 /* { dg-options "-fno-builtin-malloc -fno-builtin-free -fno-builtin-memset" } */
 /* { dg-shouldfail "asan" } */
 
-#include <stdlib.h>
-#include <string.h>
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void *memset (void *, int, __SIZE_TYPE__);
+void *malloc (__SIZE_TYPE__);
+void free (void *);
+
+#ifdef __cplusplus
+}
+#endif
+
 volatile int ten = 10;
 int main(int argc, char **argv) {
   char *x = (char*)malloc(10);
@@ -14,8 +24,8 @@ int main(int argc, char **argv) {
 }
 
 /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:11|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:21|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
 /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes to the right of 10-byte region\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "allocated by thread T0 here:\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "#0 0x\[0-9a-f\]+ (in _*(interceptor_|)malloc|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:9|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#1 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*heap-overflow-1.c:19|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
--- gcc/testsuite/c-c++-common/asan/stack-overflow-1.c.jj   2012-12-13 00:02:50.0 +0100
+++ gcc/testsuite/c-c++-common/asan/stack-overflow-1.c  2013-01-28 13:48:41.046171347 +0100
@@ -2,9 +2,13 @@
 /* { dg-options "-fno-builtin-memset" } */
 /* { dg-shouldfail "asan" } */
 
-volatile int ten = 10;
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void *memset (void *, int, __SIZE_TYPE__);
 
-#include <string.h>
+volatile int ten = 10;
 
 int main() {
   char x[10];
@@ -14,5 +18,5 @@ int main() {
 }
 
 /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*stack-overflow-1.c:12|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*stack-overflow-1.c:16|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
 /* { dg-output "Address 0x\[0-9a-f\]+ is\[^\n\r]*frame " } */
--- gcc/testsuite/c-c++-common/asan/global-overflow-1.c.jj  2012-12-13 00:02:50.0 +0100
+++ gcc/testsuite/c-c++-common/asan/global-overflow-1.c 2013-01-28 13:46:13.900017787 +0100
@@ -2,7 +2,12 @@
 /* { dg-options "-fno-builtin-memset" } */
 /* { dg-shouldfail "asan" } */
 
-#include <string.h>
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void *memset (void *, int, __SIZE_TYPE__);
+
 volatile int ten = 10;
 
 int main() {
@@ -18,6 +23,6 @@ int main() {
 }
 
 /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*global-overflow-1.c:15|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ (in _*main (\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */
 /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes to the right of global variable" } */
 /* { dg-output ".*YYY\[^\n\r]* of size 10\[^\n\r]*(\n|\r\n|\r)" } */

Jakub


Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)

2013-01-28 Thread Richard Biener
On Mon, 28 Jan 2013, Jakub Jelinek wrote:

> Hi!
> 
> gimple_expand_builtin_pow last two optimizations rely on earlier
> optimizations in the same function to be performed, e.g.
> folding pow (x, c) for n = 2c into sqrt(x) * powi(x, n / 2) is only
> correct for c which isn't an integer (otherwise the sqrt(x) factor would
> need to be skipped), but they actually do not check this.
> E.g. the pow (x, n) where n is integer is optimized only if:
>   && ((n >= -1 && n <= 2)
>   || (flag_unsafe_math_optimizations
>   && optimize_insn_for_speed_p ()
>   && powi_cost (n) <= POWI_MAX_MULTS)))
> and as in the testcase the function is called, it isn't optimized and
> we fall through till the above mentioned optimization which blindly assumes
> that c isn't an integer.
> 
> Fixed by both checking that c isn't an integer (and for the last
> optimization also that 2c isn't an integer), and also not doing the
> -> sqrt(x) * powi(x, n / 2) resp. 1.0 / sqrt(x) * powi(x, abs(n) / 2)
> optimization for -Os or cold functions, at least
> __attribute__((cold)) double
> foo (double x, double n)
> {
>   return __builtin_pow (x, -1.5);
> }
> is smaller when expanded as pow call both on x86_64 and on powerpc (with
> -Os -ffast-math).  Even just the c*_is_int tests alone could be enough
> to fix the bug, so if you say want to enable it for -Os even with c 1.5,
> but not for negative values which add another operation, it can be adjusted.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2013-01-28  Jakub Jelinek  
> 
>   PR tree-optimization/56125
>   * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
>   pow(x,c) into sqrt(x) * powi(x, n/2) or
>   1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
>   optimizing for size.
>   Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
>   1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
>   integer.
> 
>   * gcc.dg/pr56125.c: New test.
> 
> --- gcc/tree-ssa-math-opts.c.jj   2013-01-11 09:02:48.0 +0100
> +++ gcc/tree-ssa-math-opts.c  2013-01-28 10:56:40.105950483 +0100
> @@ -1110,7 +1110,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
>HOST_WIDE_INT n;
>tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, result, cbrt_x, powi_cbrt_x;
>enum machine_mode mode;
> -  bool hw_sqrt_exists;
> +  bool hw_sqrt_exists, c_is_int, c2_is_int;
>  
>/* If the exponent isn't a constant, there's nothing of interest
>   to be done.  */
> @@ -1122,8 +1122,9 @@ gimple_expand_builtin_pow (gimple_stmt_i
>c = TREE_REAL_CST (arg1);
>n = real_to_integer (&c);
>real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
> +  c_is_int = real_identical (&c, &cint);
>  
> -  if (real_identical (&c, &cint)
> +  if (c_is_int
>&& ((n >= -1 && n <= 2)
> || (flag_unsafe_math_optimizations
> && optimize_insn_for_speed_p ()
> @@ -1221,7 +1222,8 @@ gimple_expand_builtin_pow (gimple_stmt_i
>return build_and_insert_call (gsi, loc, cbrtfn, sqrt_arg0);
>  }
>  
> -  /* Optimize pow(x,c), where n = 2c for some nonzero integer n, into
> +  /* Optimize pow(x,c), where n = 2c for some nonzero integer n
> + and c not an integer, into
>  
> sqrt(x) * powi(x, n/2),n > 0;
> 1.0 / (sqrt(x) * powi(x, abs(n/2))),   n < 0.
> @@ -1230,10 +1232,13 @@ gimple_expand_builtin_pow (gimple_stmt_i
>real_arithmetic (&c2, MULT_EXPR, &c, &dconst2);
>n = real_to_integer (&c2);
>real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
> +  c2_is_int = real_identical (&c2, &cint);
>  
>if (flag_unsafe_math_optimizations
>&& sqrtfn
> -  && real_identical (&c2, &cint))
> +  && c2_is_int
> +  && !c_is_int
> +  && optimize_function_for_speed_p (cfun))
>  {
>tree powi_x_ndiv2 = NULL_TREE;
>  
> @@ -1286,6 +1291,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
>&& cbrtfn
>&& (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode))
>&& real_identical (&c2, &c)
> +  && !c2_is_int
>&& optimize_function_for_speed_p (cfun)
>&& powi_cost (n / 3) <= POWI_MAX_MULTS)
>  {
> --- gcc/testsuite/gcc.dg/pr56125.c.jj 2013-01-28 11:00:04.359814742 +0100
> +++ gcc/testsuite/gcc.dg/pr56125.c2013-01-28 11:00:55.048532118 +0100
> @@ -0,0 +1,21 @@
> +/* PR tree-optimization/56125 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math" } */
> +
> +extern void abort (void);
> +extern double fabs (double);
> +
> +__attribute__((cold)) double
> +foo (double x, double n)
> +{
> +  double u = x / (n * n);
> +  return u;
> +}
> +
> +int
> +main ()
> +{
> +  if (fabs (foo (29, 2) - 7.25) > 0.001)
> +abort ();
> +  return 0;
> +}
>  
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Fel
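
The guard added by the patch above can be illustrated outside of GCC. The following is only a minimal sketch under stated assumptions: `is_integral` and `sqrt_rewrite_applies` are hypothetical names, the boolean parameters stand in for flag_unsafe_math_optimizations, the availability of sqrtfn and optimize_function_for_speed_p (cfun), and plain doubles replace GCC's REAL_VALUE_TYPE arithmetic.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC code: true if V has no fractional part.
   Only meant for small exponents that fit in a long long.  */
static bool
is_integral (double v)
{
  return v == (double) (long long) v;
}

/* Sketch of the patched condition for rewriting pow (x, c) into
   sqrt (x) * powi (x, n / 2) with n = 2c.  The rewrite is only valid
   when 2c is an integer but c itself is not; for an integer c the
   sqrt (x) factor would be wrong.  */
static bool
sqrt_rewrite_applies (double c, bool unsafe_math, bool have_sqrt,
                      bool optimize_for_speed)
{
  bool c_is_int = is_integral (c);
  bool c2_is_int = is_integral (2.0 * c);

  return unsafe_math && have_sqrt && c2_is_int && !c_is_int
         && optimize_for_speed;
}
```

With this sketch, an exponent of 1.5 or -1.5 qualifies when optimizing for speed, while -2.0 (an integer) and any exponent in a size-optimized or cold function do not.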

Re: [PATCH][RFC] Avoid excessive BLOCK associations for locations

2013-01-28 Thread Jakub Jelinek
On Mon, Jan 28, 2013 at 03:15:53PM +0100, Richard Biener wrote:
> Does this make sense?

Yes.  Wouldn't hurt to run GDB testsuite with that, though I bet most of it
is -O0 anyway and thus won't stress it out too much.

> 2013-01-28  Richard Biener  
> 
>   * tree-inline.c (remap_gimple_stmt): Do not assign a BLOCK
>   to a stmt that didn't have one.
>   (copy_phis_for_bb): Likewise for PHI arguments.
>   (copy_debug_stmt): Likewise for debug stmts.

Ok.

Jakub


[PATCH] Fix PR56034

2013-01-28 Thread Richard Biener

The following implements what I thought was present (eh ...).  For
partitions that contain reductions (feed loop closed PHI nodes) we
rely on them being in the last partition of the loop - as we do not
bother to copy / care for loop closed PHI nodes.  The following now
implements that fully (I've had a patch for that around), by both
seeding initial partition generation from scalar reductions and
taking care of merging them again into the very last partition of
the loop.

Completely disabling partitioning for loops with reductions would
have broken some existing testcases that happen to work because
for them the partitions are already in proper order.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-01-28  Richard Biener  

PR tree-optimization/56034
* tree-loop-distribution.c (enum partition_kind): Add
PKIND_REDUCTION.
(partition_builtin_p): Adjust.
(generate_code_for_partition): Handle PKIND_REDUCTION.  Assert
it is the last partition.
(rdg_flag_uses): Check SSA_NAME_IS_DEFAULT_DEF before looking
up the vertex for the definition.
(classify_partition): Classify whether a partition is a
PKIND_REDUCTION, thus has uses outside of the loop.
(ldist_gen): Inherit PKIND_REDUCTION when merging partitions.
Merge all PKIND_REDUCTION partitions into the last partition.
(tree_loop_distribution): Seed partitions from reductions as well.

* gcc.dg/torture/pr56034.c: New testcase.

Index: gcc/tree-loop-distribution.c
===
*** gcc/tree-loop-distribution.c(revision 195502)
--- gcc/tree-loop-distribution.c(working copy)
*** along with GCC; see the file COPYING3.
*** 51,57 
  #include "tree-scalar-evolution.h"
  #include "tree-pass.h"
  
! enum partition_kind { PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY };
  
  typedef struct partition_s
  {
--- 51,59 
  #include "tree-scalar-evolution.h"
  #include "tree-pass.h"
  
! enum partition_kind {
! PKIND_NORMAL, PKIND_REDUCTION, PKIND_MEMSET, PKIND_MEMCPY
! };
  
  typedef struct partition_s
  {
*** partition_free (partition_t partition)
*** 90,96 
  static bool
  partition_builtin_p (partition_t partition)
  {
!   return partition->kind != PKIND_NORMAL;
  }
  
  /* Returns true if the partition has an writes.  */
--- 92,98 
  static bool
  partition_builtin_p (partition_t partition)
  {
!   return partition->kind > PKIND_REDUCTION;
  }
  
  /* Returns true if the partition has an writes.  */
*** generate_code_for_partition (struct loop
*** 481,486 
--- 483,491 
destroy_loop (loop);
break;
  
+ case PKIND_REDUCTION:
+   /* Reductions all have to be in the last partition.  */
+   gcc_assert (!copy_p);
  case PKIND_NORMAL:
generate_loops_for_partition (loop, partition, copy_p);
break;
*** rdg_flag_uses (struct graph *rdg, int u,
*** 628,634 
{
  tree use = USE_FROM_PTR (use_p);
  
! if (TREE_CODE (use) == SSA_NAME)
{
  gimple def_stmt = SSA_NAME_DEF_STMT (use);
  int v = rdg_vertex_for_stmt (rdg, def_stmt);
--- 633,640 
{
  tree use = USE_FROM_PTR (use_p);
  
! if (TREE_CODE (use) == SSA_NAME
! && !SSA_NAME_IS_DEFAULT_DEF (use))
{
  gimple def_stmt = SSA_NAME_DEF_STMT (use);
  int v = rdg_vertex_for_stmt (rdg, def_stmt);
*** classify_partition (loop_p loop, struct
*** 858,882 
unsigned i;
tree nb_iter;
data_reference_p single_load, single_store;
  
partition->kind = PKIND_NORMAL;
partition->main_dr = NULL;
partition->secondary_dr = NULL;
  
-   if (!flag_tree_loop_distribute_patterns)
- return;
- 
-   /* Perform general partition disqualification for builtins.  */
-   nb_iter = number_of_exit_cond_executions (loop);
-   if (!nb_iter || nb_iter == chrec_dont_know)
- return;
- 
EXECUTE_IF_SET_IN_BITMAP (partition->stmts, 0, i, bi)
  {
gimple stmt = RDG_STMT (rdg, i);
  
if (gimple_has_volatile_ops (stmt))
!   return;
  
/* If the stmt has uses outside of the loop fail.
 ???  If the stmt is generated in another partition that
--- 864,881 
unsigned i;
tree nb_iter;
data_reference_p single_load, single_store;
+   bool volatiles_p = false;
  
partition->kind = PKIND_NORMAL;
partition->main_dr = NULL;
partition->secondary_dr = NULL;
  
EXECUTE_IF_SET_IN_BITMAP (partition->stmts, 0, i, bi)
  {
gimple stmt = RDG_STMT (rdg, i);
  
if (gimple_has_volatile_ops (stmt))
!   volatiles_p = true;
  
/* If the stmt has uses outside of the loop fail.
 ???  If the stmt is generated in another partition that
*** classify_partition (loop_p loop, struct
*** 
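
The change to partition_builtin_p in the patch above relies on where PKIND_REDUCTION is placed in the enum. A minimal standalone sketch of that ordering trick follows; the enum is copied from the patch, but the function here takes the kind directly instead of a partition_t, which is a simplification made for illustration only.

```c
#include <stdbool.h>

/* Same order as in the patch: the two kinds that are emitted as
   ordinary loops come first, the builtin kinds come last.  */
enum partition_kind {
  PKIND_NORMAL, PKIND_REDUCTION, PKIND_MEMSET, PKIND_MEMCPY
};

/* Simplified stand-in for partition_builtin_p: everything ordered
   after PKIND_REDUCTION is generated as a builtin call.  */
static bool
partition_builtin_p (enum partition_kind kind)
{
  return kind > PKIND_REDUCTION;
}
```

A kind added later would have to be inserted on the correct side of PKIND_REDUCTION for this comparison to stay correct.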

Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)

2013-01-28 Thread Bill Schmidt
LGTM!  Thanks for fixing this.

Bill

On Mon, 2013-01-28 at 15:25 +0100, Jakub Jelinek wrote:
> Hi!
> 
> gimple_expand_builtin_pow last two optimizations rely on earlier
> optimizations in the same function to be performed, e.g.
> folding pow (x, c) for n = 2c into sqrt(x) * powi(x, n / 2) is only
> correct for c which isn't an integer (otherwise the sqrt(x) factor would
> need to be skipped), but they actually do not check this.
> E.g. the pow (x, n) where n is integer is optimized only if:
>   && ((n >= -1 && n <= 2)
>   || (flag_unsafe_math_optimizations
>   && optimize_insn_for_speed_p ()
>   && powi_cost (n) <= POWI_MAX_MULTS)))
> and as in the testcase the function is called, it isn't optimized and
> we fall through till the above mentioned optimization which blindly assumes
> that c isn't an integer.
> 
> Fixed by both checking that c isn't an integer (and for the last
> optimization also that 2c isn't an integer), and also not doing the
> -> sqrt(x) * powi(x, n / 2) resp. 1.0 / sqrt(x) * powi(x, abs(n) / 2)
> optimization for -Os or cold functions, at least
> __attribute__((cold)) double
> foo (double x, double n)
> {
>   return __builtin_pow (x, -1.5);
> }
> is smaller when expanded as pow call both on x86_64 and on powerpc (with
> -Os -ffast-math).  Even just the c*_is_int tests alone could be enough
> to fix the bug, so if you say want to enable it for -Os even with c 1.5,
> but not for negative values which add another operation, it can be adjusted.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2013-01-28  Jakub Jelinek  
> 
>   PR tree-optimization/56125
>   * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
>   pow(x,c) into sqrt(x) * powi(x, n/2) or
>   1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
>   optimizing for size.
>   Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
>   1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
>   integer.
> 
>   * gcc.dg/pr56125.c: New test.
> 
> --- gcc/tree-ssa-math-opts.c.jj   2013-01-11 09:02:48.0 +0100
> +++ gcc/tree-ssa-math-opts.c  2013-01-28 10:56:40.105950483 +0100
> @@ -1110,7 +1110,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
>HOST_WIDE_INT n;
>tree type, sqrtfn, cbrtfn, sqrt_arg0, sqrt_sqrt, result, cbrt_x, powi_cbrt_x;
>enum machine_mode mode;
> -  bool hw_sqrt_exists;
> +  bool hw_sqrt_exists, c_is_int, c2_is_int;
> 
>/* If the exponent isn't a constant, there's nothing of interest
>   to be done.  */
> @@ -1122,8 +1122,9 @@ gimple_expand_builtin_pow (gimple_stmt_i
>c = TREE_REAL_CST (arg1);
>n = real_to_integer (&c);
>real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
> +  c_is_int = real_identical (&c, &cint);
> 
> -  if (real_identical (&c, &cint)
> +  if (c_is_int
>&& ((n >= -1 && n <= 2)
> || (flag_unsafe_math_optimizations
> && optimize_insn_for_speed_p ()
> @@ -1221,7 +1222,8 @@ gimple_expand_builtin_pow (gimple_stmt_i
>return build_and_insert_call (gsi, loc, cbrtfn, sqrt_arg0);
>  }
> 
> -  /* Optimize pow(x,c), where n = 2c for some nonzero integer n, into
> +  /* Optimize pow(x,c), where n = 2c for some nonzero integer n
> + and c not an integer, into
> 
> sqrt(x) * powi(x, n/2),n > 0;
> 1.0 / (sqrt(x) * powi(x, abs(n/2))),   n < 0.
> @@ -1230,10 +1232,13 @@ gimple_expand_builtin_pow (gimple_stmt_i
>real_arithmetic (&c2, MULT_EXPR, &c, &dconst2);
>n = real_to_integer (&c2);
>real_from_integer (&cint, VOIDmode, n, n < 0 ? -1 : 0, 0);
> +  c2_is_int = real_identical (&c2, &cint);
> 
>if (flag_unsafe_math_optimizations
>&& sqrtfn
> -  && real_identical (&c2, &cint))
> +  && c2_is_int
> +  && !c_is_int
> +  && optimize_function_for_speed_p (cfun))
>  {
>tree powi_x_ndiv2 = NULL_TREE;
> 
> @@ -1286,6 +1291,7 @@ gimple_expand_builtin_pow (gimple_stmt_i
>&& cbrtfn
>&& (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode))
>&& real_identical (&c2, &c)
> +  && !c2_is_int
>&& optimize_function_for_speed_p (cfun)
>&& powi_cost (n / 3) <= POWI_MAX_MULTS)
>  {
> --- gcc/testsuite/gcc.dg/pr56125.c.jj 2013-01-28 11:00:04.359814742 +0100
> +++ gcc/testsuite/gcc.dg/pr56125.c2013-01-28 11:00:55.048532118 +0100
> @@ -0,0 +1,21 @@
> +/* PR tree-optimization/56125 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math" } */
> +
> +extern void abort (void);
> +extern double fabs (double);
> +
> +__attribute__((cold)) double
> +foo (double x, double n)
> +{
> +  double u = x / (n * n);
> +  return u;
> +}
> +
> +int
> +main ()
> +{
> +  if (fabs (foo (29, 2) - 7.25) > 0.001)
> +abort ();
> +  return 0;
> +}
> 
>   Jakub
> 



Re: [PATCH][RFC] Avoid excessive BLOCK associations for locations

2013-01-28 Thread Michael Matz
Hi,

On Mon, 28 Jan 2013, Richard Biener wrote:

> What's the point of switching to the outermost scope for unknown-BLOCK
> locations?

It's the most sensical block for code which isn't otherwise associated 
with a BLOCK.  But the latter Shouldn't Happen, because conceptually all 
code runs in some (perhaps artificial) lookup context.  So, it's actually 
not the inliner which should fixup stuff here, but rather ...

> If we have a non-UNKNOWN_LOCATION, would a NULL BLOCK not be an error 
> anyway?

... whatever is producing such non-BLOCK code snippets.  But see below.

> Isn't inheriting the currently active scope much more useful (it 
> definitely is for UNKNOWN_LOCATIONs)?

And yes, the most likely useful block for such code will be the "currently 
active" block.  This is true only before code transformations of course; 
while optimizing you have the same problems as with locations, i.e. how 
to "merge" multiple different BLOCKs into one sensible one.

Now, as an implementation optimization (to not bloat the location/block 
sets perhaps) you can define block==NULL <--> block==outermost-scope, and 
in case you do so, it's indeed the inliner that needs to map NULL blocks 
to the mapped outermost scope of the inlined function.  I would guess that 
this is what historically was done, and when this optimization is still 
employed your patch is wrong.  IMHO this optimization should be used.


Ciao,
Michael.


Re: [Ping] [Patch, AArch64] Set libgloss_dir for aarch64*-*-* targets

2013-01-28 Thread Yufeng Zhang

Ping~

On 01/10/13 16:20, Yufeng Zhang wrote:

Hi,

This patch updates the top-level configuration files to explicitly set
libgloss_dir to aarch64 for aarch64*-*-* targets.

OK to commit?

Thanks,
Yufeng

2013-01-10  Yufeng Zhang

  * configure.ac: Set libgloss_dir for the aarch64*-*-* targets.
  * configure: Regenerated.


top-level-config.patch


diff --git a/configure.ac b/configure.ac
index 02720ee..5bdf1d0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -759,6 +759,9 @@ case "${target}" in
sh*-*-pe|mips*-*-pe|*arm-wince-pe)
  libgloss_dir=wince
  ;;
+  aarch64*-*-* )
+libgloss_dir=aarch64
+;;
arm*-*-*)
  libgloss_dir=arm
  ;;






Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld

2013-01-28 Thread David Edelsohn
On Mon, Jan 28, 2013 at 4:07 AM, Michael Haubenwallner
 wrote:

> But still curious if you've been able to reproduce the problem,
> and why you didn't encounter this problem beforehand.

As I mentioned before, because of --boot-ld-flags, with earlier libgcc
and libstdc++ installed in that directory.

> Yes, but (you've asked) here is this situation I don't want to configure
> extra deplib-prefixes for (remember bullfreeware is listed as provider
> for gcc-binaries):
>
> * bullfreeware's libiconv-1.13.1 and gettext-0.17 is installed in
>   /opt/freeware,
> * /usr/lib/libintl.a is symlinked to /opt/freeware/lib (by bullfreeware's
>   RPM),
> * /usr/lib/libiconv.a is the original AIX' one.
>
> Now, /usr/lib/libintl.a needs /opt/freeware/lib/libiconv.a[libiconv.so.2],
> and it does contain the correct RUNPATH. But subsequent binaries linking
> against /usr/lib/libintl.a don't (necessarily) know about the need to add
> /opt/freeware/lib as RUNPATH, so these binaries break with libiconv.so.2
> not being found as member of /usr/lib/libiconv.a, because AIX
> unfortunately does stop its shared-library search at the first archive
> filename found.
>
> This also is the main reason for my
> filename-based-shared-library-versioning thing.

Over the weekend, I successfully tested a different way to configure
and build: all static libraries.  If you build and privately install
GMP, MPFR, MPC and LIBICONV configured as static libraries
(--enable-static --disable-shared) and install in /prereq, then,
combined with your patch to enable --static-libstdc++ --static-libgcc,
the resulting GCC only depends on AIX libc.a -- no other shared
libraries. Bull Freeware can distribute the shared versions of the
libraries for other applications, but they do not need to be GCC
dependencies.

- David


Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support

2013-01-28 Thread nick clifton

Hi Matt,

> Could this patch, or perhaps the much smaller one I attached to bug
> 35294 be committed to the 4.7 branch?

Yes.  Done.

> Also, could you close its duplicates, bugs 36798 and 36966?

Sorry no.  I do not actually own these PRs, so I cannot close them. :-(

Cheers
  Nick




Re: [doc,committed] Fix missing ':' in inline asm example

2013-01-28 Thread Georg-Johann Lay
Georg-Johann Lay wrote:
> Applied as obvious:
> 
> http://gcc.gnu.org/r195471
> 
> 
>   * doc/extend.texi (Example of asm with clobbered asm reg): Fix
>   missing ':' in asm example.
> 

The patch:


--- trunk/gcc/doc/extend.texi   2013/01/25 17:55:09 195470
+++ trunk/gcc/doc/extend.texi   2013/01/25 18:11:53 195471
@@ -6062,7 +6062,7 @@
   int *y = &x;
   int result;
   asm ("magic stuff accessing an 'int' pointed to by '%1'"
-"=&d" (r) : "a" (y), "m" (*y));
+   : "=&d" (r) : "a" (y), "m" (*y));
   return result;
 @}
 @end smallexample


Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)

2013-01-28 Thread Marc Glisse

On Mon, 28 Jan 2013, Jakub Jelinek wrote:


> 2013-01-28  Jakub Jelinek
>
> PR tree-optimization/56125
> * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
> pow(x,c) into sqrt(x) * powi(x, n/2) or
> 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
> optimizing for size.
> Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
> 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
> integer.
>
> * gcc.dg/pr56125.c: New test.

Hello,

is there an implicit -lm in the testsuite?

The testcase now generates a library call to pow, like gcc-4.6. This is 
correct, but I am surprised this is considered better than leaving the 
original x/(n*n) unchanged... Should that be a different PR?


--
Marc Glisse


Re: [v3] Fix management of non empty hash functor

2013-01-28 Thread Jonathan Wakely
On 10 January 2013 21:02, François Dumont wrote:
> Hi
>
> Here is an other version of this patch. Indeed there were no need to
> expose many stuff public. Inheriting from _Hash_code_base is fine, it is not
> final and it deals with EBO itself. I only kept usage of
> _Hashtable_ebo_helper when embedding H2 functor. As it is an extension we
> could have impose it not to be final but it doesn't cost a lot to deal with
> it. Finally I only needed a single friend declaration to get access to the
> H2 part of _Hash_code_base.

OK.

> I didn't touch the default cache policy for the moment except reducing
> constraints on the hash functor. I prefer to submit an other patch to change
> when we cache or not depending on the hash functor expected performance.

OK.  The reduced constraints are good.  Does this actually affect
performance?  In my tests it doesn't, so I assume we still need to
change the caching decision to notice any performance improvements?

(Do the performance benchmarks actually tell us anything useful?
When I run them I get such varying results it doesn't seem to be reliable.)

> I also took the time to replace some typedef expressions with using
> ones. I really know what is the rule about using one or the other but I
> remembered that Benjamin spent quite some time changing typedef in using so
> I prefer to stick to this approach in this file, even if there are still
> some typedef left.

OK, that doesn't make any difference so isn't important which is used.


> Tested under linux x86_64 normal and debug modes.
>
> 2013-01-10  François Dumont  
>
>
> * include/bits/hashtable_policy.h (_Local_iterator_base): Use
> _Hashtable_ebo_helper to embed necessary functors into the
> local_iterator when necessary. Pass information about functors

Repeating "necessary" seems unnecessary here :)

> involved in hash code by copy.
> * include/bits/hashtable.h (__cache_default): Do not cache for
> builtin integral types unless the hash functor is not noexcept
> qualified or is not default constructible. Adapt static assertions
> and local iteraror instantiations.

^^ "iteraror"

+  // When hash codes are not cached local iterator inherits from
+  // __hash_code_base above to compute node bucket index so it has to be
+  // default constructible.
+  static_assert(__if_hash_not_cached<
+ is_default_constructible<__hash_code_base>>::value,
+   "Cache the hash code or make functors involved in hash code"
+   " and bucket index computation default constructibles");

"constructible" not "constructibles"

This is OK for trunk, but not 4.7


Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)

2013-01-28 Thread Jakub Jelinek
On Mon, Jan 28, 2013 at 04:41:31PM +0100, Marc Glisse wrote:
> On Mon, 28 Jan 2013, Jakub Jelinek wrote:
> 
> >2013-01-28  Jakub Jelinek  
> >
> > PR tree-optimization/56125
> > * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
> > pow(x,c) into sqrt(x) * powi(x, n/2) or
> > 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
> > optimizing for size.
> > Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
> > 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
> > integer.
> >
> > * gcc.dg/pr56125.c: New test.
> 
> is there an implicit -lm in the testsuite?

Yes.

> The testcase now generates a library call to pow, like gcc-4.6. This
> is correct, but I am surprised this is considered better than
> leaving the original x/(n*n) unchanged... Should that be a different
> PR?

The function in question is marked as cold, therefore it should be optimized
for size.  The call to pow is certainly shorter than the sqrt,
multiplication, division etc.

Jakub


Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)

2013-01-28 Thread Marc Glisse

On Mon, 28 Jan 2013, Jakub Jelinek wrote:


> On Mon, Jan 28, 2013 at 04:41:31PM +0100, Marc Glisse wrote:
>> On Mon, 28 Jan 2013, Jakub Jelinek wrote:
>>
>>> 2013-01-28  Jakub Jelinek
>>>
>>> PR tree-optimization/56125
>>> * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Don't optimize
>>> pow(x,c) into sqrt(x) * powi(x, n/2) or
>>> 1.0 / (sqrt(x) * powi(x, abs(n/2))) if c is an integer or when
>>> optimizing for size.
>>> Don't optimize pow(x,c) into powi(x, n/3) * powi(cbrt(x), n%3) or
>>> 1.0 / (powi(x, abs(n)/3) * powi(cbrt(x), abs(n)%3)) if 2c is an
>>> integer.
>>>
>>> * gcc.dg/pr56125.c: New test.
>>
>> The testcase now generates a library call to pow, like gcc-4.6. This
>> is correct, but I am surprised this is considered better than
>> leaving the original x/(n*n) unchanged... Should that be a different
>> PR?
>
> The function in question is marked as cold, therefore it should be optimized
> for size.  The call to pow is certainly shorter than the sqrt,
> multiplication, division etc.

There is no sqrt, x/(n*n) is just one mul and one div, whereas with the 
call I see one mul, 3 movs to prepare for the call, and the call.


--
Marc Glisse


Re: [PATCH] Fix up pow folding (PR tree-optimization/56125)

2013-01-28 Thread Jakub Jelinek
On Mon, Jan 28, 2013 at 05:07:10PM +0100, Marc Glisse wrote:
> There is no sqrt, x/(n*n) is just one mul and one div, whereas with
> the call I see one mul, 3 movs to prepare for the call, and the
> call.

Ah, you're talking about the checked-in testcase, rather than the one I
mentioned in the description when asking whether the speed guard is
desirable there or not.  In the checked-in testcase, the problem with code
size arises far earlier than that, already during folding, where
double u = x / (n * n);
is replaced by:
double u = x * __builtin_pow (n, -2.0e+0);
And this isn't something you can then size optimize in the pow folder on its
own: return pow (n, -2.0e+0); will supposedly be shorter than
return 1.0 / (n * n); the folding doesn't see that the result is used in a
multiplication which could perhaps be changed into a division instead.

Jakub


Re: [PATCH] Fix sched ICE with prefetch (PR rtl-optimization/56117)

2013-01-28 Thread Jeff Law

On 01/28/2013 07:14 AM, Jakub Jelinek wrote:

> Hi!
>
> We ICE on the following testcase when using cselib, because
> cselib_lookup* is never called on the PREFETCH argument, and
> add_insn_mem_dependence calls cselib_subst_to_values on it, which
> assumes cselib_lookup* already happened on it earlier.
> For MEMs sched_analyze_2 calls cselib_lookup_from_insn, but for PREFETCHes
> it didn't.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?
>
> 2013-01-28  Jakub Jelinek
>
> PR rtl-optimization/56117
> * sched-deps.c (sched_analyze_2) : For use_cselib
> call cselib_lookup_from_insn on the MEM before calling
> add_insn_mem_dependence.
>
> * gcc.dg/pr56117.c: New test.

I'm assuming that we don't need the shallow_copy_rtx call and related 
code because in the PREFETCH case we generate a new MEM and the 
underlying address can be safely shared.  Right?


If that's true, OK.

jeff




Re: [PATCH] Fix sched ICE with prefetch (PR rtl-optimization/56117)

2013-01-28 Thread Jakub Jelinek
On Mon, Jan 28, 2013 at 09:39:00AM -0700, Jeff Law wrote:
> I'm assuming that we don't need the shallow_copy_rtx call and
> related code because in the PREFETCH case we generate a new MEM and
> the underlying address can be safely shared.  Right?

AFAIK cselib_lookup* never modifies the rtx it is passed,
shallow_copy_rtx in the MEM case is for:
  t = shallow_copy_rtx (dest);
  cselib_lookup_from_insn (XEXP (t, 0), address_mode, 1,
   GET_MODE (t), insn);
  XEXP (t, 0)
= cselib_subst_to_values_from_insn (XEXP (t, 0), GET_MODE (t),
insn);
where we modify XEXP (t, 0) in the last assignment and don't want to change
XEXP (dest, 0).

Jakub
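
The reason for the shallow copy described above can be shown with a small standalone sketch. It does not use GCC's rtx machinery: `struct node` and `shallow_copy` are hypothetical stand-ins for a MEM rtx and shallow_copy_rtx, and the single string field plays the role of XEXP (t, 0).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a MEM rtx with one operand slot.  */
struct node
{
  const char *operand0;
};

/* Hypothetical analogue of shallow_copy_rtx: duplicates the node
   itself, while the copy still shares whatever the operand points to.
   The caller owns the returned node and must free it.  */
static struct node *
shallow_copy (const struct node *orig)
{
  struct node *copy = malloc (sizeof *copy);

  assert (copy != NULL);
  memcpy (copy, orig, sizeof *copy);
  return copy;
}
```

Overwriting the operand slot of the copy leaves the original node untouched, which is the property the MEM case needs before substituting values into XEXP (t, 0). A lookup that never writes to its argument, as described for cselib_lookup*, needs no such copy.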


Re: [committed] Avoid setting gimple_location of force_gimple_operand* created stmts to DECL_SOURCE_LOCATION of current fn (PR tree-optimization/56094)

2013-01-28 Thread Jeff Law

On 01/28/2013 07:09 AM, Jakub Jelinek wrote:

> Hi!
>
> As discussed in the PR, this is a safer variant of a fix for 4.8, where
> input_location during most optimization passes is set to
> DECL_SOURCE_LOCATION (current_function_decl) and various parts of the
> gimplifier e.g. during force_gimple_operand* may end up setting
> gimple_location to that.  For 4.9, we should revert this and set
> input_location to UNKNOWN_LOCATION for the optimizers.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
>
> 2013-01-28  Jakub Jelinek
>
> PR tree-optimization/56094
> * gimplify.c (force_gimple_operand_1): Temporarily set input_location
> to UNKNOWN_LOCATION while gimplifying expr.
>
> * gcc.dg/pr56094.c: New test.

Based on c#15, we should probably consider this a bit of a band-aid, 
right?   Thus we'll install the band-aid, but keep the PR open pending a 
better solution for handling of input_location, correct?


jeff

> +/* Verify no statements get the location of the foo () decl.  */
> +/* { dg-final { scan-tree-dump-not " : 65:1\\\]" "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
>
> Jakub





Re: [committed] Avoid setting gimple_location of force_gimple_operand* created stmts to DECL_SOURCE_LOCATION of current fn (PR tree-optimization/56094)

2013-01-28 Thread Jakub Jelinek
On Mon, Jan 28, 2013 at 09:50:35AM -0700, Jeff Law wrote:
> >2013-01-28  Jakub Jelinek  
> >
> > PR tree-optimization/56094
> > * gimplify.c (force_gimple_operand_1): Temporarily set input_location
> > to UNKNOWN_LOCATION while gimplifying expr.
> >
> > * gcc.dg/pr56094.c: New test.
> Based on c#15, we should probably consider this a bit of a band-aid,
> right?

Yeah.

> Thus we'll install the band-aid, but keep the PR open
> pending a better solution for handling of input_location, correct?

That's what I did.

Jakub
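
The ChangeLog entry quoted above describes a save, override and restore of a global. A minimal sketch of that pattern follows, assuming nothing about the real gimplifier: `location_t`, `UNKNOWN_LOC`, `current_location` and both functions are hypothetical stand-ins for GCC's location type, UNKNOWN_LOCATION, input_location and the gimplification code.

```c
/* Hypothetical stand-ins, not GCC definitions.  */
typedef unsigned int location_t;
#define UNKNOWN_LOC ((location_t) 0)

/* Plays the role of input_location; 65 is an arbitrary line number.  */
static location_t current_location = 65;

/* Hypothetical worker that stamps a new statement with whatever the
   global location currently is.  */
static location_t
make_stmt (void)
{
  return current_location;
}

/* Save the global, clear it for the duration of the call, then
   restore it, so statements created inside get no location.  */
static location_t
make_stmt_without_location (void)
{
  location_t saved = current_location;
  location_t stmt_loc;

  current_location = UNKNOWN_LOC;
  stmt_loc = make_stmt ();
  current_location = saved;
  return stmt_loc;
}
```

The caller sees the global unchanged afterwards, while the statement created in between carries no location.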


[Patch,avr] Remove fixed-point MUL and DIV routines from libgcc build

2013-01-28 Thread Georg-Johann Lay
This removes modules from libgcc that are already supported by avr-specific
fixed-point implementation and avoids duplicate functions like __mulsa3.

Ok for trunk?

Johann


libgcc/
* config/avr/t-avr (LIB2FUNCS_EXCLUDE): Add:
_mulQQ,  _mulHQ,  _mulHA,  _mulSA,
_mulUQQ, _mulUHQ, _mulUHA, _mulUSA,
_divQQ,  _divHQ,  _divHA,  _divSA,
_divUQQ, _divUHQ, _divUHA, _divUSA.
Index: config/avr/t-avr
===
--- config/avr/t-avr	(revision 195301)
+++ config/avr/t-avr	(working copy)
@@ -164,3 +164,17 @@ LIB2FUNCS_EXCLUDE += \
 LIB2FUNCS_EXCLUDE += \
 	$(foreach func,_usadd _ussub _usneg,\
 	$(foreach mode,$(usat_modes),$(func_X)))
+
+
+smul_modes =  QQ  HQ  HA  SA
+umul_modes = UQQ UHQ UHA USA
+sdiv_modes =  QQ  HQ  HA  SA
+udiv_modes = UQQ UHQ UHA USA
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach func,_mul,\
+	$(foreach mode,$(smul_modes) $(umul_modes),$(func_X)))
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach func,_div,\
+	$(foreach mode,$(sdiv_modes) $(udiv_modes),$(func_X)))


Re: [Patch] Fix PR54814

2013-01-28 Thread Jeff Law

On 01/27/2013 03:26 PM, Steven Bosscher wrote:

> On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote:
>> The patch was originally worked out by Bernd Schmidt and fixed a problem
>> introduced in
>>
>> http://gcc.gnu.org/r190252
>
> Ironically, this revision fixes a reload problem on x86/x86_64 --
> which doesn't use reload anymore now...
>
>> Does this mean the fix is rejected for 4.8?
>
> No, just that it probably helps to add a RM to the CC list.
>
> FWIW, it seems to me that this patch should go into 4.8, because the
> bug is probably not limited to AVR.

At this stage, I tend to be more conservative.  However, it looks like 
Ulrich & Richi have taken a looksie and think the patch is fine.  I'm 
certainly not going to object.


jeff


Re: [Patch] Fix PR54814

2013-01-28 Thread Jeff Law

On 01/27/2013 03:09 PM, Georg-Johann Lay wrote:



>> If not, it'll probably need release manager approval before it can go
>> in.
>>
>> Please attach your patch to PR54814 and attach PR 54814 to the 4.9
>> pending patches meta bug.
>
> Does this mean the fix is rejected for 4.8?

Not necessarily.  We're in a regression bugfix only stage; so 
regressions can obviously be fixed.  If a change does not fix a 
regression, then it really needs the release manager's approval to go 
forward at this stage.


Jeff


Re: [Patch] Fix PR54814

2013-01-28 Thread Jeff Law

On 01/28/2013 06:55 AM, Ulrich Weigand wrote:

Richard Biener wrote:

On Sun, Jan 27, 2013 at 11:26 PM, Steven Bosscher  wrote:

On Sun, Jan 27, 2013 at 11:09 PM, Georg-Johann Lay wrote:

The patch was originally worked out by Bernd Schmidt and fixed a problem
introduced in

http://gcc.gnu.org/r190252


Ironically, this revision fixes a reload problem on x86/x86_64 --
which doesn't use reload anymore now...



Does this mean the fix is rejected for 4.8?


No, just that it probably helps to add a RM to the CC list.

FWIW, it seems to me that this patch should go into 4.8, because the
bug is probably not limited to AVR.


Indeed, the fix also looks quite obvious though I know nothing about the
code at all.

Thus, ok from a RM perspective if a reload-affine person approves it.


The patch was originally by Bernd, but FWIW it looks good to me as well.
Now that I know this is a regression, I've looked at it more closely and 
it looks good to me too.


Georg-Johann, please install this onto the trunk.  Thanks,

Jeff



Re: RFA: RL78: Allow SP to be used as a base register

2013-01-28 Thread DJ Delorie

>   Please may I apply the patch below.  It fixes the RL78 backend so that
>   the stack register can be used as a base address register.

Yes, please.  Thanks!



Re: [PATCH] Adding target rdos to GCC

2013-01-28 Thread Leif Ekblad

Uros,

That is intentional. The gthr-rdos.h file is part of libgcc. My intention 
was to first patch gcc, then update the patches for newlib, and finally 
libgcc. The gthr-rdos.h file would reference include-files part of newlib, 
so this is kind of circular. I also cannot define the thread model for RDOS 
unless I define this file.


I see a couple of possible solutions:
1. Keep as is. You cannot build libgcc at the current stage anyway, and the 
bootstrap must be built without threading

2. Add an empty gthr-rdos.h file until libgcc is done
3. Remove the threading-model for now, and add it with libgcc instead.

Regards,
Leif Ekblad



- Original Message - 
From: "Uros Bizjak" 

To: "Leif Ekblad" 
Cc: "Richard Biener" ; ; "H.J. 
Lu" ; "Jakub Jelinek" 

Sent: Monday, January 28, 2013 8:23 AM
Subject: Re: [PATCH] Adding target rdos to GCC



On Mon, Jan 28, 2013 at 7:50 AM, Leif Ekblad  wrote:


If the patch is ok, could some maintainer add it to trunk?


There is no gthr-rdos.h file in your patch:

*** gcc-4.8-20121230/config/gthr.m4 2012-10-15 15:10:30.0 +0200
--- gcc-work/config/gthr.m4 2013-01-07 10:14:04.620667900 +0100
***
*** 21,26 
--- 21,27 
 tpf) thread_header=config/s390/gthr-tpf.h ;;
 vxworks) thread_header=config/gthr-vxworks.h ;;
 win32) thread_header=config/i386/gthr-win32.h ;;
+ rdos) thread_header=config/i386/gthr-rdos.h ;;

This file should be part of libgcc, so it needs its own ChangeLog.

Uros. 




Re: [PATCH] Adding target rdos to GCC

2013-01-28 Thread Uros Bizjak
On Mon, Jan 28, 2013 at 8:57 PM, Leif Ekblad  wrote:

> That is intentional. The gthr-rdos.h file is part of libgcc. My intention
> was to first patch gcc, then update the patches for newlib, and finally
> libgcc. The gthr-rdos.h file would reference include-files part of newlib,
> so this is kind of circular. I also cannot define the thread model for RDOS
> unless I define this file.
>
> I see a couple of possible solutions:
> 1. Keep as is. You cannot build libgcc at the current stage anyway, and the
> bootstrap must be built without threading
> 2. Add an empty gthr-rdos.h file until libgcc is done
> 3. Remove the threading-model for now, and add it with libgcc instead.

I propose option 3.

Is it enough to remove gthr.m4 change from the patch in this case?

Uros.


Re: [PATCH] Adding target rdos to GCC

2013-01-28 Thread Leif Ekblad


- Original Message - 
From: "Uros Bizjak" 

To: "Leif Ekblad" 
Cc: "Richard Biener" ; ; "H.J. 
Lu" ; "Jakub Jelinek" 

Sent: Monday, January 28, 2013 9:03 PM
Subject: Re: [PATCH] Adding target rdos to GCC



On Mon, Jan 28, 2013 at 8:57 PM, Leif Ekblad  wrote:


That is intentional. The gthr-rdos.h file is part of libgcc. My intention
was to first patch gcc, then update the patches for newlib, and finally
libgcc. The gthr-rdos.h file would reference include-files part of 
newlib,
so this is kind of circular. I also cannot define the thread model for 
RDOS

unless I define this file.

I see a couple of possible solutions:
1. Keep as is. You cannot build libgcc at the current stage anyway, and 
the

bootstrap must be built without threading
2. Add an empty gthr-rdos.h file until libgcc is done
3. Remove the threading-model for now, and add it with libgcc instead.


I propose option 3.

Is it enough to remove gthr.m4 change from the patch in this case?

Uros.


Yes, for all practical purposes. There is a reference to thread-file in 
config.gcc, when threading is enabled, which doesn't work for bootstrapping 
the compiler anyway.


Regards,
Leif Ekblad



[SH] PR 56121 - fix libgcc build for SH2A

2013-01-28 Thread Oleg Endo
Hi,

This is the same patch that I attached in the PR.
It fixes an ICE when building libgcc for the SH2A target.
Tested on rev. 195493 with
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

... comparing the test results against rev 193342 shows a few new
failures, but they seem unrelated to this case.

OK for trunk?

Cheers,
Oleg

gcc/ChangeLog:

PR target/56121
* config/sh/sh.md (bclr_m2a, bset_m2a, bst_m2a, bld_m2a, 
bldsign_m2a, bld_reg, *bld_regqi, band_m2a, bandreg_m2a, 
bor_m2a, borreg_m2a, bxor_m2a, bxorreg_m2a): Add 
satisfies_constraint_K03 condition.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 195493)
+++ gcc/config/sh/sh.md	(working copy)
@@ -13140,6 +13140,8 @@
 })
 
 ;; SH2A instructions for bitwise operations.
+;; FIXME: Convert multiple instruction insns to insn_and_split.
+;; FIXME: Use iterators to fold at least and,xor,or insn variations.
 
 ;; Clear a bit in a memory location.
 (define_insn "bclr_m2a"
@@ -13148,7 +13150,7 @@
 	(not:QI (ashift:QI (const_int 1)
 			(match_operand:QI 1 "const_int_operand" "K03,K03")))
 	(match_dup 0)))]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	bclr.b	%1,%0
 	bclr.b	%1,@(0,%t0)"
@@ -13171,7 +13173,7 @@
 	(ashift:QI (const_int 1)
 		   (match_operand:QI 1 "const_int_operand" "K03,K03"))
 	(match_dup 0)))]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	bset.b	%1,%0
 	bset.b	%1,@(0,%t0)"
@@ -13198,7 +13200,7 @@
 	(ior:QI
 		(ashift:QI (const_int 1) (match_dup 1))
 		(match_dup 0]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	bst.b	%1,%0
 	bst.b	%1,@(0,%t0)"
@@ -13211,7 +13213,7 @@
 	(match_operand:QI 0 "bitwise_memory_operand" "Sbw,Sbv")
 	(const_int 1)
 	(match_operand 1 "const_int_operand" "K03,K03")))]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	bld.b	%1,%0
 	bld.b	%1,@(0,%t0)"
@@ -13224,7 +13226,7 @@
 	(match_operand:QI 0 "bitwise_memory_operand" "Sbw,m")
 	(const_int 1)
 	(match_operand 1 "const_int_operand" "K03,K03")))]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	bld.b	%1,%0
 	bld.b	%1,@(0,%t0)"
@@ -13236,7 +13238,7 @@
 	(zero_extract:SI (match_operand:SI 0 "arith_reg_operand" "r")
 			 (const_int 1)
 			 (match_operand 1 "const_int_operand" "K03")))]
-  "TARGET_SH2A"
+  "TARGET_SH2A && satisfies_constraint_K03 (operands[1])"
   "bld	%1,%0")
 
 (define_insn "*bld_regqi"
@@ -13244,7 +13246,7 @@
 	(zero_extract:SI (match_operand:QI 0 "arith_reg_operand" "r")
 			 (const_int 1)
 			 (match_operand 1 "const_int_operand" "K03")))]
-  "TARGET_SH2A"
+  "TARGET_SH2A && satisfies_constraint_K03 (operands[1])"
   "bld	%1,%0")
 
 ;; Take logical and of a specified bit of memory with the T bit and
@@ -13256,7 +13258,7 @@
 		(match_operand:QI 0 "bitwise_memory_operand" "Sbw,m")
 		(const_int 1)
 		(match_operand 1 "const_int_operand" "K03,K03"]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	band.b	%1,%0
 	band.b	%1,@(0,%t0)"
@@ -13269,7 +13271,7 @@
 		(const_int 1)
 		(match_operand 2 "const_int_operand" "K03,K03"))
 	(match_operand:SI 3 "register_operand" "r,r")))]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[2])"
 {
   static const char* alt[] =
   {
@@ -13292,7 +13294,7 @@
 		(match_operand:QI 0 "bitwise_memory_operand" "Sbw,m")
 		(const_int 1)
 		(match_operand 1 "const_int_operand" "K03,K03"]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	bor.b	%1,%0
 	bor.b	%1,@(0,%t0)"
@@ -13305,7 +13307,7 @@
 		(const_int 1)
 		(match_operand 2 "const_int_operand" "K03,K03"))
 		(match_operand:SI 3 "register_operand" "=r,r")))]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[2])"
 {
   static const char* alt[] =
   {
@@ -13328,7 +13330,7 @@
 		(match_operand:QI 0 "bitwise_memory_operand" "Sbw,m")
 		(const_int 1)
 		(match_operand 1 "const_int_operand" "K03,K03"]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constraint_K03 (operands[1])"
   "@
 	bxor.b	%1,%0
 	bxor.b	%1,@(0,%t0)"
@@ -13341,7 +13343,7 @@
 		(const_int 1)
 		(match_operand 2 "const_int_operand" "K03,K03"))
 		(match_operand:SI 3 "register_operand" "=r,r")))]
-  "TARGET_SH2A && TARGET_BITOPS"
+  "TARGET_SH2A && TARGET_BITOPS && satisfies_constrain

Re: [PATCH] Adding target rdos to GCC

2013-01-28 Thread Uros Bizjak
On Mon, Jan 28, 2013 at 9:14 PM, Leif Ekblad  wrote:

>>> That is intentional. The gthr-rdos.h file is part of libgcc. My intention
>>> was to first patch gcc, then update the patches for newlib, and finally
>>> libgcc. The gthr-rdos.h file would reference include-files part of
>>> newlib,
>>> so this is kind of circular. I also cannot define the thread model for
>>> RDOS
>>> unless I define this file.
>>>
>>> I see a couple of possible solutions:
>>> 1. Keep as is. You cannot build libgcc at the current stage anyway, and
>>> the
>>> bootstrap must be built without threading
>>> 2. Add an empty gthr-rdos.h file until libgcc is done
>>> 3. Remove the threading-model for now, and add it with libgcc instead.
>>
>>
>> I propose option 3.
>>
>> Is it enough to remove gthr.m4 change from the patch in this case?
>
> Yes, for all practical purposes. There is a reference to thread-file in
> config.gcc, when threading is enabled, which doesn't work for bootstrapping
> the compiler anyway.

Thanks for pointing it out; I have also removed this reference.

Attached is the patch that has been committed to SVN. I have added
missing licence headers to new files and cleaned up whitespace a bit.

2013-01-28  Leif Ekblad  

* config.gcc (i[34567]86-*-rdos*, x86_64-*-rdos*): New targets.
* config/i386/i386.h (TARGET_RDOS): New macro.
(DEFAULT_LARGE_SECTION_THRESHOLD): New macro.
* config/i386/i386.c (ix86_option_override_internal): For 64bit
TARGET_RDOS, set ix86_cmodel to CM_MEDIUM_PIC and flag_pic to 1.
* config/i386/i386.opt (mlarge-data-threshold): Initialize to
DEFAULT_LARGE_SECTION_THRESHOLD.
* config/i386/i386.md (R14_REG, R15_REG): New constants.
* config/i386/rdos.h: New file.
* config/i386/rdos64.h: New file.

Thanks,
Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 195515)
+++ config/i386/i386.c  (working copy)
@@ -3235,10 +3235,12 @@ ix86_option_override_internal (bool main_args_p)
 DLL, and is essentially just as efficient as direct addressing.  */
   if (TARGET_64BIT && DEFAULT_ABI == MS_ABI)
ix86_cmodel = CM_SMALL_PIC, flag_pic = 1;
+  else if (TARGET_64BIT && TARGET_RDOS)
+   ix86_cmodel = CM_MEDIUM_PIC, flag_pic = 1;
   else if (TARGET_64BIT)
ix86_cmodel = flag_pic ? CM_SMALL_PIC : CM_SMALL;
   else
-ix86_cmodel = CM_32;
+   ix86_cmodel = CM_32;
 }
   if (TARGET_MACHO && ix86_asm_dialect == ASM_INTEL)
 {
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 195515)
+++ config/i386/i386.h  (working copy)
@@ -518,6 +518,9 @@ extern tree x86_mfence;
 #define MACHOPIC_INDIRECT 0
 #define MACHOPIC_PURE 0
 
+/* For the RDOS  */
+#define TARGET_RDOS 0
+
 /* For the Windows 64-bit ABI.  */
 #define TARGET_64BIT_MS_ABI (TARGET_64BIT && ix86_cfun_abi () == MS_ABI)
 
@@ -2081,6 +2084,10 @@ do { 
\
asm (SECTION_OP "\n\t"  \
"call " CRT_MKSTR(__USER_LABEL_PREFIX__) #FUNC "\n" \
TEXT_SECTION_ASM_OP);
+
+/* Default threshold for putting data in large sections
+   with x86-64 medium memory model */
+#define DEFAULT_LARGE_SECTION_THRESHOLD 65536
 
 /* Which processor to tune code generation for.  */
 
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 195515)
+++ config/i386/i386.md (working copy)
@@ -300,6 +300,8 @@
(R11_REG    40)
(R12_REG    41)
(R13_REG    42)
+   (R14_REG    43)
+   (R15_REG    44)
(XMM8_REG   45)
(XMM9_REG   46)
(XMM10_REG  47)
Index: config/i386/i386.opt
===
--- config/i386/i386.opt(revision 195515)
+++ config/i386/i386.opt(working copy)
@@ -140,7 +140,7 @@ Target RejectNegative Joined UInteger Var(ix86_bra
 Branches are this expensive (1-5, arbitrary units)
 
 mlarge-data-threshold=
-Target RejectNegative Joined UInteger Var(ix86_section_threshold) Init(65536)
+Target RejectNegative Joined UInteger Var(ix86_section_threshold) 
Init(DEFAULT_LARGE_SECTION_THRESHOLD)
 Data greater than given threshold will go into .ldata section in x86-64 medium 
model
 
 mcmodel=
Index: config/i386/rdos.h
===
--- config/i386/rdos.h  (revision 0)
+++ config/i386/rdos.h  (working copy)
@@ -0,0 +1,33 @@
+/* Definitions for RDOS on i386.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as publi

Re: [Patch, fortran] PR56008 (and PR47517) [F03] wrong code with lhs-realloc on assignment with derived types having allocatable components

2013-01-28 Thread Paul Richard Thomas
**ping**

On 23 January 2013 11:06, Tobias Burnus  wrote:
> Paul Richard Thomas wrote:
>>
>> *** gfc_alloc_allocatable_for_assignment (gf
>> *** 8224,8229 
>> --- 8250,8262 
>> desc, tmp);
>>  tmp = gfc_conv_descriptor_dtype (desc);
>>  gfc_add_modify (&alloc_block, tmp, gfc_get_dtype (TREE_TYPE (desc)));
>> +   if ((expr1->ts.type == BT_DERIVED)
>> +   && expr1->ts.u.derived->attr.alloc_comp)
>> + {
>> +   tmp = gfc_nullify_alloc_comp (expr1->ts.u.derived, desc,
>> +   expr1->rank);
>> +   gfc_add_expr_to_block (&alloc_block, tmp);
>> + }
>>  alloc_expr = gfc_finish_block (&alloc_block);
>
>
> When glancing at the patch, I wondered whether it would be better to use
> CALLOC instead of MALLOC and avoid the nullification:
>
> /* Malloc expression. */
> gfc_init_block (&alloc_block);
> tmp = build_call_expr_loc (input_location,
> builtin_decl_explicit (BUILT_IN_MALLOC),
> 1, size2);
>
> On the other hand, the nullification is probably still required for REALLOC.
> If so, the question is whether CALLOC + nullify in the realloc branch - or
> malloc + nullify after the realloc/malloc branches is better. Hence, your
> version is probably fine.
>
> Sorry for not yet reviewing your patch.
>
> Tobias
>
> PS: Regarding "allocatable" and "memory leak": PR55603 has as similar issue.
> For scalars, gfortran never frees allocatable function results; that's
> independent of the LHS (allocatable, pointer, neither). Thus, if you are in
> the mood of fixing those kind of bugs … (Actually, I am not even sure
> whether that's restricted to allocation, it might also occur with
> expressions like "a = f() + 5". Untested.)



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy


Re: [v3] Fix management of non empty hash functor

2013-01-28 Thread François Dumont

Attached patch applied.

2013-01-28  François Dumont  

* include/bits/hashtable_policy.h (_Local_iterator_base): Use
_Hashtable_ebo_helper to embed functors into the local_iterator
when necessary. Pass information about functors involved in hash
code by copy.
* include/bits/hashtable.h (__cache_default): Do not cache for
builtin integral types unless the hash functor is not noexcept
qualified or is not default constructible. Adapt static assertions
and local iterator instantiations.
* include/debug/unordered_set
(std::__debug::unordered_set<>::erase): Detect local iterators to
invalidate using contained node rather than generating a dummy
local_iterator instance.
(std::__debug::unordered_multiset<>::erase): Likewise.
* include/debug/unordered_map
(std::__debug::unordered_map<>::erase): Likewise.
(std::__debug::unordered_multimap<>::erase): Likewise.
* testsuite/performance/23_containers/insert_erase/41975.cc: Test
std::tr1 and std versions of unordered_set regardless of any
macro. Add test on default cache behavior.
* testsuite/performance/23_containers/insert/54075.cc: Likewise.
* testsuite/23_containers/unordered_set/instantiation_neg.cc:
Adapt line number.
* testsuite/23_containers/unordered_set/
not_default_constructible_hash_neg.cc: New.
* testsuite/23_containers/unordered_set/buckets/swap.cc: New.

On 01/28/2013 04:42 PM, Jonathan Wakely wrote:

On 10 January 2013 21:02, François Dumont wrote:

Hi

 Here is another version of this patch. Indeed, there was no need to
expose so much as public. Inheriting from _Hash_code_base is fine: it is not
final and it deals with EBO itself. I only kept the use of
_Hashtable_ebo_helper when embedding the H2 functor. As it is an extension we
could have required it not to be final, but it doesn't cost much to deal with
it. Finally, I only needed a single friend declaration to get access to the
H2 part of _Hash_code_base.
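
For illustration only, here is a minimal sketch of the empty-base trick that the EBO discussion above refers to. It is not the libstdc++ code: functor_holder, local_iterator_sketch, empty_hash and seeded_hash are hypothetical names, and std::is_final assumes a C++14 compiler. The holder derives from the functor when it is empty and not final, and stores it as a member otherwise, so a stateful (non-empty) hash functor still works.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: derive from T when it is empty and not final (so it
// can take up no storage), otherwise hold it as a member.
template <typename T,
          bool UseBase = std::is_empty<T>::value && !std::is_final<T>::value>
struct functor_holder;

template <typename T>
struct functor_holder<T, true> : private T
{
  functor_holder () = default;
  explicit functor_holder (const T &t) : T (t) {}
  const T &get () const { return static_cast<const T &> (*this); }
};

template <typename T>
struct functor_holder<T, false>
{
  functor_holder () = default;
  explicit functor_holder (const T &t) : held (t) {}
  const T &get () const { return held; }
private:
  T held;
};

// Hypothetical local iterator that embeds a copy of the hash functor
// next to its own state.
template <typename Hash>
struct local_iterator_sketch : private functor_holder<Hash>
{
  local_iterator_sketch (const Hash &h, std::size_t bucket_count)
  : functor_holder<Hash> (h), n_buckets (bucket_count) {}

  std::size_t bucket_of (int key) const
  { return this->get () (key) % n_buckets; }

private:
  std::size_t n_buckets;
};

// An empty functor and a functor carrying state.
struct empty_hash
{
  std::size_t operator() (int k) const
  { return static_cast<std::size_t> (k); }
};

struct seeded_hash
{
  std::size_t seed;
  std::size_t operator() (int k) const
  { return static_cast<std::size_t> (k) + seed; }
};
```

With common compilers, local_iterator_sketch<empty_hash> should be no larger than its bucket count member, while local_iterator_sketch<seeded_hash> also carries the seed.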

OK.


 I didn't touch the default cache policy for the moment, except for reducing
the constraints on the hash functor. I prefer to submit another patch to change
whether we cache or not depending on the hash functor's expected performance.

OK.  The reduced constraints are good.  Does this actually affect
performance?  In my tests it doesn't, so I assume we still need to
change the caching decision to notice any performance improvements?


No performance gain is planned with that patch, indeed. It just restores
support for non-empty hash functors, which used to work with the previous
implementation. There is also no performance test affected by the
modification of the default cache behavior, so it is not surprising that
you noticed nothing.




(Do the performance benchmarks actually tell us anything useful?
When I run them I get such varying results it doesn't seem to be reliable.)
Last time I ran the tests, they showed that not caching was better
than caching. I have even added a benchmark on the unordered containers
directly to show the performance of the default behavior. For the
moment, for the Foo type used in 54075.cc, the default behavior is not
the best one. But I will submit a patch for that soon, with a hash trait
telling whether the hash is fast or not, as we already discussed.


François

Index: include/bits/hashtable_policy.h
===
--- include/bits/hashtable_policy.h	(revision 195515)
+++ include/bits/hashtable_policy.h	(working copy)
@@ -1,6 +1,6 @@
 // Internal policy header for unordered_set and unordered_map -*- C++ -*-
 
-// Copyright (C) 2010, 2011, 2012 Free Software Foundation, Inc.
+// Copyright (C) 2010-2013 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -202,7 +202,7 @@
   template
 struct _Node_iterator_base
 {
-  typedef _Hash_node<_Value, _Cache_hash_code>	__node_type;
+  using __node_type = _Hash_node<_Value, _Cache_hash_code>;
 
   __node_type*  _M_cur;
 
@@ -282,7 +282,7 @@
 struct _Node_const_iterator
 : public _Node_iterator_base<_Value, __cache>
 {
- private:
+private:
   using __base_type = _Node_iterator_base<_Value, __cache>;
   using __node_type = typename __base_type::__node_type;
 
@@ -941,6 +941,17 @@
 };
 
   /**
+   *  Primary class template _Local_iterator_base.
+   *
+   *  Base class for local iterators, used to iterate within a bucket
+   *  but not between buckets.
+   */
+  template
+struct _Local_iterator_base;
+
+  /**
*  Primary class template _Hash_code_base.
*
*  Encapsulates two policy issues that aren't quite orthogonal.
@@ -974,8 +985,8 @@
   private _Hashtable_ebo_helper<1, _Hash>
 {
 private:
-  typedef _Hashtable_ebo_helper<0, _ExtractKey> 	_EboExtractKey;
-  typedef _Hashtable_ebo_helper<1, _Hash> 		_EboHash;
+  using __ebo_extract_k

Re: [v3] Fix management of non empty hash functor

2013-01-28 Thread Jonathan Wakely
On 28 January 2013 21:08, François Dumont wrote:
>>
>> (Do the performance benchmarks actually tell us anything useful?
>> When I run them I get such varying results it doesn't seem to be
>> reliable.)
>
> Last time I run the tests it was showing when not caching was better than
> caching.

Yes, I've definitely seen real advantage from not caching (but that
was in my own tests, not the performance testsuite.)

> I have even added a bench on the unordered containers directly to
> show what are the performance of default behavior. For the moment, for the
> Foo type used in 54075.cc, the default behavior is not the best one. But I
> will submit a patch for that soon with a hash traits telling if it is fast
> or not, like we already talk about.

Great, thanks.


Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld

2013-01-28 Thread Mike Stump
On Jan 28, 2013, at 7:07 AM, David Edelsohn  wrote:
> Over the weekend, I successfully tested a different way to configure
> and build: all static libraries.

Yeah, I think our build instructions for the dependent libraries should say to 
build them statically.


Re: [Patch, fortran] PR56008 (and PR47517) [F03] wrong code with lhs-realloc on assignment with derived types having allocatable components

2013-01-28 Thread Thomas Koenig

Hi Paul,


This patch is sufficiently straightforward that the ChangeLog entry
describes it completely.  The fix for both bugs lay in the
nullification of the allocatable components of the newly (re)allocated
array.
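
To make the point about nullification concrete, here is a minimal C++ sketch of the same idea outside the Fortran front end. It is an analogy, not the gfortran implementation: Item and grow_and_nullify are hypothetical names, and a null pointer stands in for an unallocated allocatable component. Memory handed back by malloc/realloc is uninitialized, so the component pointers of new elements must be nullified before anything tests or frees them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a derived type with one allocatable component;
// a null pointer plays the role of "not allocated".
struct Item
{
  double *comp;
};

// Grow (or create, when ARR is null and OLD_COUNT is 0) an array of Item
// so that it holds NEW_COUNT elements.  realloc leaves the added tail
// uninitialized, so the component pointers there are nullified.
Item *
grow_and_nullify (Item *arr, std::size_t old_count, std::size_t new_count)
{
  assert (new_count >= old_count && new_count > 0);
  Item *grown
    = static_cast<Item *> (std::realloc (arr, new_count * sizeof (Item)));
  if (grown == nullptr)
    return nullptr;   // The caller still owns ARR on failure.
  for (std::size_t i = old_count; i < new_count; ++i)
    grown[i].comp = nullptr;
  return grown;
}
```

Existing elements keep their components across the call; only the newly added ones are reset.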


I think this fix is OK for trunk, for the reasons you mentioned.  I also
think it is straightforward enough (bordering on the obvious, but only
after having read it :-) that it does not carry too much risk of a
regression.

So yes, OK from my side, unless somebody speaks up really quickly.

Thomas


[PATCH 0/2] Avoid duplicated instrumentation in Address Sanitizer

2013-01-28 Thread Dodji Seketeli
Hello,

As the subject suggests, the little patch-set that follows this
message implements a basic optimization for the Address Sanitizer
pass: in the same basic block, it avoids instrumenting an access to a
memory region, if that same access has been instrumented before.

As we store instrumented accesses to memory regions in a hash table
(that uses the new hash-table.h interface), I found it handy to be
able to define the hash table entries as a type that has obvious
constructors, rather than requiring the user of the hash table entry
to write boilerplate code to do the initialization.  So it was handy
as well to be able to use the new operator to allocate memory for
these entries (rather than using malloc + boilerplate initialization
code).  So I added support for having hash table entries managed
by new/delete in hash-table.h.  That's what the first patch does.

The second patch is where the real meat of the set is.  I deliberately
chose to start with the same conservative (and simple) approach used
by asan@llvm which is to clear the hash table containing the
already-instrumented memory accesses each time we start a new BB or
each time we come across a function call.  It seems like we could be
smarter than that, to allow this optimization to work in inter-BB
cases where there is a dominator relationship between BBs containing
duplicated memory accesses.  But I thought this could be added later,
after 4.8.


Below is the summary of the patches.

  [asan] Allow creating/deleting hash table entries with new/delete
  [asan] Avoid instrumenting duplicated memory access in the same basic block

 gcc/Makefile.in|   3 +-
 gcc/asan.c | 366 ++---
 gcc/hash-table.h   |  16 +
 .../asan/no-redundant-instrumentation-1.c  |  70 
 4 files changed, 409 insertions(+), 46 deletions(-)
 create mode 100644 
gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c



-- 
Dodji


Re: [Patch, Fortran] PR 54107: [4.8 Regression] Memory hog with abstract interface

2013-01-28 Thread Thomas Koenig

Hi Janus,


Or maybe wait for the fix for comment #4?

Rather not (technically it's a separate issue, I guess).


While the patch is rather large, I think it is OK.

One request:  Could you add a comment to gfc_sym_get_dummy_args
explaining what the function does and under which conditions sym->formal
is NULL, while sym->ts.interface->formal isn't?

Regards

Thomas


[PATCH 1/2] [asan] Allow creating/deleting hash table entries with new/delete

2013-01-28 Thread Dodji Seketeli
Hello,

The hash table type can handle creation and removal of entries with
malloc/free.  This patchlet adds support for using new/delete.  It's
useful for hash table entry types that have constructors (and/or
destructors), to prevent the user from having to type boilerplate code
to initialize them over and over again.  This is used by the patch that
follows this one.
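
As a rough, standalone illustration of why a delete-based removal policy matters for entry types with constructors and destructors, consider the sketch below. It does not use GCC's hash-table.h: free_remove_policy, delete_remove_policy, counted_entry and single_slot are all hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstdlib>

// Removal policy that releases an entry with free; no destructor runs.
template <typename Type>
struct free_remove_policy
{
  static void remove (Type *p) { std::free (p); }
};

// Removal policy that releases an entry with delete; the destructor runs.
template <typename Type>
struct delete_remove_policy
{
  static void remove (Type *p) { delete p; }
};

// Hypothetical entry type with a constructor and a destructor.
struct counted_entry
{
  static int live;
  int key;
  explicit counted_entry (int k) : key (k) { ++live; }
  ~counted_entry () { --live; }
};
int counted_entry::live = 0;

// A one-slot "table", just enough to show how the policy is consumed.
template <typename Type, template <typename> class Remove>
struct single_slot
{
  Type *slot = nullptr;

  single_slot () = default;
  single_slot (const single_slot &) = delete;
  single_slot &operator= (const single_slot &) = delete;
  ~single_slot () { clear (); }

  // Take ownership of P, releasing any previous entry first.
  void insert (Type *p) { clear (); slot = p; }

  void clear ()
  {
    if (slot)
      {
        Remove<Type>::remove (slot);
        slot = nullptr;
      }
  }
};
```

Entries created with new and removed through delete_remove_policy get their destructor run; pairing such entries with a free-based policy would skip it.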

gcc/

* hash-table.h (struct typed_delete_remove): New type.
(typed_delete_remove::remove): Implement this using the delete
operator.
---
 gcc/hash-table.h | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 206423d..884840c 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -235,6 +235,22 @@ typed_free_remove <Type>::remove (Type *p)
   free (p);
 }
 
+/* Helpful type for removing entries with the delete operator.  */
+
+template <typename Type>
+struct typed_delete_remove
+{
+  static inline void remove (Type *p);
+};
+
+/* Remove with delete.  */
+
+template <typename Type>
+inline void
+typed_delete_remove <Type>::remove (Type *p)
+{
+  delete p;
+}
 
 /* Helpful type for a no-op remove.  */
 
-- 
1.7.11.7



-- 
Dodji


[PATCH 2/2] [asan] Avoid instrumenting duplicated memory access in the same basic block

2013-01-28 Thread Dodji Seketeli
Hello,

Like what Address Sanitizer does in LLVM, this patch avoids instrumenting
duplicated memory accesses in the same basic block.

The approach taken is very conservative, to keep the pass simple, for
a start.

A memory access is considered to be a triplet made of an expression
tree representing the beginning of the memory region that is accessed,
an expression tree representing the length of that memory region, and
a boolean that says whether that access is a load or a store.
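
A minimal sketch of such a triplet used as a hash table key is shown below. It is only an illustration under simplifying assumptions: plain integers replace the expression trees, and access_key, access_key_hash and access_set are hypothetical names, not the types added to asan.c.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for the (start, length, is_store) triplet.  Plain
// integers take the place of the expression trees used by the real pass.
struct access_key
{
  std::size_t start;
  std::size_t len;
  bool is_store;

  bool operator== (const access_key &o) const
  {
    return start == o.start && len == o.len && is_store == o.is_store;
  }
};

// Combine the three fields into one hash value.
struct access_key_hash
{
  std::size_t operator() (const access_key &k) const
  {
    std::size_t h = std::hash<std::size_t> () (k.start);
    h = h * 31 + std::hash<std::size_t> () (k.len);
    h = h * 31 + (k.is_store ? 1 : 0);
    return h;
  }
};

typedef std::unordered_set<access_key, access_key_hash> access_set;
```

Two accesses compare equal only when all three fields match, so a load and a store of the same region are tracked separately.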

The patch builds a hash table of the memory accesses that have been
instrumented in the current basic block.  Then it walks the gimple
statements of the current basic block.  For each statement, it tests
if the memory regions it references have already been instrumented.
If not, the statement is instrumented and each memory reference that
is actually instrumented is added to the hash table.

When the pass crosses a function call that is not a built-in function
that we ought to instrument, the hash table is cleared, because that
function call could, e.g., free some memory that was instrumented.

Likewise, when a new basic block is visited, the hash table is
cleared.  I guess we could be smarter than just unconditionally
clearing the hash table in this latter case, but this is what asan@llvm
does, and for now, I thought starting in a conservative manner might
have some value.

The hash table is destroyed at the end of the pass.
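
The walk described above can be sketched in a few lines of standalone C++. This is a simplified model rather than the code in asan.c: stmt, basic_block and count_instrumented are hypothetical, integers stand in for trees, and every call is treated as one that clears the table.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical statement model: either a memory access or a call.
struct stmt
{
  bool is_call;        // True for a (non-builtin) function call.
  std::size_t start;   // Start of the accessed region (accesses only).
  std::size_t len;     // Length of the accessed region (accesses only).
  bool is_store;       // Load or store (accesses only).
};

typedef std::vector<stmt> basic_block;

// Return how many accesses would be instrumented across BLOCKS when an
// access already seen in the same block, since the last call, is skipped.
std::size_t
count_instrumented (const std::vector<basic_block> &blocks)
{
  std::set<std::tuple<std::size_t, std::size_t, bool> > seen;
  std::size_t instrumented = 0;
  for (const basic_block &bb : blocks)
    {
      seen.clear ();   // Nothing is carried over between basic blocks.
      for (const stmt &s : bb)
        {
          if (s.is_call)
            {
              seen.clear ();   // The callee may have freed memory.
              continue;
            }
          if (seen.insert (std::make_tuple (s.start, s.len,
                                            s.is_store)).second)
            ++instrumented;
        }
    }
  return instrumented;
}
```

In this model a repeated load in the same block is counted once, but it is counted again after an intervening call or in the next block, which mirrors the conservative clearing policy.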

Bootstrapped and tested against trunk on x86-64-unknown-linux-gnu.

gcc/
* Makefile.in (asan.o): Add new dependency on hash-table.h
* asan.c (struct mem_ref, struct mem_ref_hasher): New types.
(get_mem_ref_hash_table, has_stmt_been_instrumented_p)
(update_mem_ref_hash_table, get_mem_ref_of_assignment): New
functions.
(get_mem_refs_of_builtin_call): Extract from
instrument_builtin_call and tweak a little bit to make it fit with
the new signature.
(instrument_builtin_call): Use the new
get_mem_refs_of_builtin_call.
(maybe_instrument_assignment): Renamed instrument_assignment into
this, and change it to advance the iterator when instrumentation
actually happened and return true in that case.  This makes it
homogeneous with maybe_instrument_call, and thus gives callers a
chance to be more 'regular'.
(transform_statements): Clear the memory reference hash table
whenever we enter a new BB, when we cross a function call, or when
we are done transforming statements.  Use
maybe_instrument_assignment instead of instrument_assignment.  No
more need to special case maybe_instrument_assignment and advance
the iterator after calling it; it's now handled just like
maybe_instrument_call.  Update comment.

gcc/testsuite/

* c-c++-common/asan/no-redundant-instrumentation-1.c: New test.
---
 gcc/Makefile.in|   3 +-
 gcc/asan.c | 368 ++---
 .../asan/no-redundant-instrumentation-1.c  |  70 
 3 files changed, 395 insertions(+), 46 deletions(-)
 create mode 100644 
gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6fe6345..8f7d122 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2226,7 +2226,8 @@ stor-layout.o : stor-layout.c $(CONFIG_H) $(SYSTEM_H) 
coretypes.h $(TM_H) \
 asan.o : asan.c asan.h $(CONFIG_H) $(SYSTEM_H) $(GIMPLE_H) \
output.h coretypes.h $(GIMPLE_PRETTY_PRINT_H) \
tree-iterator.h $(TREE_FLOW_H) $(TREE_PASS_H) \
-   $(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h
+   $(TARGET_H) $(EXPR_H) $(OPTABS_H) $(TM_P_H) langhooks.h \
+   $(HASH_TABLE_H)
 tsan.o : $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(TREE_INLINE_H) \
$(GIMPLE_H) $(DIAGNOSTIC_H) langhooks.h \
$(TM_H) coretypes.h $(TREE_DUMP_H) $(TREE_PASS_H) $(CGRAPH_H) $(GGC_H) \
diff --git a/gcc/asan.c b/gcc/asan.c
index f05e36c..f9a832f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "output.h"
 #include "tm_p.h"
 #include "langhooks.h"
+#include "hash-table.h"
 
 /* AddressSanitizer finds out-of-bounds and use-after-free bugs
with <2x slowdown on average.
@@ -212,6 +213,159 @@ alias_set_type asan_shadow_set = -1;
alias set is used for all shadow memory accesses.  */
 static GTY(()) tree shadow_ptr_types[2];
 
+/* Hashtable support for memory references used by gimple
+   statements.  */
+
+/* This type represents a reference to a memory region.  */
+struct __attribute__ ((visibility ("hidden"))) mem_ref
+{
+  /* The expression of the beginning of the memory region.  */
+  tree start;
+  /* The expression representing the length of the region.  */
+  tree len;
+  /* This is true iff the memory reference is a store.  */
+  bool is_store;
+
+  /* Constructors.  */
+  mem_ref () : start (NULL_TREE), len (NULL_T

Re: [PATCH, regression?] Support --static-libstdc++ with native AIX ld

2013-01-28 Thread David Edelsohn
On Mon, Jan 28, 2013 at 4:17 PM, Mike Stump  wrote:
> On Jan 28, 2013, at 7:07 AM, David Edelsohn  wrote:
>> Over the weekend, I successfully tested a different way to configure
>> and build: all static libraries.
>
> Yeah, I think our build instructions for the dependent libraries should say 
> to build them statically.

A number of GCC developers already do this.

I previously had a problem on AIX with an earlier release of GCC that
was built with Graphite because one of the dependent libraries used
C++, but GCC was not linked with libstdc++ at the time. The only way
to break the C++ dependency of the library separately from GCC was
through shared libraries.

If one can link GCC against static libraries, it definitely simplifies
things and avoids potential conflicts with multiple versions of GCC
installed.

- David


Re: [PATCH] Adding target rdos to GCC

2013-01-28 Thread Leif Ekblad

That looks good. Thanks, Uros.

Leif


- Original Message - 
From: "Uros Bizjak" 

To: "Leif Ekblad" 
Cc: "Richard Biener" ; ; "H.J. 
Lu" ; "Jakub Jelinek" 

Sent: Monday, January 28, 2013 9:45 PM
Subject: Re: [PATCH] Adding target rdos to GCC



On Mon, Jan 28, 2013 at 9:14 PM, Leif Ekblad  wrote:

That is intentional. The gthr-rdos.h file is part of libgcc. My 
intention

was to first patch gcc, then update the patches for newlib, and finally
libgcc. The gthr-rdos.h file would reference include-files part of
newlib,
so this is kind of circular. I also cannot define the thread model for
RDOS
unless I define this file.

I see a couple of possible solutions:
1. Keep as is. You cannot build libgcc at the current stage anyway, and the
bootstrap must be built without threading
2. Add an empty gthr-rdos.h file until libgcc is done
3. Remove the threading-model for now, and add it with libgcc instead.



I propose option 3.

Is it enough to remove gthr.m4 change from the patch in this case?


Yes, for all practical purposes. There is a reference to thread-file in
config.gcc, when threading is enabled, which doesn't work for bootstrapping
the compiler anyway.


Thanks for pointing it out; I have also removed this reference.

Attached is the patch that has been committed to SVN. I have added
missing licence headers to new files and cleaned up whitespace a bit.

2013-01-28  Leif Ekblad  

* config.gcc (i[34567]86-*-rdos*, x86_64-*-rdos*): New targets.
* config/i386/i386.h (TARGET_RDOS): New macro.
(DEFAULT_LARGE_SECTION_THRESHOLD): New macro.
* config/i386/i386.c (ix86_option_override_internal): For 64bit
TARGET_RDOS, set ix86_cmodel to CM_MEDIUM_PIC and flag_pic to 1.
* config/i386/i386.opt (mlarge-data-threshold): Initialize to
DEFAULT_LARGE_SECTION_THRESHOLD.
* config/i386/i386.md (R14_REG, R15_REG): New constants.
* config/i386/rdos.h: New file.
* config/i386/rdos64.h: New file.

Thanks,
Uros.





Re: [PATCH 1/2] [asan] Allow creating/deleting hash table entries with new/delete

2013-01-28 Thread Lawrence Crowl
On 1/28/13, Dodji Seketeli  wrote:
> Hello,
>
> The hash table type can handle creation and removal of entries with
> malloc/free.  This patchlet adds support for using new/delete.  It's
> useful for hash table entry types that have constructors (and/or
> destructors), to prevent the user from having to type boilerplate code
> to initialize them over and over again.  This is used by the patch that
> follows this one.

Looks good to me.

>
> gcc/
>
>   * hash-table.h (struct typed_delete_remove): New type.
>   (typed_delete_remove::remove): Implement this using the delete
>   operator.
> ---
>  gcc/hash-table.h | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> index 206423d..884840c 100644
> --- a/gcc/hash-table.h
> +++ b/gcc/hash-table.h
> @@ -235,6 +235,22 @@ typed_free_remove <Type>::remove (Type *p)
>free (p);
>  }
>
> +/* Helpful type for removing entries with the delete operator.  */
> +
> +template <typename Type>
> +struct typed_delete_remove
> +{
> +  static inline void remove (Type *p);
> +};
> +
> +/* Remove with delete.  */
> +
> +template <typename Type>
> +inline void
> +typed_delete_remove <Type>::remove (Type *p)
> +{
> +  delete p;
> +}
>
>  /* Helpful type for a no-op remove.  */
>
> --
> 1.7.11.7
>
>
>
> --
>   Dodji
>


-- 
Lawrence Crowl


Re: question about section 10.12

2013-01-28 Thread Hans-Peter Nilsson
> From: Kenneth Zadeck 
> Date: Mon, 28 Jan 2013 02:02:41 +0100

> this looks good to me.  does your patch also address the vec_concat 
> issue that marc raised?

You mean the issue being "same thing there"?  I can confirm that
(I've stumbled upon) the same issue being there (i.e. similarly
applies to scalars).  But nope, there's no cross-reference, so
the effective wording needs to be added there too.  I also
noticed the parameter/s misleadingly being keyed to "vec" and
fill-paragraphed the paragraph.  Something like this seems
obvious:

(Oops, noticed gcc@ was in CC, changing to gcc-patches@.)

* doc/rtl.texi (vec_concat, vec_duplicate): Mention that
scalars are valid operands.

Index: doc/rtl.texi
===
--- doc/rtl.texi(revision 195514)
+++ doc/rtl.texi(working copy)
@@ -2627,17 +2627,18 @@ The result mode @var{m} is either the su
 with that element submode (if multiple subparts are selected).
 
 @findex vec_concat
-@item (vec_concat:@var{m} @var{vec1} @var{vec2})
+@item (vec_concat:@var{m} @var{x1} @var{x2})
 Describes a vector concat operation.  The result is a concatenation of the
-vectors @var{vec1} and @var{vec2}; its length is the sum of the lengths of
-the two inputs.
+vectors or scalars @var{x1} and @var{x2}; its length is the sum of the
+lengths of the two inputs.
 
 @findex vec_duplicate
-@item (vec_duplicate:@var{m} @var{vec})
-This operation converts a small vector into a larger one by duplicating the
-input values.  The output vector mode must have the same submodes as the
-input vector mode, and the number of output parts must be an integer multiple
-of the number of input parts.
+@item (vec_duplicate:@var{m} @var{x})
+This operation converts a scalar into a vector or a small vector into a
+larger one by duplicating the input values.  The output vector mode must have
+the same submodes as the input vector mode or the scalar modes, and the
+number of output parts must be an integer multiple of the number of input
+parts.
 
 @end table
 
brgds, H-P
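
As a rough illustration of the revised wording, the following toy model
treats a scalar as a one-element vector.  It is only one reading of the
documentation text, written with plain integers; the names Parts,
model_vec_concat and model_vec_duplicate are invented here and have
nothing to do with GCC's RTL implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: a "vector" is a std::vector<int>; a scalar is modelled as
// a vector with exactly one element.
typedef std::vector<int> Parts;

// vec_concat: the result length is the sum of the input lengths.
inline Parts
model_vec_concat (const Parts &x1, const Parts &x2)
{
  Parts out (x1);
  out.insert (out.end (), x2.begin (), x2.end ());
  return out;
}

// vec_duplicate: repeat the input values until OUT_PARTS elements have
// been produced.  OUT_PARTS must be an integer multiple of the number
// of input parts.
inline Parts
model_vec_duplicate (const Parts &x, std::size_t out_parts)
{
  assert (!x.empty () && out_parts % x.size () == 0);
  Parts out;
  out.reserve (out_parts);
  for (std::size_t i = 0; i < out_parts; ++i)
    out.push_back (x[i % x.size ()]);
  return out;
}
```

In this model, duplicating a scalar to four parts yields four copies of
it, and concatenating a scalar with a four-part vector yields five
parts.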


Re: [SH] PR 56121 - fix libgcc build for SH2A

2013-01-28 Thread Kaz Kojima
Oleg Endo  wrote:
> This is the same patch that I attached in the PR.
> It fixes an ICE when building libgcc for the SH2A target.
> Tested on rev. 195493 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> 
> ... comparing the test results against rev 193342 shows a few new
> failures, but they seem unrelated to this case.
> 
> OK for trunk?

OK.

Regards,
kaz


[4.9 PATCH, alpha]: Switch alpha to LRA

2013-01-28 Thread Uros Bizjak
Hello!

2013-01-28  Uros Bizjak  

* config/alpha/alpha.c (TARGET_LRA_P): New define.

Bootstrapped and regression tested [1] on alphaev68-unknown-linux-gnu.

OK for 4.9?

[1] http://gcc.gnu.org/ml/gcc-testresults/2013-01/msg02998.html

Uros.

Index: config/alpha/alpha.c
===
--- config/alpha/alpha.c(revision 195502)
+++ config/alpha/alpha.c(working copy)
@@ -9872,6 +9872,9 @@
 #undef TARGET_LEGITIMATE_ADDRESS_P
 #define TARGET_LEGITIMATE_ADDRESS_P alpha_legitimate_address_p

+#undef TARGET_LRA_P
+#define TARGET_LRA_P hook_bool_void_true
+
 #undef TARGET_CONDITIONAL_REGISTER_USAGE
 #define TARGET_CONDITIONAL_REGISTER_USAGE alpha_conditional_register_usage


Re: [4.9 PATCH, alpha]: Switch alpha to LRA

2013-01-28 Thread Richard Henderson

On 01/28/2013 03:14 PM, Uros Bizjak wrote:

2013-01-28  Uros Bizjak

* config/alpha/alpha.c (TARGET_LRA_P): New define.

Bootstrapped and regression tested [1] on alphaev68-unknown-linux-gnu.

OK for 4.9?



Yep.


r~


gccgo patch committed: Fix initialization order bug

2013-01-28 Thread Ian Lance Taylor
This patch to the Go frontend fixes a bug determining the initialization
order.  I've committed a test case for the bug to the master Go
repository:

https://code.google.com/p/go/source/detail?spec=svn921e53d4863c8827756c0e7b228ab210441e4032&r=c3155f9f1bb64c8b3adb2b6f5527895d51f83b74

The old algorithm was overly clever and frankly I don't know what I was
thinking.  The new algorithm is simpler and probably more efficient.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian
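
The caching idiom the patch relies on (insert a placeholder, then fill
it in only if the key was new) can be sketched in isolation as below.
This is a simplified illustration, not the frontend code: Key, Cache,
cached_requires and Counting_requires are names assumed for the
example, and the predicate merely stands in for an expensive check such
as expression_requires.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Ordered pair of object identities, keyed by address rather than name.
typedef std::pair<const void*, const void*> Key;
typedef std::map<Key, bool> Cache;

// Return whether A requires B, calling COMPUTE at most once for each
// ordered pair (A, B).
template <typename Compute>
bool
cached_requires (Cache *cache, const void *a, const void *b,
                 Compute compute)
{
  assert (cache != 0);
  // insert () leaves an existing entry untouched and reports through
  // the bool whether a new entry was created.
  std::pair<Cache::iterator, bool> ins =
    cache->insert (std::make_pair (Key (a, b), false));
  if (ins.second)
    ins.first->second = compute (a, b);
  return ins.first->second;
}

// Hypothetical expensive predicate.  It counts its calls so that the
// effect of the cache can be observed.
struct Counting_requires
{
  int *calls;
  explicit Counting_requires (int *c) : calls (c) {}
  bool operator() (const void *a, const void *b) const
  {
    ++*calls;
    return a != b;
  }
};
```

Asking twice about the same ordered pair invokes the predicate once;
the reversed pair is a distinct key, which matches how the cycle check
queries both directions.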

diff -r ee18ff1199b6 go/expressions.h
--- a/go/expressions.h	Fri Jan 25 16:13:13 2013 -0800
+++ b/go/expressions.h	Mon Jan 28 16:23:26 2013 -0800
@@ -983,6 +983,11 @@
   statement_(statement), is_lvalue_(false)
   { }
 
+  // The temporary that this expression refers to.
+  Temporary_statement*
+  statement() const
+  { return this->statement_; }
+
   // Indicate that this reference appears on the left hand side of an
   // assignment statement.
   void
diff -r ee18ff1199b6 go/gogo-tree.cc
--- a/go/gogo-tree.cc	Fri Jan 25 16:13:13 2013 -0800
+++ b/go/gogo-tree.cc	Mon Jan 28 16:23:26 2013 -0800
@@ -499,7 +499,7 @@
   // A hash table we use to avoid looping.  The index is the name of a
   // named object.  We only look through objects defined in this
   // package.
-  typedef Unordered_set(std::string) Seen_objects;
+  typedef Unordered_set(const void*) Seen_objects;
 
   Find_var(Named_object* var, Seen_objects* seen_objects)
 : Traverse(traverse_expressions),
@@ -547,7 +547,7 @@
 	  if (init != NULL)
 	{
 	  std::pair<Seen_objects::iterator, bool> ins =
-		this->seen_objects_->insert(v->name());
+		this->seen_objects_->insert(v);
 	  if (ins.second)
 		{
 		  // This is the first time we have seen this name.
@@ -568,7 +568,7 @@
   if (f->is_function() && f->package() == NULL)
 	{
 	  std::pair<Seen_objects::iterator, bool> ins =
-	this->seen_objects_->insert(f->name());
+	this->seen_objects_->insert(f);
 	  if (ins.second)
 	{
 	  // This is the first time we have seen this name.
@@ -578,6 +578,25 @@
 	}
 }
 
+  Temporary_reference_expression* tre = e->temporary_reference_expression();
+  if (tre != NULL)
+{
+  Temporary_statement* ts = tre->statement();
+  Expression* init = ts->init();
+  if (init != NULL)
+	{
+	  std::pair<Seen_objects::iterator, bool> ins =
+	this->seen_objects_->insert(ts);
+	  if (ins.second)
+	{
+	  // This is the first time we have seen this temporary
+	  // statement.
+	  if (Expression::traverse(&init, this) == TRAVERSE_EXIT)
+		return TRAVERSE_EXIT;
+	}
+	}
+}
+
   return TRAVERSE_CONTINUE;
 }
 
@@ -613,11 +632,11 @@
 {
  public:
   Var_init()
-: var_(NULL), init_(NULL_TREE), waiting_(0)
+: var_(NULL), init_(NULL_TREE)
   { }
 
   Var_init(Named_object* var, tree init)
-: var_(var), init_(init), waiting_(0)
+: var_(var), init_(init)
   { }
 
   // Return the variable.
@@ -630,24 +649,11 @@
   init() const
   { return this->init_; }
 
-  // Return the number of variables waiting for this one to be
-  // initialized.
-  size_t
-  waiting() const
-  { return this->waiting_; }
-
-  // Increment the number waiting.
-  void
-  increment_waiting()
-  { ++this->waiting_; }
-
  private:
   // The variable being initialized.
   Named_object* var_;
   // The initialization expression to run.
   tree init_;
-  // The number of variables which are waiting for this one.
-  size_t waiting_;
 };
 
 typedef std::list<Var_init> Var_inits;
@@ -660,6 +666,10 @@
 static void
 sort_var_inits(Gogo* gogo, Var_inits* var_inits)
 {
+  typedef std::pair<Named_object*, Named_object*> No_no;
+  typedef std::map<No_no, bool> Cache;
+  Cache cache;
+
   Var_inits ready;
   while (!var_inits->empty())
 {
@@ -670,23 +680,30 @@
   Named_object* dep = gogo->var_depends_on(var->var_value());
 
   // Start walking through the list to see which variables VAR
-  // needs to wait for.  We can skip P1->WAITING variables--that
-  // is the number we've already checked.
+  // needs to wait for.
   Var_inits::iterator p2 = p1;
   ++p2;
-  for (size_t i = p1->waiting(); i > 0; --i)
-	++p2;
 
   for (; p2 != var_inits->end(); ++p2)
 	{
 	  Named_object* p2var = p2->var();
-	  if (expression_requires(init, preinit, dep, p2var))
+	  No_no key(var, p2var);
+	  std::pair<Cache::iterator, bool> ins =
+	cache.insert(std::make_pair(key, false));
+	  if (ins.second)
+	ins.first->second = expression_requires(init, preinit, dep, p2var);
+	  if (ins.first->second)
 	{
 	  // Check for cycles.
-	  if (expression_requires(p2var->var_value()->init(),
+	  key = std::make_pair(p2var, var);
+	  ins = cache.insert(std::make_pair(key, false));
+	  if (ins.second)
+		ins.first->second =
+		  expression_requires(p2var->var_value()->init(),
   p2var->var_value()->preinit(),
   gogo->var_depends_on(p2var->var_value()),
-  var))
+  var);
+	  if (ins.first->second)
 		{
 		  error_at(var->location(),
 			   ("initialization expressions for %qs and "
@@ -700,12 +717,8 @@
 	  else
 		{
 		  /

Re: [4.9 PATCH, alpha]: Switch alpha to LRA

2013-01-28 Thread Jeff Law

On 01/28/2013 04:14 PM, Uros Bizjak wrote:

Hello!

2013-01-28  Uros Bizjak  

* config/alpha/alpha.c (TARGET_LRA_P): New define.

Bootstrapped and regression tested [1] on alphaev68-unknown-linux-gnu.

OK for 4.9?

[1] http://gcc.gnu.org/ml/gcc-testresults/2013-01/msg02998.html

Can you attach this to PR 55996, the 4.9 pending patches metabug?

Thanks,
jeff



Re: [Patch,avr] Remove fixed-point MUL and DIV routines from libgcc build

2013-01-28 Thread Denis Chertykov
2013/1/28 Georg-Johann Lay :
> This removes modules from libgcc that are already supported by avr-specific
> fixed-point implementation and avoids duplicate functions like __mulsa3.
>
> Ok for trunk?
>
> Johann
>
>
> libgcc/
> * config/avr/t-avr (LIB2FUNCS_EXCLUDE): Add:
> _mulQQ,  _mulHQ,  _mulHA,  _mulSA,
> _mulUQQ, _mulUHQ, _mulUHA, _mulUSA,
> _divQQ,  _divHQ,  _divHA,  _divSA,
> _divUQQ, _divUHQ, _divUHA, _divUSA.

Approved.

Denis.


Ping: [Patch] PR56064: Fold VIEW_CONVERT_EXPR with FIXED_CST

2013-01-28 Thread Georg-Johann Lay

Ping #1 for

http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01053.html

Release Manager approval is here:

http://gcc.gnu.org/ml/gcc/2013-01/msg00222.html


This is a tentative patch as discussed in

http://gcc.gnu.org/ml/gcc/2013-01/msg00187.html

fold-const.c gets two new functions, native_encode_fixed and
native_interpret_fixed.  Code common with the integer case is factored out and
moved to the new constructor-like function double_int::from_buffer.

The code bootstraps fine on x86-linux-gnu and I have test coverage from
avr-unknown-none.

Ok to apply?

There are less intrusive solutions that only handle the int <-> fixed cases,
for example fold-const.c:fold_view_convert_expr() could test for these cases
and use double_int directly without serializing / deserializing through a
memory buffer.

Johann
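
To make the "serialize through a memory buffer" route concrete, here is
a small standalone sketch of encoding an integer payload into bytes and
reading it back.  It assumes little-endian byte order and a payload of
at most 8 bytes; encode_le and decode_le are names invented for this
example and only loosely correspond to native_encode_int and
double_int::from_buffer, which also have to deal with target byte and
word order.

```cpp
#include <cassert>
#include <stdint.h>

// Write the low LEN bytes of VALUE into PTR, least significant byte
// first.  Return the number of bytes written, or 0 if LEN is out of
// range (mirroring the "zero upon failure" convention).
inline int
encode_le (uint64_t value, unsigned char *ptr, int len)
{
  if (len <= 0 || len > 8)
    return 0;
  for (int i = 0; i < len; ++i)
    ptr[i] = (unsigned char) ((value >> (8 * i)) & 0xff);
  return len;
}

// Rebuild a value from LEN bytes at PTR, least significant byte first.
// The bits are taken verbatim; no scaling or sign handling is applied.
inline uint64_t
decode_le (const unsigned char *ptr, int len)
{
  assert (len > 0 && len <= 8);
  uint64_t result = 0;
  for (int i = 0; i < len; ++i)
    result |= (uint64_t) ptr[i] << (8 * i);
  return result;
}
```

Since the payload bits pass through unchanged, a fixed-point constant
and an integer of the same width can share such a buffer, which is the
property a VIEW_CONVERT_EXPR fold depends on.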


PR tree-optimization/56064
* fixed-value.c (const_fixed_from_double_int): New function.
* fixed-value.h (const_fixed_from_double_int): New prototype.
* fold-const.c (native_interpret_fixed): New static function.
(native_interpret_expr) <FIXED_POINT_TYPE>: Use it.
(can_native_interpret_type_p) <FIXED_POINT_TYPE>: Return true.
(native_encode_fixed): New static function.
(native_encode_expr) <FIXED_CST>: Use it.
(native_interpret_int): Move double_int worker code to...
* double-int.c (double_int::from_buffer): ...this new static method.
* double-int.h (double_int::from_buffer): Prototype it.

testsuite/
PR tree-optimization/56064
* gcc.dg/fixed-point/view-convert.c: New test.



Index: fixed-value.c
===
--- fixed-value.c   (revision 195301)
+++ fixed-value.c   (working copy)
@@ -81,6 +81,24 @@ check_real_for_fixed_mode (REAL_VALUE_TY
   return FIXED_OK;
 }
 
+
+/* Construct a CONST_FIXED from a bit payload and machine mode MODE.
+   The bits in PAYLOAD are used verbatim.  */
+
+FIXED_VALUE_TYPE
+const_fixed_from_double_int (double_int payload, enum machine_mode mode)
+{
+  FIXED_VALUE_TYPE value;
+
+  gcc_assert (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_DOUBLE_INT);
+
+  value.data = payload;
+  value.mode = mode;
+
+  return value;
+}
+
+
 /* Initialize from a decimal or hexadecimal string.  */
 
 void

Index: fixed-value.h
===
--- fixed-value.h   (revision 195301)
+++ fixed-value.h   (working copy)
@@ -49,6 +49,11 @@ extern FIXED_VALUE_TYPE fconst1[MAX_FCON
   const_fixed_from_fixed_value (r, m)
 extern rtx const_fixed_from_fixed_value (FIXED_VALUE_TYPE, enum machine_mode);
 
+/* Construct a CONST_FIXED from a bit payload and machine mode MODE.
+   The bits in PAYLOAD are used verbatim.  */
+extern FIXED_VALUE_TYPE const_fixed_from_double_int (double_int,
+enum machine_mode);
+
 /* Initialize from a decimal or hexadecimal string.  */
 extern void fixed_from_string (FIXED_VALUE_TYPE *, const char *,
   enum machine_mode);
Index: fold-const.c
===
--- fold-const.c(revision 195301)
+++ fold-const.c(working copy)
@@ -7200,6 +7200,36 @@ native_encode_int (const_tree expr, unsi
 }
 
 
+/* Subroutine of native_encode_expr.  Encode the FIXED_CST
+   specified by EXPR into the buffer PTR of length LEN bytes.
+   Return the number of bytes placed in the buffer, or zero
+   upon failure.  */
+
+static int
+native_encode_fixed (const_tree expr, unsigned char *ptr, int len)
+{
+  tree type = TREE_TYPE (expr);
+  enum machine_mode mode = TYPE_MODE (type);
+  int total_bytes = GET_MODE_SIZE (mode);
+  FIXED_VALUE_TYPE value;
+  tree i_value, i_type;
+
+  if (total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
+return 0;
+
+  i_type = lang_hooks.types.type_for_size (GET_MODE_BITSIZE (mode), 1);
+
+  if (NULL_TREE == i_type
+  || TYPE_PRECISION (i_type) != total_bytes)
+return 0;
+  
+  value = TREE_FIXED_CST (expr);
+  i_value = double_int_to_tree (i_type, value.data);
+
+  return native_encode_int (i_value, ptr, len);
+}
+
+
 /* Subroutine of native_encode_expr.  Encode the REAL_CST
specified by EXPR into the buffer PTR of length LEN bytes.
Return the number of bytes placed in the buffer, or zero
@@ -7345,6 +7375,9 @@ native_encode_expr (const_tree expr, uns
 case REAL_CST:
   return native_encode_real (expr, ptr, len);
 
+case FIXED_CST:
+  return native_encode_fixed (expr, ptr, len);
+
 case COMPLEX_CST:
   return native_encode_complex (expr, ptr, len);
 
@@ -7368,44 +7401,37 @@ static tree
 native_interpret_int (tree type, const unsigned char *ptr, int len)
 {
   int total_bytes = GET_MODE_SIZE (TYPE_MODE (type));
-  int byte, offset, word, words;
-  unsigned char value;
   double_int result;
 
-  if (total_bytes > len)
-return NULL_TREE;
-  if (total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
+  if (total_b