Re: [GCC 4.8 wwwdocs] PATCH: Mention several user-visible changes for x86

2013-02-20 Thread Kirill Yukhin
Hi,
Checked in

Thanks, K

On Mon, Feb 18, 2013 at 10:28 AM, Igor Zamyatin  wrote:
> Gerald,
>
> Thanks a lot for your remarks!
>
> Below is the updated patch, which will be checked in.
>
>
> Thanks,
> Igor
>
> On Mon, Feb 18, 2013 at 3:07 AM, Gerald Pfeifer  wrote:
>> On Fri, 15 Feb 2013, Igor Zamyatin wrote:
>>> Is it ok for wwwdocs?
>>
>> Index: htdocs/gcc-4.8/changes.html
>> ===
>> + Support for the new Intel processor codename Broadwell with RDSEED,
>> + ADCX, ADOX, PREFETCHW is available through -madx,
>> + -mprfchw, -mrdseed.
>>
>> Can you make this <code>RDSEED</code>, ... and so forth?
>> (This is a bit borderline, in that one could also see this as
>> more general references, but usually we mark those up.)
>>
>> And "...through the ... command-line options."?
>>
>> +  Support for Intel RTM and HLE intrinsics, built-in
>>
>> "the ... intrinsics" (and same below for "instruction sets")
>>
>> +  x86 backend was improved to allow option
>> -fschedule-insns to work reliably.
>>
>> "The x86 backend has been improved..."
>>
>> + This option can be used to schedule instructions better and can
>> lead to improved performance in certain cases.
>>
>> This line is quite long, can you break lines around 76 columns?
>>
>> And, let's be a bit more brave and omit either "can" or "in certain
>> cases".  Otherwise this may sound too unlikely. :-)
>>
>> The patch is fine with changes along the lines described above.
>>
>> Thanks,
>> Gerald
>
> Index: htdocs/gcc-4.8/changes.html
>   ===
>   RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
>   retrieving revision 1.95
>   diff -c -r1.95 changes.html
>   *** htdocs/gcc-4.8/changes.html 11 Feb 2013 15:12:58 -  1.95
>   --- htdocs/gcc-4.8/changes.html 12 Feb 2013 15:10:41 -
>   ***
>   *** 460,465 
>   --- 460,471 
> wrong results.  You must build all
> modules with -mpreferred-stack-boundary=3, including any
> libraries.  This includes the system libraries and startup 
> modules.
>   + Support for the new Intel processor codename Broadwell with
>   + RDSEED, ADCX, ADOX,
>   + PREFETCHW is available through -madx,
>   + -mprfchw, -mrdseed command-line options.
>   + 
>   +  Support for the Intel RTM and HLE intrinsics, built-in
>   functions and code generation is available via -mrtm and
>   -mhle.
>   + 
>   +  Support for the Intel FXSR, XSAVE and XSAVEOPT instruction
>  sets. Intrinsics and built-in functions are available
>   + via -mfxsr, -mxsave and
>  -mxsaveopt respectively.
>   + 
>  New built-in functions to detect run-time CPU type and ISA:
> 
>   A built-in function __builtin_cpu_is has
> been added to
>   ***
>   *** 524,529 
>   --- 530,538 
> <a href="http://gcc.gnu.org/wiki/FunctionMultiVersioning">wiki</a>
>   for more
> information.
> 
>   +  The x86 backend has been improved to allow option
>  -fschedule-insns to work reliably.
>   + This option can be used to schedule instructions better and leads to
>   + improved performance in certain cases.
>   + 
>  Windows MinGW-w64 targets (*-w64-mingw*)
>   require at least r5437 from the Mingw-w64 trunk. 
>   


Re: [PATCH, x86, AVX2] FP reassociation enabling for AVX2 targets

2013-02-20 Thread Kirill Yukhin
> OK (it is a tuning patch).
>
Hi,
Checked in: http://gcc.gnu.org/ml/gcc-cvs/2013-02/msg00540.html

Thanks, K


Re: Speedup recognizing multi-letter constraints

2013-02-20 Thread Richard Biener
On Tue, Feb 19, 2013 at 4:10 PM, Michael Matz  wrote:
> Hi,
>
> from IRC:
> "[15:45:21]  ick - lookup_constraint for multi-letter constraints
> is quite expensive ... strncmp is not expanded inline for some reason"
>
> Instead of fiddling with strncmp inlining, simply generate better code
> from the start for two character constraints:
>
>   switch (str[0]) {
> case 'Y':
>   switch (str[1])
> {
> case 'i':
>   return CONSTRAINT_Yi;
> case 'm':
>   return CONSTRAINT_Ym;
>
> ...

Bootstrapped and tested on ... ?

I suppose this is ok, even on top of my recent improvement (which
we noticed can be improved further by using memcmp instead of
strncmp).  We seem to have at most seven-letter constraints at the moment
(rx port - they seem to use descriptive constraint names like "NEGint4" and
"Symbol" with only 10 constraints in total ...).  I wonder where the cut-off
is for expanding the whole comparison to nested switch statements ...
or even expand the 2nd level to a switch on properly masked short /
int / long compares.

Richard.

>
> Ciao,
> Michael.
> --
> * genpreds (write_lookup_constraint): Special case two-character
> constraints to also expand to a switch.
>
> Index: genpreds.c
> ===
> --- genpreds.c  (revision 196053)
> +++ genpreds.c  (working copy)
> @@ -941,6 +941,22 @@ write_lookup_constraint (void)
>printf ("case '%c':\n", i);
>if (c->namelen == 1)
> printf ("  return CONSTRAINT_%s;\n", c->c_name);
> +  else if (c->namelen == 2)
> +   {
> + puts ("  switch (str[1])\n"
> +   "{");
> + do
> +   {
> + printf ("case '%c':\n"
> + "  return CONSTRAINT_%s;\n",
> + c->name[1], c->c_name);
> + c = c->next_this_letter;
> +   }
> + while (c);
> + puts ("default: break;\n"
> +   "}\n"
> +   "  break;");
> +   }
>else
> {
>   do


Re: [PATCH] Fix ccp (PR tree-optimization/56396)

2013-02-20 Thread Richard Biener
On Tue, 19 Feb 2013, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase gcc ICEs because malloc memory is corrupted.
> The problem is that const_val array is allocated at the start of the pass,
> but during the execution of ccp some new SSA_NAMEs are created
> (update_call_from_tree if I remember well).
> 
> The following patch fixes that by making const_val a vector instead of
> array, and growing it when needed.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ick.  This happens from calling fold_stmt during substitute_and_fold,
a place where growing the lattice isn't useful.  So I'd prefer
the same fix as present in VRP for this issue: simply return VARYING
for out-of-bounds accesses.  The same issue is latent in copyprop.

I'm testing the following.

Richard.

2013-02-20  Richard Biener  
Jakub Jelinek  

PR tree-optimization/56396
* tree-ssa-ccp.c (n_const_val): New static variable.
(get_value): Return NULL for SSA names we don't have a lattice
entry for.
(ccp_initialize): Initialize n_const_val.
* tree-ssa-copy.c (n_copy_of): New static variable.
(init_copy_prop): Initialize n_copy_of.
(get_value): Return NULL_TREE for SSA names we don't have a
lattice entry for.

* gcc.dg/pr56396.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===
*** gcc/tree-ssa-ccp.c  (revision 196167)
--- gcc/tree-ssa-ccp.c  (working copy)
*** typedef struct prop_value_d prop_value_t
*** 162,167 
--- 162,168 
 memory reference used to store (i.e., the LHS of the assignment
 doing the store).  */
  static prop_value_t *const_val;
+ static unsigned n_const_val;
  
  static void canonicalize_float_value (prop_value_t *);
  static bool ccp_fold_stmt (gimple_stmt_iterator *);
*** get_value (tree var)
*** 295,301 
  {
prop_value_t *val;
  
!   if (const_val == NULL)
  return NULL;
  
val = &const_val[SSA_NAME_VERSION (var)];
--- 296,303 
  {
prop_value_t *val;
  
!   if (const_val == NULL
!   || SSA_NAME_VERSION (var) >= n_const_val)
  return NULL;
  
val = &const_val[SSA_NAME_VERSION (var)];
*** ccp_initialize (void)
*** 713,719 
  {
basic_block bb;
  
!   const_val = XCNEWVEC (prop_value_t, num_ssa_names);
  
/* Initialize simulation flags for PHI nodes and statements.  */
FOR_EACH_BB (bb)
--- 715,722 
  {
basic_block bb;
  
!   n_const_val = num_ssa_names;
!   const_val = XCNEWVEC (prop_value_t, n_const_val);
  
/* Initialize simulation flags for PHI nodes and statements.  */
FOR_EACH_BB (bb)
Index: gcc/tree-ssa-copy.c
===
*** gcc/tree-ssa-copy.c (revision 196167)
--- gcc/tree-ssa-copy.c (working copy)
*** struct prop_value_d {
*** 280,285 
--- 280,286 
  typedef struct prop_value_d prop_value_t;
  
  static prop_value_t *copy_of;
+ static unsigned n_copy_of;
  
  
  /* Return true if this statement may generate a useful copy.  */
*** init_copy_prop (void)
*** 664,670 
  {
basic_block bb;
  
!   copy_of = XCNEWVEC (prop_value_t, num_ssa_names);
  
FOR_EACH_BB (bb)
  {
--- 665,672 
  {
basic_block bb;
  
!   n_copy_of = num_ssa_names;
!   copy_of = XCNEWVEC (prop_value_t, n_copy_of);
  
FOR_EACH_BB (bb)
  {
*** init_copy_prop (void)
*** 728,734 
  static tree
  get_value (tree name)
  {
!   tree val = copy_of[SSA_NAME_VERSION (name)].value;
if (val && val != name)
  return val;
return NULL_TREE;
--- 730,739 
  static tree
  get_value (tree name)
  {
!   tree val;
!   if (SSA_NAME_VERSION (name) >= n_copy_of)
! return NULL_TREE;
!   val = copy_of[SSA_NAME_VERSION (name)].value;
if (val && val != name)
  return val;
return NULL_TREE;
Index: gcc/testsuite/gcc.dg/pr56396.c
===
*** gcc/testsuite/gcc.dg/pr56396.c  (revision 0)
--- gcc/testsuite/gcc.dg/pr56396.c  (working copy)
***
*** 0 
--- 1,22 
+ /* PR tree-optimization/56396 */
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -fpic -g" } */
+ 
+ struct S { char *s; int z; };
+ struct T { int t; } *c, u;
+ void bar (int, const char *);
+ 
+ inline void *
+ foo (void *x, char *y, int z)
+ {
+   struct S s;
+   char b[256];
+   s.s = b;
+   s.z = __builtin___sprintf_chk (s.s, 1, __builtin_object_size (s.s, 2), "Require");
+   if (s.z < 0)
+ bar (u.t | c->t, "rls");
+   if (foo (x, s.s, s.z))
+ {
+ }
+   return (void *) 0;
+ }


Re: Version specific onlinedocs page

2013-02-20 Thread Tobias Burnus

Gerald Pfeifer wrote:

On Mon, 18 Feb 2013, Tobias Burnus wrote:

How about the following patch? If it is okay, I will add the remaining
index.html for 4.6 and 4.7 and update gcc-4.6/index.html – and then
commit it.


Looks good to me.

Perhaps you can add a note for the release manager how to "create"
index.html (copy from previous release and adjust)?


DONE.



Tobias

Index: releasing.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/releasing.html,v
retrieving revision 1.39
diff -u -p -r1.39 releasing.html
--- releasing.html	23 Mar 2012 08:44:11 -	1.39
+++ releasing.html	20 Feb 2013 09:40:51 -
@@ -75,7 +75,9 @@ to generate the documentation would be <
 -rgcc_3_0_2_release -dgcc-3.0.2 (with the current version
 number inserted).  Link to it from onlinedocs/index.html
 (but don't break URLs to documentation for previous releases even if
-you remove the links to it).
+you remove the links to it).  Additionally, create
+onlinedocs/version-number/index.html by copying it
+from a previous release and adjusting it.
 
 Update the online-documentation links in changes.html
 to point to the online-documentation for the branch.
Index: gcc-4.6/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/index.html,v
retrieving revision 1.5
diff -u -p -r1.5 index.html
--- gcc-4.6/index.html	1 Mar 2012 11:43:25 -	1.5
+++ gcc-4.6/index.html	20 Feb 2013 09:40:51 -
@@ -22,22 +22,26 @@ in GCC 4.6.2 relative to previous releas
 
 GCC 4.6.3
 March 1, 2012
-(changes)
+(changes,
+ <a href="http://gcc.gnu.org/onlinedocs/4.6.3/">documentation</a>)
 
 
 GCC 4.6.2
 October 26, 2011
-(changes)
+(changes,
+ <a href="http://gcc.gnu.org/onlinedocs/4.6.2/">documentation</a>)
 
 
 GCC 4.6.1
 June 27, 2011
-(changes)
+(changes,
+ <a href="http://gcc.gnu.org/onlinedocs/4.6.1/">documentation</a>)
 
 
 GCC 4.6.0
 March 25, 2011
-(changes)
+(changes,
+ <a href="http://gcc.gnu.org/onlinedocs/4.6.0/">documentation</a>)
 
 
 
Index: gcc-4.7/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/index.html,v
retrieving revision 1.4
diff -u -p -r1.4 index.html
--- gcc-4.7/index.html	20 Sep 2012 06:31:19 -	1.4
+++ gcc-4.7/index.html	20 Feb 2013 09:40:51 -
@@ -22,17 +22,20 @@ in GCC 4.7.1 relative to previous releas
 
 GCC 4.7.2
 September 20, 2012
-(changes)
+(changes,
+ <a href="http://gcc.gnu.org/onlinedocs/4.7.2/">documentation</a>)
 
 
 GCC 4.7.1
 June 14, 2012
-(changes)
+(changes,
+ <a href="http://gcc.gnu.org/onlinedocs/4.7.1/">documentation</a>)
 
 
 GCC 4.7.0
 March 22, 2012
-(changes)
+(changes,
+ <a href="http://gcc.gnu.org/onlinedocs/4.7.0/">documentation</a>)
 
 
 
Index: onlinedocs/4.6.0/index.html
===
RCS file: onlinedocs/4.6.0/index.html
diff -N onlinedocs/4.6.0/index.html
--- /dev/null	1 Jan 1970 00:00:00 -
+++ onlinedocs/4.6.0/index.html	20 Feb 2013 09:40:51 -
@@ -0,0 +1,91 @@
+
+
+
+GCC 4.6.0 manuals
+
+
+
+
+4.6.0 manuals
+  
+http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc/";>GCC
+ 4.6.0 Manual (http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc.pdf";>also
+ in PDF or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc.ps.gz";>PostScript or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc-html.tar.gz";>an
+ HTML tarball)
+http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gfortran/";>GCC
+ 4.6.0 GNU Fortran Manual (http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gfortran.pdf";>also
+ in PDF or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gfortran.ps.gz";>PostScript or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gfortran-html.tar.gz";>an
+ HTML tarball)
+http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcj/";>GCC
+ 4.6.0 GCJ Manual (http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcj.pdf";>also
+ in PDF or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcj.ps.gz";>PostScript or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcj-html.tar.gz";>an
+ HTML tarball)
+http://gcc.gnu.org/onlinedocs/gcc-4.6.0/cpp/";>GCC 
+ 4.6.0 CPP Manual (http://gcc.gnu.org/onlinedocs/gcc-4.6.0/cpp.pdf";>also
+ in PDF or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/cpp.ps.gz";>PostScript or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/cpp-html.tar.gz";>an
+ HTML tarball)
+http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gnat_rm/";>GCC
+ 4.6.0 GNAT Reference Manual (http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gnat_rm.pdf";>also
+ in PDF or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gnat_rm.ps.gz";>PostScript or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gnat_rm-html.tar.gz";>an
+ HTML tarball)
+http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gnat_ugn_unw/";>GCC
+ 4.6.0 GNAT User's Guide (http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gnat_ugn_unw.pdf";>also
+ in PDF or http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gnat_ugn_unw.ps.gz";>PostScript or ht

Re: [PATCH] Fix ccp (PR tree-optimization/56396)

2013-02-20 Thread Jakub Jelinek
On Wed, Feb 20, 2013 at 10:52:52AM +0100, Richard Biener wrote:
> *** get_value (tree var)
> *** 295,301 
>   {
> prop_value_t *val;
>   
> !   if (const_val == NULL)
>   return NULL;
>   
> val = &const_val[SSA_NAME_VERSION (var)];
> --- 296,303 
>   {
> prop_value_t *val;
>   
> !   if (const_val == NULL
> !   || SSA_NAME_VERSION (var) >= n_const_val)
>   return NULL;

You could just use
  if (SSA_NAME_VERSION (var) >= n_const_val)

test here if upon free (const_val); you'd set n_const_val back to 0.

Jakub
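
The single-check variant suggested here can be sketched as follows.  This is a minimal illustration, not the tree-ssa-ccp.c code: the type prop_value_sketch and the functions lattice_init, lattice_fini and get_value_sketch are hypothetical stand-ins, and plain calloc/free replace XCNEWVEC.

```c
#include <stdlib.h>

/* Hypothetical stand-in for prop_value_t.  */
typedef struct
{
  int lattice_val;
} prop_value_sketch;

static prop_value_sketch *const_val;
static unsigned n_const_val;

/* Allocate one lattice entry per SSA name known at pass start.  */
void
lattice_init (unsigned num_names)
{
  const_val = calloc (num_names, sizeof *const_val);
  n_const_val = const_val ? num_names : 0;
}

/* Release the lattice and reset the count so that the bounds check
   alone rejects every later lookup.  */
void
lattice_fini (void)
{
  free (const_val);
  const_val = NULL;
  n_const_val = 0;
}

/* Return the entry for VERSION, or NULL for names created after
   lattice_init or for lookups after lattice_fini.  */
prop_value_sketch *
get_value_sketch (unsigned version)
{
  if (version >= n_const_val)
    return NULL;
  return &const_val[version];
}
```

Because the count is zero whenever the array is absent, the separate NULL test on the array pointer becomes redundant.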


[PATCH][RFC] Less TODO_remove_unused_locals

2013-02-20 Thread Richard Biener

Hunting for the "we're getting slower" bits I noticed that
TODO_remove_unused_locals is a big part of execute_function_todo
(and accounts for 1% of compile-time of ac.f90).
The following patch removes most of the remove_unused_locals
calls based on the fact that with anonymous SSA names now available
we should never create new locals (wishful thinking of course ...)
and the important places to remove unused stuff are driven by
1) avoid creating yet another copy of the unused stuff, thus do
it before inlining, on the callee;  2) avoid pinning unused memory
while we operate on other function bodies, thus, do it at the end
of non-IPA pass pipelines.

In the end this asks for more explicit placement and thus a
real pass ... but the following should be enough as an RFC and
be good enough for 4.8.

We keep doing remove_unused_locals after going into SSA
and now after releasing unused SSA names (that should suffice
to achieve 1) for early and IPA inlining and 2) pre-IPA).
Post-IPA we perform it after the IPA inline transform and right
before expanding to RTL (to avoid expanding unused stack vars).

I'm scheduling a bootstrap & regtest.

Ok for trunk?

Thanks,
Richard.

2013-02-20  Richard Biener  

* tree-call-cdce.c (tree_call_cdce): Do not remove unused locals.
* tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Likewise.
* tree-ssa-dce.c (perform_tree_ssa_dce): Likewise.
* tree-ssa-copyrename.c (copy_rename_partition_coalesce): Do
not return anything.
(rename_ssa_copies): Do not remove unused locals.
* tree-ssa-ccp.c (do_ssa_ccp): Likewise.
* tree-ssanames.c (pass_release_ssa_names): Remove unused
locals afterwards.

Index: gcc/tree-call-cdce.c
===
*** gcc/tree-call-cdce.c(revision 196167)
--- gcc/tree-call-cdce.c(working copy)
*** tree_call_cdce (void)
*** 898,908 
/* As we introduced new control-flow we need to insert PHI-nodes
   for the call-clobbers of the remaining call.  */
mark_virtual_operands_for_renaming (cfun);
!   return (TODO_update_ssa | TODO_cleanup_cfg | TODO_ggc_collect
!   | TODO_remove_unused_locals);
  }
!   else
! return 0;
  }
  
  static bool
--- 898,907 
/* As we introduced new control-flow we need to insert PHI-nodes
   for the call-clobbers of the remaining call.  */
mark_virtual_operands_for_renaming (cfun);
!   return TODO_update_ssa;
  }
! 
!   return 0;
  }
  
  static bool
Index: gcc/tree-ssa-forwprop.c
===
*** gcc/tree-ssa-forwprop.c (revision 196167)
--- gcc/tree-ssa-forwprop.c (working copy)
*** ssa_forward_propagate_and_combine (void)
*** 2936,2942 
  && forward_propagate_addr_expr (lhs, rhs))
{
  release_defs (stmt);
- todoflags |= TODO_remove_unused_locals;
  gsi_remove (&gsi, true);
}
  else
--- 2936,2941 
*** ssa_forward_propagate_and_combine (void)
*** 2961,2967 
   off)
{
  release_defs (stmt);
- todoflags |= TODO_remove_unused_locals;
  gsi_remove (&gsi, true);
}
  else if (is_gimple_min_invariant (rhs))
--- 2960,2965 
Index: gcc/tree-ssa-dce.c
===
*** gcc/tree-ssa-dce.c  (revision 196167)
--- gcc/tree-ssa-dce.c  (working copy)
*** perform_tree_ssa_dce (bool aggressive)
*** 1607,1616 
free_edge_list (el);
  
if (something_changed)
! return (TODO_update_ssa | TODO_cleanup_cfg | TODO_ggc_collect
!   | TODO_remove_unused_locals);
!   else
! return 0;
  }
  
  /* Pass entry points.  */
--- 1607,1614 
free_edge_list (el);
  
if (something_changed)
! return TODO_update_ssa | TODO_cleanup_cfg;
!   return 0;
  }
  
  /* Pass entry points.  */
Index: gcc/tree-ssa-copyrename.c
===
*** gcc/tree-ssa-copyrename.c   (revision 196167)
--- gcc/tree-ssa-copyrename.c   (working copy)
*** static struct
*** 113,119 
  /* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
     Choose a representative for the partition, and send debug info to DEBUG.  */
  
! static bool
  copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)
  {
int p1, p2, p3;
--- 113,119 
  /* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid.
     Choose a representative for the partition, and send debug info to DEBUG.  */
  
! static void
  copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug)

Re: RFC: [PATCH,ARM] Fix 56110

2013-02-20 Thread Richard Earnshaw

On 19/02/13 22:26, Tilman Sauerbeck wrote:

I don't get why relaxing the restrictions for the
andsi3_compare0_scratch pattern results in a mismatch for the
zeroextractsi_compare0_scratch one.

Any ideas?


Because of the way combine works.  It first tries to find a pattern that 
doesn't have a clobber expression.  It finds your new pattern and then 
uses that.  But since that can't handle immediates, reload then comes 
along and forces the constant into a register.


You need one pattern to deal with all the cases.

R.



Re: [PATCH] Remove broken powf hack

2013-02-20 Thread Tobias Burnus

Am 18.02.2013 18:49, schrieb John David Anglin:

This patch removes the broken powf hack.  This problem is now fixed in
the PA backend.
Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.
OK for trunk?


OK. Thanks for the patch and for fixing the problem more properly.*

Tobias

*http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00852.html


Fix for 56175

2013-02-20 Thread Yuri Rumyantsev
Hi All,

This patch aims to recognize the (A & C) ^ (B & C) -> (A ^ B) & C
pattern in simplify_bitwise_binary for short integer types.
The fix is very simple: turning off short type sinking in the
first pass of forward propagation gives a
+10% speedup for the important benchmark Coremark 1.0 on x86 Atom and
+5-7% on other x86 platforms.
Bootstrapping and regression testing were successful on x86-64.

Is it Ok for trunk?

ChangeLog.

2013-02-20  Yuri Rumyantsev  

PR tree-optimization/56175
* tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type sinking
at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
for short integer types.


patch
Description: Binary data
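
As a sanity check on the rewrite itself, the identity (A & C) ^ (B & C) == (A ^ B) & C can be verified mechanically.  The sketch below is only an illustration: the function names are hypothetical, and the exhaustive loop runs over 8-bit values as a cheap stand-in for the short types discussed here (the identity is bitwise, so it holds per bit regardless of width).

```c
/* Left-hand form: three bitwise operations.  Operands are promoted to
   int by the usual arithmetic conversions, hence the cast back.  */
short
lhs_form (short a, short b, short c)
{
  return (short) ((a & c) ^ (b & c));
}

/* Right-hand form: two bitwise operations.  */
short
rhs_form (short a, short b, short c)
{
  return (short) ((a ^ b) & c);
}

/* Return 1 if both forms agree for every combination of 8-bit inputs,
   0 otherwise.  */
int
forms_agree_u8 (void)
{
  unsigned a, b, c;

  for (a = 0; a < 256; a++)
    for (b = 0; b < 256; b++)
      for (c = 0; c < 256; c++)
        if (((a & c) ^ (b & c)) != ((a ^ b) & c))
          return 0;
  return 1;
}
```

The right-hand form saves one operation, which is the saving the patch wants forwprop to find even when the operands are narrower than int.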


Re: [PATCH][RFC] Less TODO_remove_unused_locals

2013-02-20 Thread Richard Biener
On Wed, 20 Feb 2013, Richard Biener wrote:

> 
> Hunting for the "we're getting slower" bits I noticed that
> TODO_remove_unused_locals is a big part of execute_function_todo
> (and accounts for 1% of compile-time of ac.f90).
> The following patch removes most of the remove_unused_locals
> calls based on the fact that with anonymous SSA names now available
> we should never create new locals (wishful thinking of course ...)
> and the important places to remove unused stuff are driven by
> 1) avoid creating yet another copy of the unused stuff, thus do
> it before inlining, on the callee;  2) avoid pinning unused memory
> while we operate on other function bodies, thus, do it at the end
> of non-IPA pass pipelines.
>
> In the end this asks for more explicit placement and thus a
> real pass ... but the following should be enough as an RFC and
> be good enough for 4.8.
>
> We keep doing remove_unused_locals after going into SSA
> and now after releasing unused SSA names (that should suffice
> to achieve 1) for early and IPA inlining and 2) pre-IPA).
> Post-IPA we perform it after the IPA inline transform and right
> before expanding to RTL (to avoid expanding unused stack vars).
> 
> I'm scheduling a bootstrap & regtest.
> 
> Ok for trunk?

As requested, here are some statistics.  stage2 gcc/ compiled with
-fdump-statistics-stats, summed over all files with

Index: gcc/tree-ssa-live.c
===
--- gcc/tree-ssa-live.c (revision 196170)
+++ gcc/tree-ssa-live.c (working copy)
@@ -889,7 +889,10 @@ remove_unused_locals (void)
   dstidx++;
 }
   if (dstidx != num)
-cfun->local_decls->truncate (dstidx);
+{
+  statistics_counter_event (cfun, "unused VAR_DECLs removed", num - dstidx);
+  cfun->local_decls->truncate (dstidx);
+}
 
   remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
   clear_unused_block_pointer ();

gives us before the patch (first number is the static pass number,
last is the pass name, the number of removed vars is second):

20 323409 ssa
23 131864 einline
25 29311 copyrename
26 2178 ccp
27 9771 forwprop
30 29 fre
31 60212 copyprop
33 13849 cddce
35 10 tailr
39 355 local-pure-const
40 4347 fnsplit
-
subtotal 575335

52 46056 inline
59 10834 copyrename
61 3721 ccp
62 83 forwprop
66 826 fre
67 4821 copyprop
69 5017 vrp
70 24764 dce
73 2 ifcombine
74 662 phiopt
76 137 ch
80 2909 copyrename
81 212 dom
82 1472 phicprop
85 1322 dce
86 51 forwprop
87 4 phiopt
90 48 ccp
91 19 copyprop
95 6 pre
101 5 lim
102 883 copyprop
103 3082 dceloop
125 2 loopdone
129 589 vrp
131 49 dom
132 145 phicprop
133 316 cddce
137 5 forwprop
138 1 phiopt
142 458 copyrename
168 2955 optimized

total 686791

and after the patch, adjusted as below (eh, I failed to notice
we schedule remove_unused_locals if TODO_cleanup_cfg did something...):

20 323396 ssa
23 107755 einline
41 135260 release_ssa
--
subtotal 566411

52 47513 inline
168 61713 optimized

total 675637

As expected, most unused locals are removed before final inlining.
I suppose the difference in the (sub-)totals is because, with the patch,
we copy fewer unused locals due to the extra remove_unused_locals at
release_ssa time.  I can't explain the difference of 13 for into-SSA.
I suppose I should also count the number of remaining locals ...
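
The subtotals above can be re-derived from the per-pass counts.  The following is a small arithmetic sketch only; the array and function names are made up for illustration, and the numbers are copied from the two listings.

```c
#include <stddef.h>

/* Per-pass counts of removed locals, copied from the listings above.  */
static const unsigned long pre_ipa_before[] = {
  323409, 131864, 29311, 2178, 9771, 29, 60212, 13849, 10, 355, 4347
};
static const unsigned long pre_ipa_after[] = { 323396, 107755, 135260 };
static const unsigned long post_ipa_after[] = { 47513, 61713 };

/* Sum N entries of COUNTS.  */
unsigned long
sum_counts (const unsigned long *counts, size_t n)
{
  unsigned long total = 0;
  size_t i;

  for (i = 0; i < n; i++)
    total += counts[i];
  return total;
}

/* Pre-IPA subtotal before the patch.  */
unsigned long
pre_ipa_subtotal_before (void)
{
  return sum_counts (pre_ipa_before,
                     sizeof pre_ipa_before / sizeof pre_ipa_before[0]);
}

/* Pre-IPA subtotal after the patch.  */
unsigned long
pre_ipa_subtotal_after (void)
{
  return sum_counts (pre_ipa_after,
                     sizeof pre_ipa_after / sizeof pre_ipa_after[0]);
}

/* Grand total after the patch.  */
unsigned long
total_after (void)
{
  return pre_ipa_subtotal_after ()
         + sum_counts (post_ipa_after,
                       sizeof post_ipa_after / sizeof post_ipa_after[0]);
}
```

The sums reproduce the quoted subtotals of 575335 and 566411 and the patched total of 675637.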

Scheduled for re-bootstrap / regtest.

Thanks,
Richard.

2013-02-20  Richard Biener  

* tree-call-cdce.c (tree_call_cdce): Do not remove unused locals.
* tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Likewise.
* tree-ssa-dce.c (perform_tree_ssa_dce): Likewise.
* tree-ssa-copyrename.c (copy_rename_partition_coalesce): Do
not return anything.
(rename_ssa_copies): Do not remove unused locals.
* tree-ssa-ccp.c (do_ssa_ccp): Likewise.
* tree-ssanames.c (pass_release_ssa_names): Remove unused
locals first.
* passes.c (execute_function_todo): Do not schedule unused locals
removal if cleanup_tree_cfg did something.
* tree-ssa-live.c (remove_unused_locals): Dump statistics
about the number of removed locals.

Index: gcc/tree-call-cdce.c
===
*** gcc/tree-call-cdce.c.orig   2013-02-20 12:49:24.0 +0100
--- gcc/tree-call-cdce.c2013-02-20 12:51:00.528421709 +0100
*** tree_call_cdce (void)
*** 898,908 
/* As we introduced new control-flow we need to insert PHI-nodes
   for the call-clobbers of the remaining call.  */
mark_virtual_operands_for_renaming (cfun);
!   return (TODO_update_ssa | TODO_cleanup_cfg | TODO_ggc_collect
!   | TODO_remove_unused_locals);
  }
!   else
! return 0;
  }
  
  static bool
--- 898,907 
/* As we introduced new control-flow we need to insert PHI-nodes
   for the call-clobbers of the remaining call.  */
mark_virtual_operands_for_renam

Re: Fix for 56175

2013-02-20 Thread Richard Biener
On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev  wrote:
> Hi All,
>
> This patch aims to recognize the (A & C) ^ (B & C) -> (A ^ B) & C
> pattern in simplify_bitwise_binary for short integer types.
> The fix is very simple: turning off short type sinking in the
> first pass of forward propagation gives a
> +10% speedup for the important benchmark Coremark 1.0 on x86 Atom and
> +5-7% on other x86 platforms.
> Bootstrapping and regression testing were successful on x86-64.
>
> Is it Ok for trunk?

It definitely needs a comment before the checks.

Also I think it simply shows that the code is placed at the wrong spot.
Simply moving it down in simplify_bitwise_binary so that it is done last
should achieve both effects.

Can you rework the patch according to that?

The patch also lacks a testcase; we should make sure not to regress here again.

Thanks,
Richard.

> ChangeLog.
>
> 2013-02-20  Yuri Rumyantsev  
>
> PR tree-optimization/56175
> * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type sinking
> at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
> for short integer types.


[PATCH] Fix PR56398

2013-02-20 Thread Richard Biener

This fixes an ICE because gimple_bb of a default def stmt is NULL.
Just don't do anything here, we're not going to adjust anything
anyway for them.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-02-20  Richard Biener  

PR tree-optimization/56398
* tree-vect-loop-manip.c (adjust_debug_stmts): Skip
SSA default defs.

Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  (revision 196169)
+++ gcc/tree-vect-loop-manip.c  (working copy)
@@ -187,6 +187,7 @@ adjust_debug_stmts (tree from, tree to,
 
   if (MAY_HAVE_DEBUG_STMTS
   && TREE_CODE (from) == SSA_NAME
+  && ! SSA_NAME_IS_DEFAULT_DEF (from)
   && ! virtual_operand_p (from))
 {
   ai.from = from;


[PATCH] Another simple dumping fix in IPA-CP

2013-02-20 Thread Martin Jambor
Hi,

when debugging a PR I noticed that the dumped thresholds do not reflect a
changed PARAM_IPA_CP_EVAL_THRESHOLD because the defaults are hard-wired
into the fprintf calls.  Fixed by the patch below, which bootstraps and
tests fine on x86_64-linux.  Unless there are objections, I will commit
it tomorrow as obvious.

Thanks,

Martin


2013-02-19  Martin Jambor  

* ipa-cp.c (good_cloning_opportunity_p): Dump the real threshold
instead of hard-wired defaults.

Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1654,7 +1654,7 @@ good_cloning_opportunity_p (struct cgrap
 ") -> evaluation: " HOST_WIDEST_INT_PRINT_DEC
 ", threshold: %i\n",
 time_benefit, size_cost, (HOST_WIDE_INT) count_sum,
-evaluation, 500);
+evaluation, PARAM_VALUE (PARAM_IPA_CP_EVAL_THRESHOLD));
 
   return evaluation >= PARAM_VALUE (PARAM_IPA_CP_EVAL_THRESHOLD);
 }
@@ -1668,7 +1668,7 @@ good_cloning_opportunity_p (struct cgrap
 "size: %i, freq_sum: %i) -> evaluation: "
 HOST_WIDEST_INT_PRINT_DEC ", threshold: %i\n",
 time_benefit, size_cost, freq_sum, evaluation,
-CGRAPH_FREQ_BASE /2);
+PARAM_VALUE (PARAM_IPA_CP_EVAL_THRESHOLD));
 
   return evaluation >= PARAM_VALUE (PARAM_IPA_CP_EVAL_THRESHOLD);
 }


[PATCH, PR 56294] Fix omissions in intersect_aggregates_with_edge

2013-02-20 Thread Martin Jambor
Hi,

this patch fixes an omission in IPA-CP's agg_replacements_to_vector,
which needs to filter the vector by index and offset, and a typo in
intersect_aggregates_with_edge, which in one call passed the wrong
index to intersect_with_agg_replacements.  Combined, these led to empty
intersections, which were caught by an assert checking exactly that.

Bootstrapped and tested on x86_64-linux (all languages + Ada) with
default BOOT_CFLAGS and also with BOOT_CFLAGS='-O2 -g -fipa-cp-clone
--param=ipa-cp-eval-threshold=100' (C, C++ and Fortran only), I'm
currently bootstrapping with the param set to 1.

OK for trunk?

Thanks,

Martin


2013-02-19  Martin Jambor  

PR tree-optimization/56310
* ipa-cp.c (agg_replacements_to_vector): New parameter index, copy
only matching indices and non-negative final offsets.
(intersect_aggregates_with_edge): Pass src_idx to
agg_replacements_to_vector.  Pass src_idx instead of index to
intersect_with_agg_replacements.

testsuite/
* g++.dg/ipa/pr56310.C: New test.

Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -2807,12 +2807,15 @@ intersect_with_plats (struct ipcp_param_
    vector result while subtracting OFFSET from the individual value offsets.  */
 
 static vec
-agg_replacements_to_vector (struct cgraph_node *node, HOST_WIDE_INT offset)
+agg_replacements_to_vector (struct cgraph_node *node, int index,
+   HOST_WIDE_INT offset)
 {
   struct ipa_agg_replacement_value *av;
   vec res = vNULL;
 
   for (av = ipa_get_agg_replacements_for_node (node); av; av = av->next)
+if (av->index == index
+   && (av->offset - offset) >= 0)
 {
   struct ipa_agg_jf_item item;
   gcc_checking_assert (av->value);
@@ -2892,7 +2895,7 @@ intersect_aggregates_with_edge (struct c
  if (agg_pass_through_permissible_p (orig_plats, jfunc))
{
  if (!inter.exists ())
-   inter = agg_replacements_to_vector (cs->caller, 0);
+   inter = agg_replacements_to_vector (cs->caller, src_idx, 0);
  else
intersect_with_agg_replacements (cs->caller, src_idx,
 &inter, 0);
@@ -2925,9 +2928,9 @@ intersect_aggregates_with_edge (struct c
   if (caller_info->ipcp_orig_node)
{
  if (!inter.exists ())
-   inter = agg_replacements_to_vector (cs->caller, delta);
+   inter = agg_replacements_to_vector (cs->caller, src_idx, delta);
  else
-   intersect_with_agg_replacements (cs->caller, index, &inter,
+   intersect_with_agg_replacements (cs->caller, src_idx, &inter,
 delta);
}
   else
Index: src/gcc/testsuite/g++.dg/ipa/pr56310.C
===
--- /dev/null
+++ src/gcc/testsuite/g++.dg/ipa/pr56310.C
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fipa-cp -std=gnu++0x -fno-early-inlining -fipa-cp-clone --param=ipa-cp-eval-threshold=1" } */
+
+void bar (void *, void *);
+
+struct C
+{
+  constexpr C ():p (0)
+  {
+  }
+  void *get ()
+  {
+return p;
+  }
+  void *p;
+};
+
+struct B:C
+{
+};
+
+struct A
+{
+  void f (B * x, B * y)
+  {
+bar (x->get (), y->get ());
+  }
+};
+
+void
+foo ()
+{
+  A a;
+  B b;
+  a.f (&b, &b);
+}
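
The filtering that agg_replacements_to_vector gains can be illustrated with a plain linked list.  This is a sketch under assumptions, not the ipa-cp.c code: struct agg_value_sketch, the sample nodes and both functions are hypothetical, and a caller-supplied array stands in for the vec.

```c
#include <stddef.h>

/* Hypothetical stand-in for ipa_agg_replacement_value: a parameter
   index, an offset within the aggregate, and the list link.  */
struct agg_value_sketch
{
  int index;
  long offset;
  struct agg_value_sketch *next;
};

/* Copy into OUT the adjusted offsets of all entries that belong to
   parameter INDEX and whose offset stays non-negative after
   subtracting OFFSET.  At most CAP entries are written; the number
   written is returned.  */
size_t
collect_matching (const struct agg_value_sketch *list, int index,
                  long offset, long *out, size_t cap)
{
  size_t n = 0;

  for (; list; list = list->next)
    if (list->index == index
        && list->offset - offset >= 0
        && n < cap)
      out[n++] = list->offset - offset;
  return n;
}

/* Sample list: two values for parameter 0, one for parameter 1.  */
static struct agg_value_sketch node_c = { 1, 16, NULL };
static struct agg_value_sketch node_b = { 0, 8, &node_c };
static struct agg_value_sketch node_a = { 0, 0, &node_b };

/* Count the sample entries that survive the filter.  */
size_t
demo_count (int index, long offset)
{
  long buf[4];

  return collect_matching (&node_a, index, offset, buf, 4);
}
```

Without the index test, values belonging to other parameters would end up in the result and a later intersection could come out empty, which matches the failure described above.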


[PR middle-end/56108] handle transactions with ASMs in the first block

2013-02-20 Thread Aldy Hernandez
In the following test, the first statement of a relaxed transaction is 
an inline asm:


  __transaction_relaxed { __asm__(""); }

Since we bypass inserting BUILT_IN_TM_IRREVOCABLE at the beginning of 
transactions that are sure to be irrevocable, later when we try to 
expand the transaction, we ICE when we encounter the inline asm.


Currently, we bypass the TM_IRREVOCABLE call here:

 for (region = d->all_tm_regions; region; region = region->next)
{
  /* If we're sure to go irrevocable, don't transform anything.  */
  if (d->irrevocable_blocks_normal
  && bitmap_bit_p (d->irrevocable_blocks_normal,
   region->entry_block->index))
{
  transaction_subcode_ior (region, GTMA_DOES_GO_IRREVOCABLE);
  transaction_subcode_ior (region, GTMA_MAY_ENTER_IRREVOCABLE);
  continue;
}

If I understand this correctly, ideally a transaction marked as 
doesGoIrrevocable shouldn't bother instrumenting the statements inside, 
since the runtime will go irrevocable immediately.  In which case, we 
can elide the instrumentation altogether as the attached patch does.
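
As a rough illustration of the check described above, the test for "sure to go irrevocable" amounts to requiring both flag bits in the transaction subcode.  This is only a sketch: the SKETCH_* names and their bit values are made up for this example and are not the real GTMA_* definitions in GCC.

```cpp
// Hypothetical stand-ins for the GTMA_* subcode bits; the values are
// assumptions for this sketch, not GCC's actual definitions.
enum : unsigned {
  SKETCH_DOES_GO_IRREVOCABLE = 1u << 0,
  SKETCH_MAY_ENTER_IRREVOCABLE = 1u << 1,
};

// Returns true when both bits are set, i.e. when the region would be
// skipped instead of being handed to the instrumentation step.
inline bool sketch_skip_instrumentation(unsigned sub)
{
  return (sub & SKETCH_DOES_GO_IRREVOCABLE)
         && (sub & SKETCH_MAY_ENTER_IRREVOCABLE);
}
```

A region carrying only one of the two bits would still be instrumented under this sketch.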


If my analysis is correct, then testsuite/gcc.dg/tm/memopt-1.c would 
surely go irrevocable, thus requiring no instrumentation, causing the 
memory optimizations to get skipped altogether.  In which case, it's 
best to mark the function calls as safe, so they don't cause the 
transaction to become obviously irrevocable.


Is this correct?  If so, OK?
PR middle-end/56108
* trans-mem.c (execute_tm_mark): Do not expand transactions that
are sure to go irrevocable.
testsuite/
* gcc.dg/tm/memopt-1.c: Declare functions transaction_safe.

diff --git a/gcc/testsuite/gcc.dg/tm/memopt-1.c 
b/gcc/testsuite/gcc.dg/tm/memopt-1.c
index b78a6d4..c5ac5ce 100644
--- a/gcc/testsuite/gcc.dg/tm/memopt-1.c
+++ b/gcc/testsuite/gcc.dg/tm/memopt-1.c
@@ -2,8 +2,8 @@
 /* { dg-options "-fgnu-tm -O -fdump-tree-tmmemopt" } */
 
 long g, xxx, yyy;
-extern george() __attribute__((transaction_callable));
-extern ringo(long int);
+extern george() __attribute__((transaction_safe));
+extern ringo(long int) __attribute__((transaction_safe));
 int i;
 
 f()
diff --git a/gcc/testsuite/gcc.dg/tm/pr56108.c 
b/gcc/testsuite/gcc.dg/tm/pr56108.c
new file mode 100644
index 000..81ff574
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tm/pr56108.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-fgnu-tm -O" } */
+
+int
+main()
+{
+  __transaction_relaxed { __asm__(""); }
+  return 0;
+}
diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c
index dd3918e..71eaa44 100644
--- a/gcc/trans-mem.c
+++ b/gcc/trans-mem.c
@@ -2859,8 +2859,23 @@ execute_tm_mark (void)
   // Expand memory operations into calls into the runtime.
   // This collects log entries as well.
   FOR_EACH_VEC_ELT (bb_regions, i, r)
-if (r != NULL)
-  expand_block_tm (r, BASIC_BLOCK (i));
+{
+  if (r != NULL)
+   {
+ if (r->transaction_stmt)
+   {
+ unsigned sub = gimple_transaction_subcode (r->transaction_stmt);
+
+ /* If we're sure to go irrevocable, there won't be
+anything to expand, since the run-time will go
+irrevocable right away.  */
+ if (sub & GTMA_DOES_GO_IRREVOCABLE
+ && sub & GTMA_MAY_ENTER_IRREVOCABLE)
+   continue;
+   }
+ expand_block_tm (r, BASIC_BLOCK (i));
+   }
+}
 
   bb_regions.release ();
 


Re: Fix for 56175

2013-02-20 Thread Yuri Rumyantsev
Richard,

First of all, your proposal to move type sinking to the end of the
function does not work: we handle each statement in the function, and
we want to prevent the first type folding of X & C from happening.
Note that we have the following sequence of GIMPLE before forwprop1:

   x.0_10 = (signed char) x_8;
  _11 = x.0_10 & 1;
  _12 = (signed char) y_9;
  _13 = _12 & 1;
  _14 = _11 ^ _13;

I also added a comment to my fix and created a new test for it. I also
checked that this test passes only with the patched compiler. The
ChangeLog was modified accordingly:

ChangeLog

2013-02-20  Yuri Rumyantsev  

PR tree-optimization/56175
* tree-ssa-forwprop.c (simplify_bitwise_binary): Avoid type sinking
at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
for short integer types.
* gcc.dg/pr56175.c: New test.
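
For reference, the rewrite named in the ChangeLog entry relies on AND distributing over XOR.  The sketch below is not taken from the patch or from the new test; the function names are invented for illustration.  It spells out both shapes for a signed char and compares them exhaustively.

```cpp
#include <cstdint>

// Original shape: two ANDs feeding one XOR.
inline std::int8_t sketch_xor_of_masks(std::int8_t a, std::int8_t b,
                                       std::int8_t c)
{
  return static_cast<std::int8_t>((a & c) ^ (b & c));
}

// Simplified shape: one XOR followed by one AND.
inline std::int8_t sketch_mask_of_xor(std::int8_t a, std::int8_t b,
                                      std::int8_t c)
{
  return static_cast<std::int8_t>((a ^ b) & c);
}

// Compares both shapes over every signed char triple.
inline bool sketch_identity_holds()
{
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b)
      for (int c = -128; c <= 127; ++c)
        {
          const std::int8_t x = static_cast<std::int8_t>(a);
          const std::int8_t y = static_cast<std::int8_t>(b);
          const std::int8_t z = static_cast<std::int8_t>(c);
          if (sketch_xor_of_masks(x, y, z) != sketch_mask_of_xor(x, y, z))
            return false;
        }
  return true;
}
```

The simplified shape needs one bitwise operation fewer, which is presumably where the reported speedup comes from.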




2013/2/20 Richard Biener :
> On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev  wrote:
>> Hi All,
>>
>> This patch aims to recognize the (A & C) ^ (B & C) -> (A ^ B) & C
>> pattern in simplify_bitwise_binary for short integer types.
>> The fix is very simple - we simply turn off short type sinking at the
>> first pass of forward propagation, which gives a
>> +10% speedup for the important benchmark Coremark 1.0 on x86 Atom and
>> +5-7% on other x86 platforms too.
>> Bootstrapping and regression testing were successful on x86-64.
>>
>> Is it Ok for trunk?
>
> It definitely needs a comment before the checks.
>
> Also I think it simply shows that the code is placed at the wrong spot.
> Simply moving it down in simplify_bitwise_binary to be done the very last
> should get both of the effects done.
>
> Can you rework the patch according to that?
>
> You also miss a testcase, we should make sure to not regress again here.
>
> Thanks,
> Richard.
>
>> ChangeLog.
>>
>> 2013-02-20  Yuri Rumyantsev  
>>
>> PR tree-optimization/56175
>> * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type sinking
>> at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>> for short integer types.


56175.diff
Description: Binary data


Fix ICE in ipa_make_edge_direct_to_target

2013-02-20 Thread Jan Hubicka
Hi,
in the testcase below we get an ICE in ipa_make_edge_direct_to_target.  There
is a virtual call that gets devirtualized only while inlining functions called
once.  At this point, however, we have already removed the bodies of virtual
functions from the callgraph, so we need to update it and re-create its node.

This is in fact just a special case where we need to invent a new cgraph node
for the devirtualization target.  This patch should also handle the case when we
invent a direct call to a function that is not otherwise used in the current TU.
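
To make the term concrete, here is a minimal sketch of a call that a compiler may devirtualize once inlining exposes the dynamic type.  It is unrelated to the testcase in the patch, and all names are invented for illustration.  Whether GCC actually turns the call into a direct one depends on the optimization level and pass ordering.

```cpp
// Hypothetical class hierarchy, used only for this sketch.
struct SketchBase
{
  virtual int value () = 0;
  virtual ~SketchBase () {}
};

struct SketchDerived : SketchBase
{
  int value () { return 6; }
};

// Seen in isolation, this is an indirect call through the vtable.
inline int sketch_call_through_base (SketchBase &b)
{
  return b.value ();
}

// Once the helper is inlined here, the dynamic type of `d' is known, so
// the virtual call can become a direct call to SketchDerived::value.
inline int sketch_devirtualizable_call ()
{
  SketchDerived d;
  return sketch_call_through_base (d);
}
```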

Bootstrapped/regtested x86_64-linux, committed.

2013-02-20  Jan Hubicka  

PR tree-optimization/56265
* ipa-prop.c (ipa_make_edge_direct_to_target): Fixup callgraph when
target is referenced for first time.

PR tree-optimization/56265
* testsuite/g++.dg/ipa/devirt-11.C: New testcase.

Index: ipa-prop.c
===
--- ipa-prop.c  (revision 196176)
+++ ipa-prop.c  (working copy)
@@ -2100,10 +2100,65 @@ ipa_make_edge_direct_to_target (struct c
   if (TREE_CODE (target) == ADDR_EXPR)
 target = TREE_OPERAND (target, 0);
   if (TREE_CODE (target) != FUNCTION_DECL)
-return NULL;
+{
+  target = canonicalize_constructor_val (target, NULL);
+  if (!target || TREE_CODE (target) != FUNCTION_DECL)
+   {
+ if (dump_file)
+   fprintf (dump_file, "ipa-prop: Discovered direct call to non-function"
+   " in (%s/%i).\n",
+cgraph_node_name (ie->caller), ie->caller->uid);
+ return NULL;
+   }
+}
   callee = cgraph_get_node (target);
-  if (!callee)
-return NULL;
+
+  /* Because may-edges are not explicitely represented and vtable may be external,
+ we may create the first reference to the object in the unit.  */
+  if (!callee || callee->global.inlined_to)
+{
+  struct cgraph_node *first_clone = callee;
+
+  /* We are better to ensure we can refer to it.
+In the case of static functions we are out of luck, since we already   
+removed its body.  In the case of public functions we may or may
+not introduce the reference.  */
+  if (!canonicalize_constructor_val (target, NULL)
+ || !TREE_PUBLIC (target))
+   {
+ if (dump_file)
+   fprintf (dump_file, "ipa-prop: Discovered call to a known target "
+"(%s/%i -> %s/%i) but can not refer to it. Giving up.\n",
+xstrdup (cgraph_node_name (ie->caller)), ie->caller->uid,
+xstrdup (cgraph_node_name (ie->callee)), ie->callee->uid);
+ return NULL;
+   }
+
+  /* Create symbol table node.  Even if inline clone exists, we can not take
+it as a target of non-inlined call.  */
+  callee = cgraph_create_node (target);
+
+  /* OK, we previously inlined the function, then removed the offline copy and
+now we want it back for external call.  This can happen when devirtualizing
+while inlining function called once that happens after extern inlined and
+virtuals are already removed.  In this case introduce the external node
+and make it available for call.  */
+  if (first_clone)
+   {
+ first_clone->clone_of = callee;
+ callee->clones = first_clone;
+ symtab_prevail_in_asm_name_hash ((symtab_node)callee);
+ symtab_insert_node_to_hashtable ((symtab_node)callee);
+ if (dump_file)
+   fprintf (dump_file, "ipa-prop: Introduced new external node "
+"(%s/%i) and turned into root of the clone tree.\n",
+xstrdup (cgraph_node_name (callee)), callee->uid);
+   }
+  else if (dump_file)
+   fprintf (dump_file, "ipa-prop: Introduced new external node "
+"(%s/%i).\n",
+xstrdup (cgraph_node_name (callee)), callee->uid);
+}
   ipa_check_create_node_params ();
 
   /* We can not make edges to inline clones.  It is bug that someone removed
Index: testsuite/g++.dg/ipa/devirt-11.C
===
--- testsuite/g++.dg/ipa/devirt-11.C(revision 0)
+++ testsuite/g++.dg/ipa/devirt-11.C(revision 0)
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-inline" } */
+int baz ();
+struct A
+{
+  virtual int fn2 () = 0;
+  virtual int *fn3 ();
+  double *fn4 ();
+  int fn5 (int);
+  template 
+  void fn1 (A &, T) { fn3 (); fn4 (); fn2 (); }
+};
+struct B : A
+{
+  int fn2 () { return 6; }
+  void fn3 (int, double);
+  B (bool = true);
+  B (int, int);
+};
+template 
+void
+foo (B &x, A &y, A &z)
+{
+  y.fn2 ();
+  z.fn2 ();
+  int i = baz ();
+  int j = (y.fn3 ())[i];
+  x.fn3 (j, (y.fn4 ())[i] + (z.fn4 ())[z.fn5 (j)]);
+}
+inline B
+operator+ (A &y, A &z)
+{
+  B x;
+  foo (x, y, z);
+  return x;
+}
+void
+bar ()
+{
+  B a, b, c (4, 0), d;
+  a.fn1 (b, .6);
+  baz ();
+  c + d;
+}

libgo patch committed: Fix x86_64 Solaris pointer usage

2013-02-20 Thread Ian Lance Taylor
This patch from Janne Snabb should fix PR 56320 about random failures on
Solaris x86_64.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 29b742a2ed37 -r 3b1c3cceaf02 libgo/runtime/lfstack.c
--- a/libgo/runtime/lfstack.c	Fri Feb 15 10:54:51 2013 -0800
+++ b/libgo/runtime/lfstack.c	Wed Feb 20 11:41:05 2013 -0800
@@ -17,9 +17,10 @@
 #define PTR_MASK ((1ull

libgo patch committed: Solaris net fixes

2013-02-20 Thread Ian Lance Taylor
This patch, mainly from Rainer Orth, in PR 56171 fixes passing a file
descriptor on Solaris.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 3b1c3cceaf02 libgo/Makefile.am
--- a/libgo/Makefile.am	Wed Feb 20 11:41:05 2013 -0800
+++ b/libgo/Makefile.am	Wed Feb 20 12:01:58 2013 -0800
@@ -1656,6 +1656,13 @@
 endif
 endif
 
+# Define socket functions.
+if LIBGO_IS_SOLARIS
+syscall_socket_os_file = go/syscall/socket_xnet.go
+else
+syscall_socket_os_file = go/syscall/socket_posix.go
+endif
+
 # Support for uname.
 if LIBGO_IS_SOLARIS
 if LIBGO_IS_386
@@ -1722,6 +1729,7 @@
 	$(syscall_errstr_file) \
 	$(syscall_size_file) \
 	$(syscall_socket_file) \
+	$(syscall_socket_os_file) \
 	$(syscall_uname_file) \
 	$(syscall_netlink_file) \
 	$(syscall_lsf_file) \
@@ -1746,13 +1754,20 @@
 	go/syscall/passfd_test.go
 
 libcalls.go: s-libcalls; @true
-s-libcalls: Makefile go/syscall/mksyscall.awk $(go_base_syscall_files)
+s-libcalls: libcalls-list go/syscall/mksyscall.awk $(go_base_syscall_files)
 	rm -f libcalls.go.tmp
-	files=`echo $^ | sed -e 's/Makefile//' -e 's|[^ ]*go/syscall/mksyscall.awk||'`; \
+	files=`echo $^ | sed -e 's/libcalls-list//' -e 's|[^ ]*go/syscall/mksyscall.awk||'`; \
 	$(AWK) -f $(srcdir)/go/syscall/mksyscall.awk $${files} > libcalls.go.tmp
 	$(SHELL) $(srcdir)/../move-if-change libcalls.go.tmp libcalls.go
 	$(STAMP) $@
 
+libcalls-list: s-libcalls-list; @true
+s-libcalls-list: Makefile
+	rm -f libcalls-list.tmp
+	echo $(go_base_syscall_files) > libcalls-list.tmp
+	$(SHELL) $(srcdir)/../move-if-change libcalls-list.tmp libcalls-list
+	$(STAMP) $@
+
 syscall_arch.go: s-syscall_arch; @true
 s-syscall_arch: Makefile
 	rm -f syscall_arch.go.tmp
diff -r 3b1c3cceaf02 libgo/go/syscall/sockcmsg_unix.go
--- a/libgo/go/syscall/sockcmsg_unix.go	Wed Feb 20 11:41:05 2013 -0800
+++ b/libgo/go/syscall/sockcmsg_unix.go	Wed Feb 20 12:01:58 2013 -0800
@@ -8,7 +8,10 @@
 
 package syscall
 
-import "unsafe"
+import (
+	"runtime"
+	"unsafe"
+)
 
 // Round the length of a raw sockaddr up to align it propery.
 func cmsgAlignOf(salen int) int {
@@ -18,6 +21,11 @@
 	if darwinAMD64 {
 		salign = 4
 	}
+	// NOTE: Solaris always uses 32-bit alignment,
+	// cf. _CMSG_DATA_ALIGNMENT in .
+	if runtime.GOOS == "solaris" {
+		salign = 4
+	}
 	if salen == 0 {
 		return salign
 	}
diff -r 3b1c3cceaf02 libgo/go/syscall/socket.go
--- a/libgo/go/syscall/socket.go	Wed Feb 20 11:41:05 2013 -0800
+++ b/libgo/go/syscall/socket.go	Wed Feb 20 12:01:58 2013 -0800
@@ -177,9 +177,6 @@
 	return anyToSockaddr(&rsa)
 }
 
-//sys	bind(fd int, sa *RawSockaddrAny, len Socklen_t) (err error)
-//bind(fd _C_int, sa *RawSockaddrAny, len Socklen_t) _C_int
-
 func Bind(fd int, sa Sockaddr) (err error) {
 	ptr, n, err := sa.sockaddr()
 	if err != nil {
@@ -188,9 +185,6 @@
 	return bind(fd, ptr, n)
 }
 
-//sys	connect(s int, addr *RawSockaddrAny, addrlen Socklen_t) (err error)
-//connect(s _C_int, addr *RawSockaddrAny, addrlen Socklen_t) _C_int
-
 func Connect(fd int, sa Sockaddr) (err error) {
 	ptr, n, err := sa.sockaddr()
 	if err != nil {
@@ -199,9 +193,6 @@
 	return connect(fd, ptr, n)
 }
 
-//sysnb	socket(domain int, typ int, proto int) (fd int, err error)
-//socket(domain _C_int, typ _C_int, protocol _C_int) _C_int
-
 func Socket(domain, typ, proto int) (fd int, err error) {
 	if domain == AF_INET6 && SocketDisableIPv6 {
 		return -1, EAFNOSUPPORT
@@ -210,9 +201,6 @@
 	return
 }
 
-//sysnb	socketpair(domain int, typ int, proto int, fd *[2]_C_int) (err error)
-//socketpair(domain _C_int, typ _C_int, protocol _C_int, fd *[2]_C_int) _C_int
-
 func Socketpair(domain, typ, proto int) (fd [2]int, err error) {
 	var fdx [2]_C_int
 	err = socketpair(domain, typ, proto, &fdx)
@@ -223,9 +211,6 @@
 	return
 }
 
-//sys	getsockopt(s int, level int, name int, val uintptr, vallen *Socklen_t) (err error)
-//getsockopt(s _C_int, level _C_int, name _C_int, val *byte, vallen *Socklen_t) _C_int
-
 func GetsockoptByte(fd, level, opt int) (value byte, err error) {
 	var n byte
 	vallen := Socklen_t(1)
@@ -326,9 +311,6 @@
 	return
 }
 
-//sys	sendto(s int, buf []byte, flags int, to *RawSockaddrAny, tolen Socklen_t) (err error)
-//sendto(s _C_int, buf *byte, len Size_t, flags _C_int, to *RawSockaddrAny, tolen Socklen_t) Ssize_t
-
 func Sendto(fd int, p []byte, flags int, to Sockaddr) (err error) {
 	ptr, n, err := to.sockaddr()
 	if err != nil {
@@ -337,9 +319,6 @@
 	return sendto(fd, p, flags, ptr, n)
 }
 
-//sys	recvmsg(s int, msg *Msghdr, flags int) (n int, err error)
-//recvmsg(s _C_int, msg *Msghdr, flags _C_int) Ssize_t
-
 func Recvmsg(fd int, p, oob []byte, flags int) (n, oobn int, recvflags int, from Sockaddr, err error) {
 	var msg Msghdr
 	var rsa RawSockaddrAny
@@ -374,9 +353,6 @@
 	return
 }
 
-//sys	sendmsg(s int, msg *Msghdr, flags int) (err error)
-//sendmsg(s _C_int, msg *Msghdr, flags _C_int) Ssize_t
-
 func Sendmsg(fd int, p, oob []byte, to Sockaddr, flags int) (err error) {
 	var ptr *RawSockaddrAny
 	var s

Re: [PATCH, PR 56294] Fix omissions in intersect_aggregates_with_edge

2013-02-20 Thread Jan Hubicka
> Hi,
> 
> this patch fixes an omission in IPA-CP's agg_replacements_to_vector
> which needs to filter the vector by index and offset and a typo in
> intersect_aggregates_with_edge which in one call passed the wrong
> index to agg_replacements_to_vector.  Combined, these led to empty
> intersections which were caught by an assert checking exactly that.
> 
> Bootstrapped and tested on x86_64-linux (all languages + Ada) with
> default BOOT_CFLAGS and also with BOOT_CFLAGS='-O2 -g -fipa-cp-clone
> --param=ipa-cp-eval-threshold=100' (C, C++ and Fortran only), I'm
> currently bootstrapping with the param set to 1.
> 
> OK for trunk?
OK, thanks!
Honza
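
The filtering described in the quoted message can be sketched roughly as follows.  This is not the GCC implementation: the struct, the function name, the use of std::vector, and the choice to store the adjusted offset are all assumptions made for illustration.

```cpp
#include <vector>

// Hypothetical simplified stand-in for an aggregate replacement value.
struct SketchReplacement
{
  int index;    // which formal parameter the value belongs to
  long offset;  // offset within the aggregate
  int value;    // the known constant (simplified to an int here)
};

// Keep only entries for parameter `index' whose offset, once `delta' is
// subtracted, is non-negative.  Storing the adjusted offset in the
// result is an assumption of this sketch.
inline std::vector<SketchReplacement>
sketch_filter_replacements (const std::vector<SketchReplacement> &all,
                            int index, long delta)
{
  std::vector<SketchReplacement> res;
  for (const SketchReplacement &av : all)
    if (av.index == index && av.offset - delta >= 0)
      {
        SketchReplacement item = av;
        item.offset = av.offset - delta;
        res.push_back (item);
      }
  return res;
}
```

Without the index test, values belonging to other parameters would end up in the vector, and a later intersection against the right parameter could come out empty.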


[patch] Tweak two libstdc++ tests

2013-02-20 Thread Jonathan Wakely
* testsuite/23_containers/unordered_set/55043.cc: Add missing
namespace qualification.
* testsuite/23_containers/unordered_multiset/55043.cc: Likewise.

Tested x86_64-linux, committed to trunk.
commit a6d9aa71b453ad2cf7b1cbd581fa04603728bad6
Author: Jonathan Wakely 
Date:   Wed Feb 20 21:23:17 2013 +

* testsuite/23_containers/unordered_set/55043.cc: Add missing
namespace qualification.
* testsuite/23_containers/unordered_multiset/55043.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multiset/55043.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_multiset/55043.cc
index 445e4e4..9d71cff 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_multiset/55043.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_multiset/55043.cc
@@ -33,7 +33,7 @@ struct equal {
   bool operator()(const MoveOnly&, const MoveOnly) const { return true; }
 };
 struct hash {
-  size_t operator()(const MoveOnly&) const { return 0; }
+  std::size_t operator()(const MoveOnly&) const { return 0; }
 };
 
 template
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/55043.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_set/55043.cc
index e5ba065..1524890 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/55043.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/55043.cc
@@ -33,7 +33,7 @@ struct equal {
   bool operator()(const MoveOnly&, const MoveOnly) const { return true; }
 };
 struct hash {
-  size_t operator()(const MoveOnly&) const { return 0; }
+  std::size_t operator()(const MoveOnly&) const { return 0; }
 };
 
 template


[Patch, Fortran] PR 56385: [4.6/4.7/4.8 Regression] [OOP] ICE with allocatable function result in a procedure-pointer component

2013-02-20 Thread Janus Weil
Hi all,

here is a straightforward patch which fixes a regression with
procedure-pointer components which have an allocatable result.
Regtests cleanly on x86_64-unknown-linux-gnu. Ok for trunk/4.7/4.6?

[In absence of any reviews I will commit as obvious on the weekend.]

Cheers,
Janus



2013-02-20  Janus Weil  

PR fortran/56385
* trans-array.c (structure_alloc_comps): Handle procedure-pointer
components with allocatable result.

2013-02-20  Janus Weil  

PR fortran/56385
* gfortran.dg/proc_ptr_comp_37.f90: New.


pr56385.diff
Description: Binary data


proc_ptr_comp_37.f90
Description: Binary data


[patch] Fix spelling in libstdc++ docs and comments

2013-02-20 Thread Jonathan Wakely
* doc/html/faq.html: Fix spelling.
* doc/xml/faq.xml: Likewise.
* include/bits/basic_ios.h: Likewise.
* include/bits/regex.h: Likewise.
* include/std/istream: Likewise.
* include/std/streambuf: Likewise.

Tested x86_64-linux, committed to trunk
commit 739effe92edc43ef724f0ffef525e6ed23e62074
Author: Jonathan Wakely 
Date:   Wed Feb 20 22:11:06 2013 +

* doc/html/faq.html: Fix spelling.
* doc/xml/faq.xml: Likewise.
* include/bits/basic_ios.h: Likewise.
* include/bits/regex.h: Likewise.
* include/std/istream: Likewise.
* include/std/streambuf: Likewise.

diff --git a/libstdc++-v3/doc/html/faq.html b/libstdc++-v3/doc/html/faq.html
index 0c1d328..91952071 100644
--- a/libstdc++-v3/doc/html/faq.html
+++ b/libstdc++-v3/doc/html/faq.html
@@ -503,7 +503,7 @@
 Short answer: Pretty much everything works
 except for some corner cases.  Support for localization
 in locale may be incomplete on non-GNU
-platforms. Also dependant on the underlying platform is support
+platforms. Also dependent on the underlying platform is support
 for wchar_t and long
 long specializations, and details of thread support.
 
diff --git a/libstdc++-v3/doc/xml/faq.xml b/libstdc++-v3/doc/xml/faq.xml
index 1408bd2..4e33392 100644
--- a/libstdc++-v3/doc/xml/faq.xml
+++ b/libstdc++-v3/doc/xml/faq.xml
@@ -685,7 +685,7 @@
 Short answer: Pretty much everything works
 except for some corner cases.  Support for localization
 in locale may be incomplete on non-GNU
-platforms. Also dependant on the underlying platform is support
+platforms. Also dependent on the underlying platform is support
 for wchar_t and long
 long specializations, and details of thread support.
 
diff --git a/libstdc++-v3/include/bits/basic_ios.h 
b/libstdc++-v3/include/bits/basic_ios.h
index b78b464..5325800 100644
--- a/libstdc++-v3/include/bits/basic_ios.h
+++ b/libstdc++-v3/include/bits/basic_ios.h
@@ -69,7 +69,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   //@{
   /**
*  These are standard types.  They permit a standardized way of
-   *  referring to names of (or names dependant on) the template
+   *  referring to names of (or names dependent on) the template
*  parameters, which are specific to the implementation.
   */
   typedef _CharT char_type;
diff --git a/libstdc++-v3/include/bits/regex.h 
b/libstdc++-v3/include/bits/regex.h
index 39704be..101925a 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -135,7 +135,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 
   /**
-   * @brief Gets a sort key for a character sequence, independant of case.
+   * @brief Gets a sort key for a character sequence, independent of case.
*
* @param __first beginning of the character sequence.
* @param __last  one-past-the-end of the character sequence.
@@ -185,7 +185,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* the returned mask identifies the classification regardless of
* the case of the characters to be matched (for example,
* [[:lower:]] is the same as [[:alpha:]]), otherwise a
-   * case-dependant classification is returned.  The value
+   * case-dependent classification is returned.  The value
* returned shall be independent of the case of the characters
* in the character sequence. If the name is not recognized then
* returns a value that compares equal to 0.
diff --git a/libstdc++-v3/include/std/istream b/libstdc++-v3/include/std/istream
index ae1485f..861bca5 100644
--- a/libstdc++-v3/include/std/istream
+++ b/libstdc++-v3/include/std/istream
@@ -660,7 +660,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   bool _M_ok;
 
 public:
-  /// Easy access to dependant types.
+  /// Easy access to dependent types.
   typedef _Traits  traits_type;
   typedef basic_streambuf<_CharT, _Traits> __streambuf_type;
   typedef basic_istream<_CharT, _Traits>   __istream_type;
diff --git a/libstdc++-v3/include/std/streambuf 
b/libstdc++-v3/include/std/streambuf
index 0fb2f07..00b3dd1 100644
--- a/libstdc++-v3/include/std/streambuf
+++ b/libstdc++-v3/include/std/streambuf
@@ -123,7 +123,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   //@{
   /**
*  These are standard types.  They permit a standardized way of
-   *  referring to names of (or names dependant on) the template
+   *  referring to names of (or names dependent on) the template
*  parameters, which are specific to the implementation.
   */
   typedef _CharT   char_type;



[patch] Fix libstdc++ doxygen page for basic_streambuf

2013-02-20 Thread Jonathan Wakely
This removes an unclosed @{ group marker so that Doxygen doesn't put
half the members of basic_streambuf in the "Friends" section (see
http://stackoverflow.com/q/14988997/981959)

I also changed uses of __streambuf_type to basic_streambuf, which
makes the Doxygen page look better because it doesn't use the
non-standard name.

* include/std/streambuf (basic_streambuf): Use injected class name
instead of non-standard __streambuf_type typedef. Fix unclosed Doxygen
group.

Tested x86_64-linux, committed to trunk.  I'll also make a smaller
change to 4.6 and 4.7 to just fix the doxygen group markup.
commit 8180df2e135e7a8957069bc50cf240900c9eb1c9
Author: Jonathan Wakely 
Date:   Wed Feb 20 22:13:32 2013 +

* include/std/streambuf (basic_streambuf): Use injected class name
instead of non-standard __streambuf_type typedef. Fix unclosed Doxygen
group.

diff --git a/libstdc++-v3/include/std/streambuf 
b/libstdc++-v3/include/std/streambuf
index 00b3dd1..26a3871 100644
--- a/libstdc++-v3/include/std/streambuf
+++ b/libstdc++-v3/include/std/streambuf
@@ -145,7 +145,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   friend class ostreambuf_iterator;
 
   friend streamsize
-  __copy_streambufs_eof<>(__streambuf_type*, __streambuf_type*, bool&);
+  __copy_streambufs_eof<>(basic_streambuf*, basic_streambuf*, bool&);
 
   template
 friend typename __gnu_cxx::__enable_if<__is_char<_CharT2>::__value, 
@@ -174,20 +174,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
basic_string<_CharT2, _Traits2, _Alloc>&, _CharT2);
 
 protected:
-  //@{
-  /**
+  /*
*  This is based on _IO_FILE, just reordered to be more consistent,
*  and is intended to be the most minimal abstraction for an
*  internal buffer.
*  -  get == input == read
*  -  put == output == write
   */
-  char_type*   _M_in_beg; // Start of get area. 
-  char_type*   _M_in_cur; // Current read area. 
-  char_type*   _M_in_end; // End of get area. 
-  char_type*   _M_out_beg;// Start of put area. 
-  char_type*   _M_out_cur;// Current put area. 
-  char_type*   _M_out_end;// End of put area.
+  char_type*   _M_in_beg; ///< Start of get area.
+  char_type*   _M_in_cur; ///< Current read area.
+  char_type*   _M_in_end; ///< End of get area.
+  char_type*   _M_out_beg;///< Start of put area.
+  char_type*   _M_out_cur;///< Current put area.
+  char_type*   _M_out_end;///< End of put area.
 
   /// Current locale setting.
   locale   _M_buf_locale;  
@@ -236,7 +235,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  derived @c foo member functions, passing the arguments (if any)
*  and returning the result unchanged.
   */
-  __streambuf_type* 
+  basic_streambuf*
   pubsetbuf(char_type* __s, streamsize __n) 
   { return this->setbuf(__s, __n); }
 
@@ -800,15 +799,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 private:
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // Side effect of DR 50. 
-  basic_streambuf(const __streambuf_type& __sb)
+  basic_streambuf(const basic_streambuf& __sb)
   : _M_in_beg(__sb._M_in_beg), _M_in_cur(__sb._M_in_cur), 
   _M_in_end(__sb._M_in_end), _M_out_beg(__sb._M_out_beg), 
   _M_out_cur(__sb._M_out_cur), _M_out_end(__sb._M_out_cur),
   _M_buf_locale(__sb._M_buf_locale) 
   { }
 
-  __streambuf_type& 
-  operator=(const __streambuf_type&) { return *this; };
+  basic_streambuf&
+  operator=(const basic_streambuf&) { return *this; };
 };
 
   // Explicit specialization declarations, defined in src/streambuf.cc.


Re: PATCH: Correctly configure all big-endian ARM archs, not just arm*-*-linux-*.

2013-02-20 Thread Seth LaForge
On Fri, Feb 15, 2013 at 3:29 PM, Mike Stump  wrote:
> No.  Counter proposal, let's handle the cases that don't work.  So, you said 
> in your original email that armeb-unknown-eabi doesn't work.
>
> So, in the existing case statement for:
>
> arm*-*-eabi*)
>
> let's just add:
>
> case ${target} in
> armeb-*-eabi*)
>   tm_defines="${tm_defines} TARGET_BIG_ENDIAN_DEFAULT=1"
> esac
>
> Is this not exactly what you want?  Doesn't this solve exactly what you 
> stated was the problem?

OK, attached patch
0001-Add-TARGET_BIG_ENDIAN_DEFAULT-1-for-all-armeb-eabi-a.patch does
as you describe, and works for my particular use-case.

>> I could inline the test into all
>> of the ARM cases below, but I don't like that approach since it's what
>> caused this problem in the first place (somebody adding BE support to
>> one ARM arch without adding it to the others).
>
> And does it work on uclinux?  Does it work on rtems?  Does it work on every 
> arm that ever existed and will exist?  If the answer is no, then it is less 
> ideal than putting this in the config for eabi*).

Well, the current config is certainly broken when giving a big-endian
spec for uclinux, rtems, and every other arm that ever existed or will
exist.  It's possible there are other issues with using a big-endian
processor for uclinux, rtems, etc, but adding
TARGET_BIG_ENDIAN_DEFAULT=1 certainly gets those targets closer to
working.

> If it always works, then moving the existing on up to the existing:
>
> case ${target} in
> i[34567]86-*-*)
>
> would be the right approach.  x86 uses this location already to set 
> tm_defines="${tm_defines} USE_IX86_FRAME_POINTER=1" for example.  The 
> documentation above that group can state that this is the location for things 
> that are cpu specific and os and vendor neutral.

I like that solution better.  Following your suggested list of
big-endian architectures, attached patch
0001-Add-TARGET_BIG_ENDIAN_DEFAULT-1-for-all-arm-big-endi.patch adds
TARGET_BIG_ENDIAN_DEFAULT=1 for $target matching "armeb-* | armbe-* |
armv[3-8]b-*".  I left out xscaleeb, since config.sub canonicalizes
xscaleeb-* to armeb-*.
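
To show which target triplets the three patterns above select, here is a small sketch.  config.gcc itself is a shell script using `case'; this example only mimics that glob matching with POSIX fnmatch(), and the function name is invented for illustration.

```cpp
#include <fnmatch.h>

// Returns true if `target' matches one of the big-endian ARM patterns
// quoted above.  Flags of 0 approximate shell `case' matching.
inline bool sketch_is_big_endian_arm (const char *target)
{
  static const char *const patterns[] = {
    "armeb-*", "armbe-*", "armv[3-8]b-*"
  };
  for (const char *pattern : patterns)
    if (fnmatch (pattern, target, 0) == 0)
      return true;
  return false;
}
```

Under these patterns armeb-unknown-eabi and armv7b-unknown-linux-gnueabi match, while arm-unknown-eabi does not.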

Please, pick whichever patch pleases you most.  I prefer
0001-Add-TARGET_BIG_ENDIAN_DEFAULT-1-for-all-arm-big-endi.patch.

Changelog entry for
0001-Add-TARGET_BIG_ENDIAN_DEFAULT-1-for-all-arm-big-endi.patch:

gcc/
* config.gcc: Add TARGET_BIG_ENDIAN_DEFAULT=1 for all arm big-endian 
archs.

Changelog entry for
0001-Add-TARGET_BIG_ENDIAN_DEFAULT-1-for-all-armeb-eabi-a.patch:

gcc/
* config.gcc: Add TARGET_BIG_ENDIAN_DEFAULT=1 for all armeb-*-eabi* 
archs.

Seth


0001-Add-TARGET_BIG_ENDIAN_DEFAULT-1-for-all-arm-big-endi.patch
Description: Binary data


0001-Add-TARGET_BIG_ENDIAN_DEFAULT-1-for-all-armeb-eabi-a.patch
Description: Binary data


Re: PATCH: Correctly configure all big-endian ARM archs, not just arm*-*-linux-*.

2013-02-20 Thread Seth LaForge
On Sat, Feb 16, 2013 at 7:45 AM, Bernhard Reutner-Fischer
 wrote:
> Sounds like a DUP of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16350
> Is the missing hunk in by now (cannot look myself right now)?

The commit you describe appears to be in trunk but not in gcc-4_7-branch.

I don't think the problem described in that bug is a duplicate of
my problem - for gcc-4.8-20130210 (which includes that change) I have
to add my TARGET_BIG_ENDIAN_DEFAULT=1 change to make building for my
processor work, and gcc-4.7.2 also works for my processor using my
patch.

Seth


Re: Patch for 4.7: Avoid subreg'ing VFP D registers in big-endian mode

2013-02-20 Thread Seth LaForge
On Tue, Feb 19, 2013 at 2:59 PM, Ramana Radhakrishnan
 wrote:
> This is not the correct form of a changelog entry.

Sorry - another attempt below.

> Ok to backport provided no regressions  when running the testsuite in
> big endian mode.

Sorry to say I'm not sure how to run the testsuite, given that I'm
cross-compiling for a wacky TI automotive microcontroller.  I'm out of
town for the next few days, but I can try to run it under qemu next
week.

> Watch out for alignment of the \'s though it might be your mailer
> munging things rather than anything else. Can you please make sure
> that the \'s are aligned to the same column please ? This is why I
> prefer to get git to generate out patches using git-format-patch and
> attach them as text files to all my mails otherwise it seems to mess
> things up like this.

They're aligned for me, and I successfully applied my patch by copying
it out of my mail, so I blame your mail client.  :)  In any case, git
patch attached.

Seth

Hopefully correct changelog entry:

2013-02-20 Seth LaForge 

Backport
2012-10-22  Julian Brown  
* config/arm/arm.h (CANNOT_CHANGE_MODE_CLASS): Avoid subreg'ing
VFP D registers in big-endian mode.


0001-Avoid-subreg-ing-VFP-D-registers-in-big-endian-mode.patch
Description: Binary data


[PATCH] MIPS: MIPS32r2 FP MADD instruction set support

2013-02-20 Thread Maciej W. Rozycki
Hi,

 This issue was originally raised here:

http://gcc.gnu.org/ml/gcc-patches/2012-12/msg00863.html

 We have a shortcoming in GCC in that we only allow the use of half of the FP 
MADD instruction subset (MADD.fmt and MSUB.fmt) in the 64-bit/32-register 
mode (CP0.Status.FR == 1) on MIPS32r2 processors.  Furthermore we never 
enable the other half (NMADD.fmt and NMSUB.fmt) on those processors.  
However this whole instruction subset is always available on MIPS32r2 FPUs 
regardless of the mode selected, just as it always has been on FPUs of the 
64-bit ISA line from MIPS IV up.

 The paired-single format however is indeed only available in the 
64-bit/32-register mode as from the MIPS V ISA up.  We do explicitly allow 
it for some (or no) reason for NMADD.PS and NMSUB.PS on MIPS32r2 
processors in the 32-bit/16-register FPU mode (this is probably globally 
overridden elsewhere).

 I'm not sure where this GCC limitation came from, but there were typos in 
the formats listed for the MSUB.S, MSUB.D, NMADD.S, NMADD.D, NMSUB.S and 
NMSUB.D instructions up to and including rev. 2.50 of vol. II of the 
MIPS32r2 architecture documentation set (MIPS doc #MD00086).  This may or 
may not have contributed to this problem as these instructions were listed 
as available from "MIPS64" up rather than from "MIPS64, MIPS32 Release 2" 
up, so no mention of the FPU mode there.

 The change below lifts the relevant restrictions removing a lot of 
clutter that's not needed anymore now that the data mode does not have to 
be checked.

 Also, according to MIPS IV ISA documentation these operations are only 
fused (i.e. don't match original IEEE 754-1985 accuracy requirements) on 
the original MIPS IV R8000 CPU, and MIPS architecture specs don't mention 
any limitations of these instructions either, so I have updated the GCC 
manual to document that on non-R8000 CPUs (which are ones we really care 
about) they are numerically equivalent to computations made with 
corresponding individual operations.
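
 As a minimal illustration of why fusing matters numerically (a sketch of the
general effect only, not MIPS code and not part of the patch): a fused
multiply-add rounds once, whereas a separate multiply and add round twice.
The C below emulates the single rounding for float operands by carrying the
exact product in double.  The function names are made up for the example.

```c
/* Sketch: two roundings (separate multiply and add) versus one
   rounding (fused).  All names here are illustrative only.  */

/* Separate operations: the product is rounded to float before the add.
   The volatile store forces that rounding even where the compiler
   would otherwise keep extra precision or contract the expression.  */
float
madd_unfused (float a, float b, float c)
{
  volatile float prod = a * b;
  return prod + c;
}

/* Emulated fused operation: the product of two floats is exact in
   double, so the float result is rounded after the addition.  The
   double addition can itself round for other inputs, so this is only
   an approximation of a true fused operation in general.  */
float
madd_fused_emulated (float a, float b, float c)
{
  return (float) ((double) a * (double) b + (double) c);
}
```

 With a = 1 + 2^-13, b = 1 - 2^-13 and c = -1 the exact product is
1 - 2^-26, which rounds to 1.0f in float, so the unfused version returns 0
while the emulated fused one returns -2^-26.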

 Finally, while at it, I found it interesting that we have separate 
conditions to cover MADD.fmt/MSUB.fmt (ISA_HAS_FP_MADD4_MSUB4) and 
NMADD.fmt/NMSUB.fmt (ISA_HAS_NMADD4_NMSUB4) while all the four 
instructions need to be implemented as a whole group per data format 
supported and cannot be separated (the MIPS architecture specification 
explicitly forbids subsetting).  The difference between the two conditions 
is that the former expands to ISA_HAS_FP4, that is, it enables the 
subsubset for any MIPS IV and up FPU, while the latter has an extra 
"&& (!TARGET_MIPS5400 || TARGET_MAD)" qualifier.

 I went ahead and checked available NEC VR54xx documentation and here's 
what I came up with:

1. "VR5400 MIPS RISC Microprocessor Family" datasheet (NEC doc #13362) 
   says:

   "The VR5400 processor family complies with the MIPS IV instruction set 
   and IEEE-754 floating-point and IEEE-1149.1/1149.1a JTAG specification, 
   [...]"

2. "VR5432 MIPS RISC Microprocessor User's Manual, Volume 2" (NEC doc 
   #13751) lists all the individual MADD.fmt, MSUB.fmt, NMADD.fmt and
   NMSUB.fmt instructions in Chapter 18 "Floating-Point Unit Instruction 
   Set" with no restrictions as to their availability (the only other 
   member of the VR54xx family I know of is the VR5464 that is a 
   high-performance version of the VR5432 and is fully software 
   compatible).

 Further to that TARGET_MAD controls whether to "Use PMC-style 'mad' 
instructions" that are all CPU rather than FPU instructions.  The VR5432 
indeed supports extra integer multiply-accumulate instructions, as 
documented in #2 above; these are the MACC/MACCHI/MACCHIU/MACCU and 
MSAC/MSACHI/MSACHIU/MSACU instructions as roughly covered by our 
ISA_HAS_MACC, ISA_HAS_MSAC and ISA_HAS_MACCHI knobs (the latter is not 
implied for TARGET_MIPS5400, perhaps because the family does not support 
the doubleword variants).

 All in all it looks to me like a misplaced hunk.  It was introduced in 
rev. 56471 (you were named as one of the contributors on that commit, so 
you may be able to remember and/or correct me if I am wrong here anywhere) 
and it looks to me like it should have been applied to the ISA_HAS_MADD_MSUB 
macro instead, which is still just a few lines above ISA_HAS_NMADD4_NMSUB4 
(and was even closer to ISA_HAS_NMADD_NMSUB as the latter was then called; 
the bodies were close enough back then for a hunk to apply cleanly to 
either).

 These days we handle ISA_HAS_MADD_MSUB indirectly through 
GENERATE_MADD_MSUB and in many more places than back at rev. 56471.  We 
also handle TARGET_MAD and ISA_HAS_MACC/ISA_HAS_MSAC/ISA_HAS_MACCHI 
explicitly throughout mips.md, so I think we should simply discard this 
incorrect condition, and then, as ISA_HAS_FP_MADD4_MSUB4 and 
ISA_HAS_NMADD4_NMSUB4 will have become identical, fold the two macros into 
one, perhaps ISA_HAS_FP_MADD4.  And likewise ISA_HAS_FP_MADD3.  Thoughts?
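
 A rough sketch of what the folding amounts to, modelled on the description
above rather than copied from mips.h.  The lower-case flag variables and all
function names are stand-ins invented so that the fragment compiles on its
own; the real macros take their inputs from the target option machinery.

```c
/* Stand-ins for the target flags (hypothetical plain variables).  */
static int isa_has_fp4;
static int target_mips5400;
static int target_mad;

/* The two existing conditions, as described in the text.  */
#define OLD_FP_MADD4_MSUB4  (isa_has_fp4)
#define OLD_NMADD4_NMSUB4   (isa_has_fp4 && (!target_mips5400 || target_mad))

/* The proposed single condition covering all four instructions.  */
#define NEW_FP_MADD4        (isa_has_fp4)

/* Select a configuration to evaluate the conditions under.  */
void
set_target (int fp4, int mips5400, int mad)
{
  isa_has_fp4 = fp4;
  target_mips5400 = mips5400;
  target_mad = mad;
}

int old_madd4_msub4 (void)  { return OLD_FP_MADD4_MSUB4; }
int old_nmadd4_nmsub4 (void) { return OLD_NMADD4_NMSUB4; }
int new_fp_madd4 (void)     { return NEW_FP_MADD4; }
```

 For a VR54xx-like configuration without the integer 'mad' option, that is
set_target (1, 1, 0), the old conditions disagree (MADD/MSUB allowed,
NMADD/NMSUB not), while the folded condition allows the whole group.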

 Back to the change considered here, it was successfully regression-tested 
with the gc

[patch] df-scan: split df_insn_delete for clearer dumps and better speed

2013-02-20 Thread Steven Bosscher
Hello,

The attached patch splits a new function df_insn_info_delete from
df_insn_delete. The original motivation was to get rid of the silly
"deleting insn with uid = ..." messages when re-scanning an insn,
because the mentioned insn isn't deleted at all (it's just rescanned).
But it turns out that there is also a modest but measurable speed-up
(especially at -O0), probably because of avoiding the overhead of
df_grow_bb_info and df_grow_reg_info in common usage of
df_insn_delete.
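
The shape of the split can be sketched with a toy model.  This is not the
df-scan code: the table, the struct and every toy_* name are invented for
illustration, and the counter merely stands in for the dump message.

```c
#include <stdlib.h>

#define MAX_UID 16

/* Hypothetical per-insn scanning record.  */
struct insn_info
{
  int uid;
  int num_refs;
};

static struct insn_info *insn_table[MAX_UID];
static int deletion_messages;	/* Counts "deleting insn" reports.  */

/* Drop the scanning record for UID without reporting anything.  */
void
toy_insn_info_delete (int uid)
{
  free (insn_table[uid]);
  insn_table[uid] = NULL;
}

/* Really delete an insn: the only place that reports a deletion.  */
void
toy_insn_delete (int uid)
{
  deletion_messages++;
  toy_insn_info_delete (uid);
}

/* Rescan: discard the old record and build a new one.  No deletion
   is reported because the insn itself survives.  */
void
toy_insn_rescan (int uid, int num_refs)
{
  struct insn_info *info;

  toy_insn_info_delete (uid);
  info = malloc (sizeof *info);
  if (info == NULL)
    abort ();
  info->uid = uid;
  info->num_refs = num_refs;
  insn_table[uid] = info;
}

int toy_deletion_messages (void) { return deletion_messages; }
int toy_insn_present (int uid) { return insn_table[uid] != NULL; }
```

Rescanning uid 3 twice leaves the message count at zero; only a call to
toy_insn_delete bumps it.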

Bootstrapped&tested on powerpc64-unknown-linux-gnu and on
x86_64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven


df_delete_insn_info.diff
Description: Binary data


Re: [Patch, libfortran] PR 30162 pipe I/O regression with 4.7/4.8

2013-02-20 Thread Jerry DeLisle

On 02/19/2013 02:40 PM, Janne Blomqvist wrote:

Hi,

attached is an attempt to fix writing formatted sequential I/O to a
pipe (The PR was reopened in comment #22, which refers to formatted
I/O so the PR title is incorrect). I think the underlying reason was
that the introduction of the ssize() member function led to a change
in semantics (size of a non-seekable fd is now 0 rather than -1), and
thus in io/open.c test_endfile() we incorrectly concluded that the
file was not positioned at the end, and thus we try to truncate after
writing, leading to the failure.
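
To illustrate the underlying distinction (this is a sketch, not what the
patch does): a reported size of 0 cannot tell an empty regular file from a
pipe, whereas asking lseek directly can, because POSIX specifies ESPIPE for
pipes, FIFOs and sockets.  The helper name fd_is_seekable is made up.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if FD supports seeking, 0 if it is a pipe, FIFO or socket
   (lseek fails with ESPIPE), and -1 on any other error.  */
int
fd_is_seekable (int fd)
{
  if (lseek (fd, 0, SEEK_CUR) != (off_t) -1)
    return 1;
  return errno == ESPIPE ? 0 : -1;
}
```

Both ends of a pipe created with pipe() report 0 here, which is the case
where truncating after a write cannot work.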

At the same time, the patch reverts the previous fix: unformatted
sequential I/O requires seeking in order to update the record markers, so
I think that issue can be closed as WONTFIX.

Regtested on x86_64-unknown-linux-gnu, Ok for trunk/4.7?


Patch is OK.

Jerry


closing PR's (was Re: [PATCH ARM iWMMXt 0/5] Improve iWMMXt support)

2013-02-20 Thread Hans-Peter Nilsson
On Mon, 28 Jan 2013, nick clifton wrote:
> > Also, could you close its duplicates, bugs 36798 and 36966?
>
> Sorry no.  I do not actually own these PRs, so I cannot close them. :-(

Sorry if I misinterpret, but it seems a reminder is in order:
magic powers are attached to whome...@gcc.gnu.org accounts in
bugzilla, so when people use them instead of their
whome...@employer.example.com, they are able to close PR's they
haven't created.

brgds, H-P


[PATCH] Don't expand *MEM_REF using extract_bit_field or movmisalign for EXPAND_MEMORY (PR inline-asm/56405)

2013-02-20 Thread Jakub Jelinek
Hi!

If an input operand of inline-asm doesn't allow registers, but allows
memory, we expand it with EXPAND_MEMORY modifier (the only case of using
that modifier).  But the movmisalign code added for 4.6 and especially
the extract_bit_field code added for 4.8 results in getting a REG from
expand_expr called with a *MEM_REF with EXPAND_MEMORY, instead of MEM the
caller expects and can only handle, therefore we ICE.

This patch fixes it by honoring EXPAND_MEMORY in these cases; it is then the
sole responsibility of the inline asm writer to do the right thing in there
for misaligned mems (e.g. gcc could pessimistically expect it might be
misaligned, but the user knows it will not be).
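
A small sketch of the kind of operand involved, using GNU C extensions.  The
typedef and function name are invented for the example, and whether a given
compiler version takes exactly the internal path described above is an
assumption, not something the sketch demonstrates.

```c
/* A 16-bit type whose declared alignment is reduced to 1 byte, so the
   compiler must assume objects of this type may be misaligned.  */
typedef unsigned short unaligned_u16 __attribute__ ((aligned (1)));

/* "+m" allows only memory for the operand, never a register.  The
   empty template does nothing at run time; it only makes the compiler
   expand *P as a memory operand.  */
void
touch_u16 (unaligned_u16 *p)
{
  __asm__ volatile ("" : "+m" (*p));
}
```

Calling touch_u16 on an odd address leaves the bytes there unchanged; the
point is only that the operand has to stay a memory reference.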

Bootstrapped/regtested on {x86_64,i686,armv7hl,ppc,ppc64,s390,s390x}-linux,
ok for trunk?

2013-02-21  Jakub Jelinek  

PR inline-asm/56405
* expr.c (expand_expr_real_1) : Don't
use movmisalign or extract_bit_field for EXPAND_MEMORY modifier.

* gcc.c-torture/compile/pr56405.c: New test.

--- gcc/expr.c.jj   2013-01-18 18:09:40.0 +0100
+++ gcc/expr.c  2013-02-20 10:29:34.513143634 +0100
@@ -9551,6 +9551,7 @@ expand_expr_real_1 (tree exp, rtx target
set_mem_addr_space (temp, as);
align = get_object_alignment (exp);
if (modifier != EXPAND_WRITE
+   && modifier != EXPAND_MEMORY
&& mode != BLKmode
&& align < GET_MODE_ALIGNMENT (mode)
/* If the target does not have special handling for unaligned
@@ -9639,6 +9640,7 @@ expand_expr_real_1 (tree exp, rtx target
if (TREE_THIS_VOLATILE (exp))
  MEM_VOLATILE_P (temp) = 1;
if (modifier != EXPAND_WRITE
+   && modifier != EXPAND_MEMORY
&& mode != BLKmode
&& align < GET_MODE_ALIGNMENT (mode))
  {
--- gcc/testsuite/gcc.c-torture/compile/pr56405.c.jj	2013-02-20 10:32:17.807250979 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr56405.c	2013-02-20 10:32:46.963090873 +0100
@@ -0,0 +1,7 @@
+/* PR inline-asm/56405 */
+
+void
+foo (void)
+{
+  asm volatile ("" : "+m" (*(volatile unsigned short *) 0x1001UL));
+}

Jakub


[PATCH] Small *.texi{,info} fixes for texinfo 5.0 (PR bootstrap/56258)

2013-02-20 Thread Jakub Jelinek
Hi!

Currently it is not possible to bootstrap gcc with texinfo 5.0.
This patch attempts to fix the errors that prevent bootstrap.  There are tons
of warnings it doesn't address, and it would be good if somebody more
knowledgeable about Texinfo looked at those.

Bootstrapped/regtested on x86_64-linux and i686-linux both with texinfo 5.0
and older texinfo installed.

The errors were:
../../gcc/doc/invoke.texi:5615: @itemx must follow @item
../../gcc/ada/gnat-style.texi:45: unknown command `hfill'
doc/projects.texi:51: @pxref reference to nonexistent node `Scenarios
  in Projects'
doc/projects.texi:363: @pxref reference to nonexistent node `Organizing
  Projects into Subsystems'
doc/projects.texi:391: @xref reference to nonexistent node `Project
  Extension'
../../../../../libjava/classpath/doc/cp-tools.texinfo:2030: `@end' expected 
`smallexample', but saw `table'
../../../../../libjava/classpath/doc/cp-tools.texinfo:2030: `@end' expected 
`smallexample', but saw `table'
../../../../../libjava/classpath/doc/cp-tools.texinfo:2030: `@end' expected 
`smallexample', but saw `table'

Examples of warnings (many of them occurring up to several dozen times):
warning: @anchor should not appear in @heading
warning: @title missing argument
warning: @itemize has text but no @item
warning: node `XXX' is next for `YYY' in menu but not in sectioning
warning: node `XXX' is up for `YYY' in menu but not in sectioning
warning: @itemize has text but no @item
warning: node next `XXX' in menu `YYY' and in sectioning `ZZZ' differ
warning: @tex should only appear at a line beginning
warning: command @tie does not accept arguments

Ok for trunk?

2013-02-21  Jakub Jelinek  

PR bootstrap/56258
* doc/invoke.texi (-fdump-rtl-pro_and_epilogue): Use @item
instead of @itemx.

* gnat-style.texi (@title): Remove @hfill.
* projects.texi: Avoid line wrapping inside of @pxref or
@xref.

* doc/cp-tools.texinfo (Virtual Machine Options): Use just
one @gccoptlist instead of 3 separate ones.

--- gcc/doc/invoke.texi.jj  2013-01-31 22:57:22.0 +0100
+++ gcc/doc/invoke.texi 2013-02-20 13:06:47.516405739 +0100
@@ -5612,7 +5612,7 @@ Dump after the peephole pass.
 @opindex fdump-rtl-postreload
 Dump after post-reload optimizations.
 
-@itemx -fdump-rtl-pro_and_epilogue
+@item -fdump-rtl-pro_and_epilogue
 @opindex fdump-rtl-pro_and_epilogue
 Dump after generating the function prologues and epilogues.
 
--- gcc/ada/gnat-style.texi.jj  2012-08-10 12:57:33.0 +0200
+++ gcc/ada/gnat-style.texi 2013-02-20 13:06:03.042667300 +0100
@@ -42,7 +42,7 @@ Texts.  A copy of the license is include
 @titlepage
 @titlefont{GNAT Coding Style:}
 @sp 1
-@title @hfill A Guide for GNAT Developers
+@title A Guide for GNAT Developers
 @subtitle GNAT, The GNU Ada Compiler
 @versionsubtitle
 @author Ada Core Technologies, Inc.
--- gcc/ada/projects.texi.jj2013-01-04 11:16:24.0 +0100
+++ gcc/ada/projects.texi   2013-02-20 17:48:41.582645159 +0100
@@ -48,8 +48,7 @@ project files allow you to specify:
 @item Source file naming conventions; you can specify these either globally or for
   individual compilation units (@pxref{Naming Schemes}).
 @item Change any of the above settings depending on external values, thus enabling
-  the reuse of the projects in various @b{scenarios} (@pxref{Scenarios
-  in Projects}).
+  the reuse of the projects in various @b{scenarios} (@pxref{Scenarios in Projects}).
 @item Automatically build libraries as part of the build process
   (@pxref{Library Projects}).
 
@@ -360,8 +359,8 @@ locating the specified source files in t
 
 @item For various reasons, it is sometimes useful to have a project with no
   sources (most of the time because the attributes defined in the project
-  file will be reused in other projects, as explained in @pxref{Organizing
-  Projects into Subsystems}. To do this, the attribute
+  file will be reused in other projects, as explained in
+  @pxref{Organizing Projects into Subsystems}. To do this, the attribute
   @emph{Source_Files} is set to the empty list, i.e. @code{()}. Alternatively,
   @emph{Source_Dirs} can be set to the empty list, with the same
   result.
@@ -388,8 +387,9 @@ locating the specified source files in t
   This can be done thanks to the attribute @b{Excluded_Source_Files}
   (or its synonym @b{Locally_Removed_Files}).
   Its value is the list of file names that should not be taken into account.
-  This attribute is often used when extending a project, @xref{Project
-  Extension}. A similar attribute @b{Excluded_Source_List_File} plays the same
+  This attribute is often used when extending a project,
+  @xref{Project Extension}. A similar attribute
+  @b{Excluded_Source_List_File} plays the same
   role but takes the name of file containing file names similarly to
   @code{Source_List_File}.
 
--- libjava/classpath/doc/cp-tools.texinfo.jj   2012-12-20 11:38:51.0 
+0100
+++ libjava/classpath/doc/cp-to