[patch] fix cross building a native compiler

2013-06-12 Thread Matthias Klose
Cross building a native compiler for arm-linux on x86_64 linux currently
fails if the libgmp development files are not available on the build
system. This works with 4.7, but fails with 4.8.

The build fails with:

TARGET_CPU_DEFAULT="" \
HEADERS="auto-build.h ansidecl.h" DEFINES="" \
/bin/bash ../../src/gcc/mkconfig.sh bconfig.h
g++ -c   -g -O2 -DIN_GCC   -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
-Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings   -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild
-I../../src/gcc -I../../src/gcc/build -I../../src/gcc/../include
-I../../src/gcc/../libcpp/include  -I../../src/gcc/../libdecnumber
-I../../src/gcc/../libdecnumber/dpd -I../libdecnumber
-I../../src/gcc/../libbacktrace -DCLOOG_INT_GMP\
-o build/genconstants.o ../../src/gcc/genconstants.c
In file included from /usr/include/x86_64-linux-gnu/sys/resource.h:25:0,
 from ../../src/gcc/system.h:395,
 from ../../src/gcc/genconstants.c:28:
/usr/include/x86_64-linux-gnu/bits/resource.h:131:18: error: declaration does
not declare anything [-fpermissive]
In file included from ../../src/gcc/genconstants.c:28:0:
../../src/gcc/system.h:444:23: error: declaration of C function 'void*
sbrk(int)' conflicts with


According to the bug archive and the mailing list, this is supposed to work;
see PR 35051, as described in
http://gcc.gnu.org/ml/gcc-patches/2008-02/msg00041.html

When auto-build.h is generated, every configure test that includes the system.h
header fails to build with

configure:10361: gcc -c   -I../../../src/gcc -I../../../src/gcc/../include
conftest.c >&5
In file included from ../../../src/gcc/system.h:641:0,
 from conftest.c:116:
/usr/include/gmp.h:69:24: fatal error: gmp-x86_64.h: No such file or directory
compilation terminated.

And at least all the HAVE_DECL_* defines are set to 0 instead of 1.

[side note: can we keep the temporary directory for this? it's a bit
 strange that this directory is removed during the build, and can't
 be looked at. The build continues after the configure, and then
 fails later]

I checked that defining GENERATOR_FILE for the configure step that builds
auto-build.h lets me cross build the native compiler, and the resulting build
seems to work on the target platform.  However, I can't find any other
documentation for GENERATOR_FILE, so I'm unsure if that's the right thing to do.
Richard introduced the conditional include of gmp.h based on GENERATOR_FILE
back in 2008, and in 2012 Steven defined it for some object files built for the
host, but not for the build:

2012-07-08  Steven Bosscher  

* Makefile.in (gengtype-lex.o, gengtype-parse.o, gengtype-state.o,
gengtype.o): Add -DGENERATOR_FILE manually for host gengtype objects.


Is just setting -DGENERATOR_FILE for the auto-build.h build the right thing to
do? If yes, ok for the 4.8 branch and the trunk? I haven't yet figured out why
this works with 4.7.
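
The guard being discussed can be pictured with a minimal sketch. It is not the code from gcc/system.h: the macro USES_HOST_ONLY_HEADERS and the function build_kind are hypothetical names used only for illustration, and the real header includes gmp.h where this sketch merely sets a flag.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the guard in gcc/system.h: headers needed
   only by the compiler proper are skipped when GENERATOR_FILE is set.  */
#ifdef GENERATOR_FILE
# define USES_HOST_ONLY_HEADERS 0   /* build-machine tool: gmp.h skipped */
#else
# define USES_HOST_ONLY_HEADERS 1   /* compiler proper: gmp.h would be included */
#endif

/* Report which configuration this translation unit was compiled in.  */
static const char *
build_kind (void)
{
  const char *kind = USES_HOST_ONLY_HEADERS ? "host" : "generator";
  assert (strlen (kind) > 0);
  return kind;
}
```

Compiled with -DGENERATOR_FILE, build_kind would report "generator", which is the configuration the proposed configure.ac change selects for the auto-build.h tests.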

  Matthias

--- gcc/configure.ac~   2013-06-11 18:34:36.757067080 +0200
+++ gcc/configure.ac    2013-06-11 22:51:47.340892778 +0200
@@ -1519,7 +1519,7 @@
*) realsrcdir=../${srcdir};;
esac
saved_CFLAGS="${CFLAGS}"
-   CC="${CC_FOR_BUILD}" CFLAGS="${CFLAGS_FOR_BUILD}" \
+   CC="${CC_FOR_BUILD}" CFLAGS="${CFLAGS_FOR_BUILD} -DGENERATOR_FILE" \
LDFLAGS="${LDFLAGS_FOR_BUILD}" \
${realsrcdir}/configure \
--enable-languages=${enable_languages-all} \


Re: More forwprop for vectors

2013-06-12 Thread Richard Biener
On Tue, Jun 11, 2013 at 9:44 PM, Marc Glisse  wrote:
> On Tue, 11 Jun 2013, Jeff Law wrote:
>
>> On 06/09/13 13:43, Marc Glisse wrote:
>>>
>>> Hello,
>>>
>>> just adapting yet another function so it also works with vectors.
>>>
>>> It seemed convenient to add a new macro. The name sucks (it doesn't
>>> match the semantics of INTEGRAL_TYPE_P), but I didn't want to name it
>>> INTEGER_SCALAR_OR_VECTOR_CONSTANT_P and didn't have any good idea for a
>>> short name.
>>
>> I'd just use a long name.  I can easily see someone not being aware
>> that INTEGRAL_CST_P returns true for vectors and as a result doing
>> something inappropriate.
>>
>> INTEGER_CST_OR_VECTOR_INTEGER_TYPE_P?
>
>
> Having TYPE in there seems confusing, and
> INTEGER_SCALAR_OR_VECTOR_CONSTANT_P is at least one character shorter ;-)
> Oh, you probably meant INTEGER_CST_OR_VECTOR_INTEGER_CST_P?
>
> Compacting could give INT_OR_VECINT_CST_P (or INTVEC instead of VECINT, I
> don't know which order sounds best).
>
> I don't really mind the name, so if you want
> INTEGER_CST_OR_VECTOR_INTEGER_CST_P that's ok with me.

How about just adding VECTOR_INTEGER_CST_P and using
TREE_CODE (x) == INTEGER_CST || VECTOR_INTEGER_CST_P (x)
in the code?
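
A rough sketch of that suggestion on a toy model, not on GCC's real tree type: the enum cst_kind, struct toy_cst and the TOY_ macro are hypothetical stand-ins, and the element-kind field only approximates how a vector constant's element type would be checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy constant kinds; these only mimic GCC's tree codes.  */
enum cst_kind
{
  KIND_INTEGER_CST,
  KIND_REAL_CST,
  KIND_COMPLEX_CST,
  KIND_VECTOR_CST
};

struct toy_cst
{
  enum cst_kind kind;
  enum cst_kind elt_kind;   /* element kind, meaningful for vectors only */
};

/* Hypothetical analogue of the proposed VECTOR_INTEGER_CST_P.  */
#define TOY_VECTOR_INTEGER_CST_P(X) \
  ((X)->kind == KIND_VECTOR_CST && (X)->elt_kind == KIND_INTEGER_CST)

/* The explicit two-part test written out at the use site.  */
static bool
toy_integer_or_vector_integer_cst_p (const struct toy_cst *x)
{
  assert (x != 0);
  return x->kind == KIND_INTEGER_CST || TOY_VECTOR_INTEGER_CST_P (x);
}
```

Written this way, a complex integer constant is rejected because it matches neither half of the test.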

I suppose it's explicitly not allowing complex integer constants?

Richard.

> Thanks for the comments on the 2 patches,
>
> --
> Marc Glisse


Re: Remove self-assignments

2013-06-12 Thread Richard Biener
On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse  wrote:
> On Tue, 11 Jun 2013, Jeff Law wrote:
>
>> On 06/09/13 10:25, Marc Glisse wrote:
>>>
>>> Hello,
>>>
>>> this patch removes some self-assignments. I don't know if this is the
>>> best way, but it passes a bootstrap and the testsuite on
>>> x86_64-linux-gnu.
>>>
>>> 2013-06-10  Marc Glisse  
>>>
>>>  PR tree-optimization/57361
>>> gcc/
>>>  * tree-ssa-dse.c (dse_possible_dead_store_p): Handle
>>> self-assignment.
>>>
>>> gcc/testsuite/
>>>  * gcc.dg/tree-ssa/pr57361.c: New file.
>>
>> So dse_optimize_stmt will verify the statement does not have volatile
>> operands.
>
>
> operand_equal_p also does it, so we are well covered there ;-)
>
>
>> However, it doesn't verify the statement does not potentially throw (think
>> about a segfault on the store when async exceptions are enabled).  I think
>> you need to test for that explicitly.
>
>
> Hmm, I am not at all familiar with that. Google drowns me in C# and
> javascript links, and grepping through the sources only pointed me to the
> -fasynchronous-unwind-tables flag.
>
> Richard noticed in the PR that expand_assignment already does:
>
>   /* Optimize away no-op moves without side-effects.  */
>   if (operand_equal_p (to, from, 0))
> return;
>
> so it looks like the operand_equal_p test should be sufficient (or the
> compiler already breaks that code).
>
>
>> Is there some reason this won't work in tree-ssa-dce.c?  That gets run
>> more often and these stores may be preventing code motion opportunities, so
>> getting them out of the IL stream as early as possible would be good.
>
>
> In the first version of the patch:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6
> I was doing it in fold_stmt_1, which should be called often enough. Richard
> suggested DSE in the PR, but that might be because he had a more subtle test
> in mind. I am certainly ok with moving it to DCE or anywhere else...

DSE looks like the right place to me as we are removing a store.  Yes,
DCE removes a limited set of stores as well, but the way we remove this kind
of store makes it much more suited to DSE.

As for possibly trapping/throwing stores, we do not bother to preserve those
(even with -fnon-call-exceptions).

Thus, the patch is ok with me if you agree with that and with the following
adjustment:

+  /* Self-assignments are zombies.  */
+  if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt))
+  && operand_equal_p (gimple_assign_rhs1 (stmt),
+ gimple_assign_lhs (stmt), 0))

I see no need to compare the codes of the LHS and the RHS, that's redundant
with what operand_equal_p does.
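
Why the extra code comparison is redundant can be seen on a toy model. This is only a sketch: struct toy_expr and the toy_ functions are hypothetical, and GCC's operand_equal_p additionally deals with volatile operands, side effects and much more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node; GCC's real 'tree' is far richer.  */
enum toy_code { TOY_VAR, TOY_DEREF, TOY_NEGATE };

struct toy_expr
{
  enum toy_code code;
  int var_id;                   /* used by TOY_VAR only */
  const struct toy_expr *op;    /* operand of TOY_DEREF / TOY_NEGATE */
};

/* Structural equality.  The codes are compared first, so a caller
   never needs to compare them separately.  */
static bool
toy_equal_p (const struct toy_expr *a, const struct toy_expr *b)
{
  if (a == b)
    return true;
  if (a == NULL || b == NULL || a->code != b->code)
    return false;
  if (a->code == TOY_VAR)
    return a->var_id == b->var_id;
  return toy_equal_p (a->op, b->op);
}

/* A store 'lhs = rhs' is a self-assignment when both sides are
   structurally equal.  */
static bool
toy_self_assignment_p (const struct toy_expr *lhs, const struct toy_expr *rhs)
{
  assert (lhs != NULL && rhs != NULL);
  return toy_equal_p (lhs, rhs);
}
```

In this model *a = *a is detected, while *a = *b and *a = -*a are rejected by the equality test alone.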

Thus, ok with removing that test if it bootstraps / regtests ok and if Jeff
has no further comments.

Thanks,
Richard.

>> I'd be curious how often this triggers in GCC itself as well.
>
>
> Do you know a convenient way to test that?
>
> --
> Marc Glisse


Tidy up expand_expr_real_1

2013-06-12 Thread Eric Botcazou
The main point is to simplify the MEM_REF case, whose first part is a bit 
convoluted, and replace TREE_TYPE (exp) with 'type' in some other places.
No functional changes whatsoever (and no conflicts with Alan's patch).

Bootstrapped/regtested on x86_64-suse-linux, applied on mainline as obvious.


2013-06-12  Eric Botcazou  

* expr.c (expand_expr_real_1) : Use straight-line flow.
: Use 'type' instead of TREE_TYPE (exp) and tidy up the first
part.  Use straight-line flow at the end.
: Remove superfluous else.
: Use 'type' instead of TREE_TYPE (exp).



-- 
Eric Botcazou

Index: expr.c
===
--- expr.c	(revision 199982)
+++ expr.c	(working copy)
@@ -9602,7 +9602,7 @@ expand_expr_real_1 (tree exp, rtx target
 	create_output_operand (&ops[0], NULL_RTX, mode);
 	create_fixed_operand (&ops[1], temp);
 	expand_insn (icode, 2, ops);
-	return ops[0].value;
+	temp = ops[0].value;
 	  }
 	return temp;
   }
@@ -9621,34 +9621,25 @@ expand_expr_real_1 (tree exp, rtx target
 	if (mem_ref_refers_to_non_mem_p (exp))
 	  {
 	HOST_WIDE_INT offset = mem_ref_offset (exp).low;
-	tree bit_offset;
-	tree bftype;
 	base = TREE_OPERAND (base, 0);
 	if (offset == 0
-		&& host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
+		&& host_integerp (TYPE_SIZE (type), 1)
 		&& (GET_MODE_BITSIZE (DECL_MODE (base))
-		== TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp)))))
-	  return expand_expr (build1 (VIEW_CONVERT_EXPR,
-	  TREE_TYPE (exp), base),
+		== TREE_INT_CST_LOW (TYPE_SIZE (type))))
+	  return expand_expr (build1 (VIEW_CONVERT_EXPR, type, base),
   target, tmode, modifier);
-	bit_offset = bitsize_int (offset * BITS_PER_UNIT);
-	bftype = TREE_TYPE (base);
-	if (TYPE_MODE (TREE_TYPE (exp)) != BLKmode)
-	  bftype = TREE_TYPE (exp);
-	else
+	if (TYPE_MODE (type) == BLKmode)
 	  {
 		temp = assign_stack_temp (DECL_MODE (base),
 	  GET_MODE_SIZE (DECL_MODE (base)));
 		store_expr (base, temp, 0, false);
 		temp = adjust_address (temp, BLKmode, offset);
-		set_mem_size (temp, int_size_in_bytes (TREE_TYPE (exp)));
+		set_mem_size (temp, int_size_in_bytes (type));
 		return temp;
 	  }
-	return expand_expr (build3 (BIT_FIELD_REF, bftype,
-	base,
-	TYPE_SIZE (TREE_TYPE (exp)),
-	bit_offset),
-target, tmode, modifier);
+	exp = build3 (BIT_FIELD_REF, type, base, TYPE_SIZE (type),
+			  bitsize_int (offset * BITS_PER_UNIT));
+	return expand_expr (exp, target, tmode, modifier);
 	  }
 	address_mode = targetm.addr_space.address_mode (as);
 	base = TREE_OPERAND (exp, 0);
@@ -9690,7 +9681,7 @@ expand_expr_real_1 (tree exp, rtx target
 		create_output_operand (&ops[0], NULL_RTX, mode);
 		create_fixed_operand (&ops[1], temp);
 		expand_insn (icode, 2, ops);
-		return ops[0].value;
+		temp = ops[0].value;
 	  }
 	else if (SLOW_UNALIGNED_ACCESS (mode, align))
 	  temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode),
@@ -10202,7 +10193,8 @@ expand_expr_real_1 (tree exp, rtx target
 	|| modifier == EXPAND_CONST_ADDRESS
 	|| modifier == EXPAND_INITIALIZER)
 	  return op0;
-	else if (target == 0)
+	
+	if (target == 0)
 	  target = gen_reg_rtx (tmode != VOIDmode ? tmode : mode);
 
 	convert_move (target, op0, unsignedp);
@@ -10249,7 +10241,7 @@ expand_expr_real_1 (tree exp, rtx target
   /* If we are converting to BLKmode, try to avoid an intermediate
 	 temporary by fetching an inner memory reference.  */
   if (mode == BLKmode
-	  && TREE_CODE (TYPE_SIZE (TREE_TYPE (exp))) == INTEGER_CST
+	  && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
 	  && TYPE_MODE (TREE_TYPE (treeop0)) != BLKmode
 	  && handled_component_p (treeop0))
   {
@@ -10268,7 +10260,7 @@ expand_expr_real_1 (tree exp, rtx target
 	if (!offset
 	&& (bitpos % BITS_PER_UNIT) == 0
 	&& bitsize >= 0
-	&& compare_tree_int (TYPE_SIZE (TREE_TYPE (exp)), bitsize) == 0)
+	&& compare_tree_int (TYPE_SIZE (type), bitsize) == 0)
 	  {
 	/* See the normal_inner_ref case for the rationale.  */
 	orig_op0
@@ -10309,8 +10301,7 @@ expand_expr_real_1 (tree exp, rtx target
   }
 
   if (!op0)
-	op0 = expand_expr (treeop0,
-			   NULL_RTX, VOIDmode, modifier);
+	op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
 
   /* If the input and output modes are both the same, we are done.  */
   if (mode == GET_MODE (op0))


Re: Remove self-assignments

2013-06-12 Thread Marc Glisse

On Wed, 12 Jun 2013, Richard Biener wrote:


On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse  wrote:

On Tue, 11 Jun 2013, Jeff Law wrote:


On 06/09/13 10:25, Marc Glisse wrote:


Hello,

this patch removes some self-assignments. I don't know if this is the
best way, but it passes a bootstrap and the testsuite on
x86_64-linux-gnu.

2013-06-10  Marc Glisse  

 PR tree-optimization/57361
gcc/
 * tree-ssa-dse.c (dse_possible_dead_store_p): Handle
self-assignment.

gcc/testsuite/
 * gcc.dg/tree-ssa/pr57361.c: New file.


So dse_optimize_stmt will verify the statement does not have volatile
operands.



operand_equal_p also does it, so we are well covered there ;-)



However, it doesn't verify the statement does not potentially throw (think
about a segfault on the store when async exceptions are enabled).  I think
you need to test for that explicitly.



Hmm, I am not at all familiar with that. Google drowns me in C# and
javascript links, and grepping through the sources only pointed me to the
-fasynchronous-unwind-tables flag.

Richard noticed in the PR that expand_assignment already does:

  /* Optimize away no-op moves without side-effects.  */
  if (operand_equal_p (to, from, 0))
return;

so it looks like the operand_equal_p test should be sufficient (or the
compiler already breaks that code).



Is there some reason this won't work in tree-ssa-dce.c?  That gets run
more often and these stores may be preventing code motion opportunities, so
getting them out of the IL stream as early as possible would be good.



In the first version of the patch:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6
I was doing it in fold_stmt_1, which should be called often enough. Richard
suggested DSE in the PR, but that might be because he had a more subtle test
in mind. I am certainly ok with moving it to DCE or anywhere else...


DSE looks like the right place to me as we are removing a store.  Yes,
DCE removes a limited set of stores as well, but the way we remove this kind
of store makes it much more suited to DSE.

As for possibly trapping/throwing stores, we do not bother to preserve those
(even with -fnon-call-exceptions).

Thus, the patch is ok with me if you agree with that and with the following
adjustment:

+  /* Self-assignments are zombies.  */
+  if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs (stmt))
+  && operand_equal_p (gimple_assign_rhs1 (stmt),
+ gimple_assign_lhs (stmt), 0))

I see no need to compare the codes of the LHS and the RHS, that's redundant
with what operand_equal_p does.


I was trying to make sure first that it was a pure assignment and thus 
that rhs1 was the only thing to look at, not for instance *a=-*a, but I 
guess in gimple that would be 3 statements. I'll retest without it.



Thus, ok with removing that test if it bootstraps / regtests ok and if Jeff
has no further comments.




I'd be curious how often this triggers in GCC itself as well.


Essentially never. I tried with the fold_stmt version of the patch, and 
libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers. 
Note that the case:

b=*a
*a=b
is already handled by FRE which removes *a=b (copyprop later removes the 
dead b=*a and release_ssa removes the unused variable b), it is only *a=*a 
that wasn't handled. Now that I look at it, it is a bit surprising that:


struct A {int i;};
void f(A*a,A*b){ A c=*b; *a=c; }
void g(A*a,A*b){ *a=*b; }

gives 2 different .optimized gimple:

  c$i_5 = MEM[(const struct A &)b_2(D)];
  MEM[(struct A *)a_3(D)] = c$i_5;

and:

  *a_2(D) = MEM[(const struct A &)b_3(D)];

Aren't they equivalent? And if so, which form should be preferred?

--
Marc Glisse


Re: More forwprop for vectors

2013-06-12 Thread Marc Glisse

On Wed, 12 Jun 2013, Richard Biener wrote:


On Tue, Jun 11, 2013 at 9:44 PM, Marc Glisse  wrote:

On Tue, 11 Jun 2013, Jeff Law wrote:


On 06/09/13 13:43, Marc Glisse wrote:


Hello,

just adapting yet another function so it also works with vectors.

It seemed convenient to add a new macro. The name sucks (it doesn't
match the semantics of INTEGRAL_TYPE_P), but I didn't want to name it
INTEGER_SCALAR_OR_VECTOR_CONSTANT_P and didn't have any good idea for a
short name.


I'd just use a long name.  I can easily see someone not being aware
that INTEGRAL_CST_P returns true for vectors and as a result doing
something inappropriate.

INTEGER_CST_OR_VECTOR_INTEGER_TYPE_P?



Having TYPE in there seems confusing, and
INTEGER_SCALAR_OR_VECTOR_CONSTANT_P is at least one character shorter ;-)
Oh, you probably meant INTEGER_CST_OR_VECTOR_INTEGER_CST_P?

Compacting could give INT_OR_VECINT_CST_P (or INTVEC instead of VECINT, I
don't know which order sounds best).

I don't really mind the name, so if you want
INTEGER_CST_OR_VECTOR_INTEGER_CST_P that's ok with me.


How about just adding VECTOR_INTEGER_CST_P and using
TREE_CODE (x) == INTEGER_CST || VECTOR_INTEGER_CST_P (x)
in the code?


That's a bit heavy, but I can live with that.


I suppose it's explicitly not allowing complex integer constants?


Hmm... Thanks, I keep forgetting complex :-(

Testing for CONSTANT_CLASS_P seems sufficient here. Some transformations 
also seem valid for complex, and the others are already restricted by the 
fact that they involve operators like AND or XOR, or because we exit early 
for FLOAT_TYPE_P and FIXED_POINT_TYPE_P. I'll test that (no new macro for 
now then).


--
Marc Glisse


Make symtab ready for duplicated symtab nodes during streaming

2013-06-12 Thread Jan Hubicka
Hi,
this patch is basically preparation for Richard's new tree merging patch.
It allows merging multiple instances of the same variable and function decls
prior to symtab streaming by forcibly creating duplicate symtab entries as
needed.

There are two issues. First, the symtab maintains a hash mapping declarations
to symbols that is not ready for this one-to-many mapping. This is handled by
re-populating it in lto-symtab after merging.  I have a followup patch that
avoids its early construction completely, saving some extra compile time
during LTO.

Second is the way we handle resolutions, which are currently recorded in
resolution_map, which maps a decl to a particular resolution. It needs to be
made input-file sensitive, which is done by creating a separate resolution map
for every file. It should also make things slightly faster since all handling
is per-file and thus we should get better locality.

Bootstrapped/regtested x86_64-linux and tested on a Firefox build with
Richard's patch. Committed.

Honza

* lto-symtab.c (lto_symtab_merge_symbols): Populate symtab hashtable.
* cgraph.h (varpool_create_empty_node): Declare.
* lto-cgraph.c (input_node, input_varpool_node): Forcingly create
duplicated nodes.
* symtab.c (symtab_unregister_node): Be lax about missing entries
in node hash.
(symtab_get_node): Update comment.
* varpool.c (varpool_create_empty_node): Break out from ...
(varpool_node_for_decl): ... here.
* lto-streamer.h (lto_file_decl_data): Add RESOLUTION_MAP.

* lto.c (register_resolution): Take lto_file_data argument.
(lto_register_var_decl_in_symtab,
lto_register_function_decl_in_symtab): Update.
(read_cgraph_and_symbols): Update resolution_map handling.

Index: lto-symtab.c
===
--- lto-symtab.c(revision 199988)
+++ lto-symtab.c(working copy)
@@ -573,16 +573,21 @@ lto_symtab_merge_symbols (void)
 {
   symtab_initialize_asm_name_hash ();
 
-  /* Do the actual merging.  */
+  /* Do the actual merging.  
+ At this point we invalidate hash translating decls into symtab nodes
+because after removing one of duplicate decls the hash is not correcly
+updated to the ohter dupliate.  */
   FOR_EACH_SYMBOL (node)
if (lto_symtab_symbol_p (node)
&& node->symbol.next_sharing_asm_name
&& !node->symbol.previous_sharing_asm_name)
  lto_symtab_merge_symbols_1 (node);
 
-  /* Resolve weakref aliases whose target are now in the compilation unit. 
 */
+  /* Resolve weakref aliases whose target are now in the compilation unit. 
 
+also re-populate the hash translating decls into symtab nodes*/
   FOR_EACH_SYMBOL (node)
{
+ cgraph_node *cnode;
  if (!node->symbol.analyzed && node->symbol.alias_target)
{
  symtab_node tgt = symtab_node_for_asm (node->symbol.alias_target);
@@ -591,6 +596,10 @@ lto_symtab_merge_symbols (void)
symtab_resolve_alias (node, tgt);
}
  node->symbol.aux = NULL;
+ if (!(cnode = dyn_cast  (node))
+ || !cnode->clone_of
+ || cnode->clone_of->symbol.decl != cnode->symbol.decl)
+   symtab_insert_node_to_hashtable ((symtab_node)node);
}
 }
 }
Index: cgraph.h
===
--- cgraph.h(revision 199988)
+++ cgraph.h(working copy)
@@ -773,6 +773,7 @@ bool cgraph_maybe_hot_edge_p (struct cgr
 bool cgraph_optimize_for_size_p (struct cgraph_node *);
 
 /* In varpool.c  */
+struct varpool_node *varpool_create_empty_node (void);
 struct varpool_node *varpool_node_for_decl (tree);
 struct varpool_node *varpool_node_for_asm (tree asmname);
 void varpool_mark_needed_node (struct varpool_node *);
Index: lto-cgraph.c
===
--- lto-cgraph.c(revision 199988)
+++ lto-cgraph.c(working copy)
@@ -959,7 +959,14 @@ input_node (struct lto_file_decl_data *f
vNULL, false);
 }
   else
-node = cgraph_get_create_node (fn_decl);
+{
+  /* Declaration of functions can be already merged with a declaration
+from other input file.  We keep cgraph unmerged until after streaming
+of ipa passes is done.  Alays forcingly create a fresh node.  */
+  node = cgraph_create_empty_node ();
+  node->symbol.decl = fn_decl;
+  symtab_register_node ((symtab_node)node);
+}
 
   node->symbol.order = order;
   if (order >= symtab_order)
@@ -1035,7 +1042,14 @@ input_varpool_node (struct lto_file_decl
   order = streamer_read_hwi (ib) + order_base;
   decl_index = streamer_read_uhwi (ib);
   var_decl = lto_file_decl_data_get_var_decl (file_data, decl_index);
-  node = varpool_node_for_decl (var_decl);
+
+  /* Declaration of functions

[PATCH] Fix vect_recog_widen_mult_pattern (PR tree-optimization/57537)

2013-06-12 Thread Jakub Jelinek
Hi!

gcc.dg/vect/slp-widen-mult-half.c fails on ppc64, because
vect_recog_widen_mult_pattern creates a WIDEN_MULT_EXPR
with invalid argument types - it is a HIxHI->SI multiplication,
and rhs1 is properly HImode, but rhs2 is INTEGER_CST with SImode
(which vect_handle_widen_op_by_const verified it fits into HImode).
As the type is incompatible with what it should have been, when vectorizing
it we actually create a VIEW_CONVERT_EXPR of the SImode INTEGER_CST
to HImode, on little endian that magically works correctly, on big endian
we end up with a vector of zeros rather than vector of the desired
multipliers.  Fixed thusly, so far bootstrapped on i686-linux,
regtest there plus bootstraps/regtests on x86_64-linux, powerpc{,64}-linux
and s390{,x}-linux still pending, ok for trunk/4.8 if it succeeds?
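
The endianness dependence can be illustrated with a scalar analogue. This is a sketch under simplifying assumptions: the real failure involves a vector of constants inside the vectorizer, and the helper names below are hypothetical. Byte order is simulated explicitly, so the result does not depend on the machine running the code.

```c
#include <assert.h>
#include <stdint.h>

/* Store VALUE into BYTES in the requested byte order.  */
static void
store32 (uint8_t bytes[4], uint32_t value, int big_endian)
{
  assert (big_endian == 0 || big_endian == 1);
  for (int i = 0; i < 4; i++)
    {
      int shift = big_endian ? 8 * (3 - i) : 8 * i;
      bytes[i] = (uint8_t) (value >> shift);
    }
}

/* Read the two lowest-addressed bytes as a 16-bit value in the same
   byte order, which is what a view-convert of the storage amounts to.  */
static uint16_t
view_first_half (const uint8_t bytes[4], int big_endian)
{
  return big_endian
	 ? (uint16_t) ((bytes[0] << 8) | bytes[1])
	 : (uint16_t) (bytes[0] | (bytes[1] << 8));
}

/* Reinterpret a 32-bit constant as a 16-bit one without converting it.  */
static uint16_t
reinterpret_low_address_half (uint32_t value, int big_endian)
{
  uint8_t bytes[4];
  store32 (bytes, value, big_endian);
  return view_first_half (bytes, big_endian);
}
```

For a small multiplier such as 5, the little-endian view yields 5 while the big-endian view yields 0, matching the vector of zeros described above. Converting the value to the narrower type instead of reinterpreting its storage avoids the dependence.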

2013-06-12  Jakub Jelinek  

PR tree-optimization/57537
* tree-vect-patterns.c (vect_recog_widen_mult_pattern): If
vect_handle_widen_op_by_const, convert oprnd1 to half_type1.

--- gcc/tree-vect-patterns.c.jj 2013-05-17 10:53:10.0 +0200
+++ gcc/tree-vect-patterns.c    2013-06-12 09:49:30.151854270 +0200
@@ -640,7 +640,10 @@ vect_recog_widen_mult_pattern (vec

Re: [PATCH] Fix vect_recog_widen_mult_pattern (PR tree-optimization/57537)

2013-06-12 Thread Richard Biener
On Wed, 12 Jun 2013, Jakub Jelinek wrote:

> Hi!
> 
> gcc.dg/vect/slp-widen-mult-half.c fails on ppc64, because
> vect_recog_widen_mult_pattern creates a WIDEN_MULT_EXPR
> with invalid argument types - it is a HIxHI->SI multiplication,
> and rhs1 is properly HImode, but rhs2 is INTEGER_CST with SImode
> (which vect_handle_widen_op_by_const verified it fits into HImode).
> As the type is incompatible with what it should have been, when vectorizing
> it we actually create a VIEW_CONVERT_EXPR of the SImode INTEGER_CST
> to HImode, on little endian that magically works correctly, on big endian
> we end up with a vector of zeros rather than vector of the desired
> multipliers.  Fixed thusly, so far bootstrapped on i686-linux,
> regtest there plus bootstraps/regtests on x86_64-linux, powerpc{,64}-linux
> and s390{,x}-linux still pending, ok for trunk/4.8 if it succeeds?

Ok.

Thanks,
Richard.

> 2013-06-12  Jakub Jelinek  
> 
>   PR tree-optimization/57537
>   * tree-vect-patterns.c (vect_recog_widen_mult_pattern): If
>   vect_handle_widen_op_by_const, convert oprnd1 to half_type1.
> 
> --- gcc/tree-vect-patterns.c.jj   2013-05-17 10:53:10.0 +0200
> +++ gcc/tree-vect-patterns.c  2013-06-12 09:49:30.151854270 +0200
> @@ -640,7 +640,10 @@ vect_recog_widen_mult_pattern (vec
>        && vect_handle_widen_op_by_const (last_stmt, MULT_EXPR, oprnd1,
>   &oprnd0, stmts, type,
>   &half_type0, def_stmt0))
> -half_type1 = half_type0;
> + {
> +   half_type1 = half_type0;
> +   oprnd1 = fold_convert (half_type1, oprnd1);
> + }
>else
>  return NULL;
>  }
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend


Re: Remove self-assignments

2013-06-12 Thread Richard Biener
On Wed, Jun 12, 2013 at 10:47 AM, Marc Glisse  wrote:
> On Wed, 12 Jun 2013, Richard Biener wrote:
>
>> On Tue, Jun 11, 2013 at 9:30 PM, Marc Glisse  wrote:
>>>
>>> On Tue, 11 Jun 2013, Jeff Law wrote:
>>>
>>>> On 06/09/13 10:25, Marc Glisse wrote:
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> this patch removes some self-assignments. I don't know if this is the
>>>>> best way, but it passes a bootstrap and the testsuite on
>>>>> x86_64-linux-gnu.
>>>>>
>>>>> 2013-06-10  Marc Glisse  
>>>>>
>>>>>  PR tree-optimization/57361
>>>>> gcc/
>>>>>  * tree-ssa-dse.c (dse_possible_dead_store_p): Handle
>>>>> self-assignment.
>>>>>
>>>>> gcc/testsuite/
>>>>>  * gcc.dg/tree-ssa/pr57361.c: New file.
>>>>
>>>>
>>>> So dse_optimize_stmt will verify the statement does not have volatile
>>>> operands.
>>>
>>>
>>>
>>> operand_equal_p also does it, so we are well covered there ;-)
>>>
>>>
>>>> However, it doesn't verify the statement does not potentially throw
>>>> (think about a segfault on the store when async exceptions are
>>>> enabled).  I think you need to test for that explicitly.
>>>
>>>
>>>
>>> Hmm, I am not at all familiar with that. Google drowns me in C# and
>>> javascript links, and grepping through the sources only pointed me to the
>>> -fasynchronous-unwind-tables flag.
>>>
>>> Richard noticed in the PR that expand_assignment already does:
>>>
>>>   /* Optimize away no-op moves without side-effects.  */
>>>   if (operand_equal_p (to, from, 0))
>>> return;
>>>
>>> so it looks like the operand_equal_p test should be sufficient (or the
>>> compiler already breaks that code).
>>>
>>>
>>>> Is there some reason this won't work in tree-ssa-dce.c?  That gets run
>>>> more often and these stores may be preventing code motion
>>>> opportunities, so getting them out of the IL stream as early as
>>>> possible would be good.
>>>
>>>
>>>
>>> In the first version of the patch:
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57303#c6
>>> I was doing it in fold_stmt_1, which should be called often enough.
>>> Richard
>>> suggested DSE in the PR, but that might be because he had a more subtle
>>> test
>>> in mind. I am certainly ok with moving it to DCE or anywhere else...
>>
>>
>> DSE looks like the right place to me as we are removing a store.  Yes,
>> DCE removes a limited set of stores as well, but the way we remove this
>> kind
>> of store makes it much more suited to DSE.
>>
>> As for possibly trapping/throwing stores, we do not bother to preserve
>> those
>> (even with -fnon-call-exceptions).
>>
>> Thus, the patch is ok with me if you agree with that and with the
>> following
>> adjustment:
>>
>> +  /* Self-assignments are zombies.  */
>> +  if (gimple_assign_rhs_code (stmt) == TREE_CODE (gimple_assign_lhs
>> (stmt))
>> +  && operand_equal_p (gimple_assign_rhs1 (stmt),
>> + gimple_assign_lhs (stmt), 0))
>>
>> I see no need to compare the codes of the LHS and the RHS, that's
>> redundant
>> with what operand_equal_p does.
>
>
> I was trying to make sure first that it was a pure assignment and thus that
> rhs1 was the only thing to look at, not for instance *a=-*a, but I guess in
> gimple that would be 3 statements. I'll retest without it.
>
>
>> Thus, ok with removing that test if it bootstraps / regtests ok and if
>> Jeff
>> has no further comments.
>
>
>
>>>> I'd be curious how often this triggers in GCC itself as well.
>
>
> Essentially never. I tried with the fold_stmt version of the patch, and
> libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers.
> Note that the case:
> b=*a
> *a=b
> is already handled by FRE which removes *a=b (copyprop later removes the
> dead b=*a and release_ssa removes the unused variable b), it is only *a=*a
> that wasn't handled. Now that I look at it, it is a bit surprising that:
>
> struct A {int i;};
> void f(A*a,A*b){ A c=*b; *a=c; }
> void g(A*a,A*b){ *a=*b; }
>
> gives 2 different .optimized gimple:
>
>   c$i_5 = MEM[(const struct A &)b_2(D)];
>   MEM[(struct A *)a_3(D)] = c$i_5;
>
> and:
>
>   *a_2(D) = MEM[(const struct A &)b_3(D)];
>
> Aren't they equivalent? And if so, which form should be preferred?

Well, the first is optimized by SRA to copy element-wise and thus the
loads/stores have is_gimple_reg_type () which require separate loads/stores.
The second is an aggregate copy where we cannot generate SSA temporaries
for the result of the load (!is_gimple_reg_type ()) and thus we are required
to have a single statement.

One of my pending GIMPLE re-org tasks is to always separate loads and
stores and allow SSA names of aggregate type, thus we'd have

 tem_1 = MEM[(const struct A &)b_3(D)];
 *a_2(D) = tem_1;

even for the 2nd case.  That solves the fact that we are missing an
aggregate copy propagation pass quite nicely.

Yes, you have to watch for not creating (too many) overlapping life-ranges
as out-of-SSA won't be able to assign the temporary aggregate SSA names
to registers but possibly has to allocate costly sta

Re: More forwprop for vectors

2013-06-12 Thread Marc Glisse

On Wed, 12 Jun 2013, Marc Glisse wrote:


On Wed, 12 Jun 2013, Richard Biener wrote:


I suppose it's explicitly not allowing complex integer constants?


Hmm... Thanks, I keep forgetting complex :-(


Do we want A+~A -> -1-i for integer complex types? Is using BIT_NOT_EXPR 
on them even legal? Currently we restrict the transform to INTEGRAL_TYPE_P 
(TREE_TYPE (rhs1)), but it looks like any type on which BIT_NOT_EXPR is 
legal should be ok (with an all_ones constant, not the current minus_one 
constant).
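
For scalar integers the identity behind the transform is easy to check. This is a sketch only: the function names are hypothetical, two's complement is assumed for the signed case, and it says nothing about whether BIT_NOT_EXPR is valid on complex types, which remains the open question.

```c
#include <assert.h>
#include <limits.h>

/* For unsigned operands, A + ~A sets every bit: the all_ones constant.  */
static unsigned int
add_not_unsigned (unsigned int a)
{
  return a + ~a;
}

/* For signed operands on two's complement, ~A is -A - 1, so the sum
   is -1 and cannot overflow.  */
static int
add_not_signed (int a)
{
  int sum = a + ~a;
  assert (sum == -1);
  return sum;
}
```

In the scalar case the all_ones and minus_one constants coincide bit for bit; the distinction starts to matter for types where "minus one" is not the all-ones bit pattern.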


--
Marc Glisse


Re: [AArch64] Fix move_lo_quad_ move

2013-06-12 Thread Marcus Shawcroft
OK
/Marcus

On 11 June 2013 14:43, Sofiane Naci  wrote:
> Hi,
>
> This patch fixes a bug in the move_lo_quad_ pattern.
>
> The pattern, shown below, issues a scalar MOV instruction for vector modes:
>
> (define_insn "move_lo_quad_"
>   [(set (match_operand:VQ 0 "register_operand" "=w,w,w")
> (vec_concat:VQ
>   (match_operand: 1 "register_operand" "w,r,r")
>   (vec_duplicate: (const_int 0]
>   "TARGET_SIMD"
>   "@
>mov\\t%d0, %d1
>fmov\t%d0, %1
>...
>
> This is fixed by using DUP for the first alternative instead.
>
> This passes the full regression test suite in aarch64-elf.
>
> OK for trunk?
>
> -
> Thanks
> Sofiane


Re: SPE detection broken on Linux (bits/predefs.h: No such file or directory)

2013-06-12 Thread Olivier Hainque

On Jun 11, 2013, at 16:50 , David Edelsohn  wrote:
>> I solved this in gcc/config/rs6000/t-linux by replacing the line
>> 
>> MULTIARCH_DIRNAME = powerpc-linux-gnuspe$(if $(findstring
>> rs6000/e500-double.h, $(tm_file_list)),,v1)
>> 
>> with
>> 
>> MULTIARCH_DIRNAME = powerpc-linux-gnuspe$(if $(findstring
>> 8548,$(with_cpu)),,v1)
> 
> Olivier was the person who removed e500-double.h and added 8548
> support.  I would like to hear his and Eric's comment since they seem
> to be doing the most work on e500 at the moment.

 The suggested update is in line with this part of the
 change we did at the time:

Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 187145)
+++ gcc/config.gcc  (working copy)
@@ -2828,6 +2828,13 @@
 mips*-*-vxworks)
   with_arch=mips2
   ;;
+powerpc*-*-*spe*)
+  if test x$enable_e500_double = xyes; then
+ with_cpu=8548
+  else
+ with_cpu=8540
+  fi   
+  ;;
 sparc-leon*-*)
   with_cpu=v8;
   ;;
@@ -3509,11 +3516,6 @@
c_target_objs="${c_target_objs} rs6000-c.o"
cxx_target_objs="${cxx_target_objs} rs6000-c.o"
tmake_file="rs6000/t-rs6000 ${tmake_file}"
-
-if test x$enable_e500_double = xyes
-then
-tm_file="$tm_file rs6000/e500-double.h"
-fi
;;
 
sh[123456ble]*-*-* | sh-*-*)

 so looks correct to me. Thanks!

 For the record, a summary of the history that led to the change,
 plus a first version of the patches is available at

   http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00464.html

 An adjusted version of the whole set of patches, eventually
 checked-in was provided at

   http://gcc.gnu.org/ml/gcc-patches/2012-05/msg01021.html

 With Kind Regards,

 Olivier



[PATCH] Optimize LTO streaming core routine

2013-06-12 Thread Richard Biener

This optimizes the very core streaming routines, 
streamer_write_char_stream and streamer_write_uhwi_stream and
streamer_write_hwi_stream.

In streamer_write_char_stream you
can notice that writing the char possibly clobbers the pointer
(and everything else) because it uses alias set zero and may
point to an arbitrary location.  Thus we have to CSE
current_pointer manually here.

This also leads to very inefficient loops in
streamer_write_uhwi_stream and streamer_write_hwi_stream.
To optimize them we have to manually inline streamer_write_char_stream
and apply loop invariant motion and unswitching.
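
To illustrate the idea, here is a simplified sketch of an unsigned LEB128-style writer that keeps the write pointer in a local variable. It is not the code from the patch: struct out_stream and write_uleb128 are hypothetical stand-ins, and the sketch assumes the caller has already made sure the current block has room for the whole value, so the block refill that the real routine has to handle is left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the output stream, not GCC's
   struct lto_output_stream.  */
struct out_stream
{
  unsigned char *current_pointer;
  unsigned int left_in_block;
  unsigned int total_size;
};

/* Write WORK in 7-bit groups, low group first, setting bit 7 on every
   byte except the last.  Returns the number of bytes written.
   Assumption: at least 10 bytes are left in the block, which is enough
   for any 64-bit value.  */
unsigned int write_uleb128 (struct out_stream *obs, unsigned long long work)
{
  /* Local copy of the pointer: the byte stores below go through P, so
     the compiler does not have to reload obs->current_pointer after
     every store.  */
  unsigned char *p = obs->current_pointer;
  unsigned int size = 0;

  assert (obs->left_in_block >= 10);
  do
    {
      unsigned int byte = (unsigned int) (work & 0x7f);
      work >>= 7;
      if (work != 0)
        /* More bytes to follow.  */
        byte |= 0x80;
      *p++ = (unsigned char) byte;
      size++;
    }
  while (work != 0);

  /* Write the bookkeeping back once, after the loop.  */
  obs->current_pointer = p;
  obs->left_in_block -= size;
  obs->total_size += size;
  return size;
}
```

The point of the sketch is only the shape of the loop: all stream fields are read before it and written back after it, so the loop body touches nothing but locals and the output byte.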

In streamer_write_hwi_stream we do the same but also note that
the

!   more = !((work == 0 && (byte & 0x40) == 0)
!  || (work == -1 && (byte & 0x40) != 0));

test is very inefficient.  We can optimize that if we split
the shifting of work into two pieces like

!   /* If the lower 7-bits are sign-extended 0 or -1 we are finished.  
*/
!   work >>= 6;
!   more = !(work == 0 || work == -1);
if (more)
!   {
! /* More bits to follow.  */
! work >>= 1;
! byte |= 0x80;

which results in a very nice core loop

.L21:
movq%rbp, %rsi
movl%ebp, %ecx
sarq$6, %rsi
andl$127, %ecx
addq$1, %rsi
cmpq$1, %rsi
jbe .L26
orb $-128, %cl
sarq$7, %rbp
addl$1, %r12d
movb%cl, (%rax)
addq$1, %rax
subl$1, %edx
jne .L21

and threaded tail for the !more case:

.L26:
movb%cl, (%rax)
subl$1, %edx
addq$1, %rax
addl$1, %r12d
movq%rax, 16(%rbx)
addl%r12d, 32(%rbx)
movl%edx, 24(%rbx)
...
ret

(above produced by g++ 4.6 with -O2).  Compared with what is
there before the patch:

...
movl24(%rdi), %eax
jmp .L48
.p2align 4,,10
.p2align 3
.L52:
xorl%r13d, %r13d
testb   $64, %r12b
je  .L46
.L45:
orb $-128, %r12b
movl$1, %r13d
.L46:
testl   %eax, %eax
jne .L47
movq%rbx, %rdi
call_Z16lto_append_blockP17lto_output_stream
.L47:
movq16(%rbx), %rax
movb%r12b, (%rax)
movl24(%rbx), %eax
addq$1, 16(%rbx)
addl$1, 32(%rbx)
subl$1, %eax
testl   %r13d, %r13d
movl%eax, 24(%rbx)
je  .L51
.L48:
movl%ebp, %r12d
sarq$7, %rbp
andl$127, %r12d
testq   %rbp, %rbp
je  .L52
cmpq$-1, %rbp
jne .L45
xorl%r13d, %r13d
testb   $64, %r12b
jne .L46
jmp .L45
.p2align 4,,10
.p2align 3
.L51:
addq$8, %rsp
...
ret

that's a _lot_ better.
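
The two termination tests for the signed case can be compared outside of GCC. The sketch below is an illustration only: the function names are invented, long long stands in for HOST_WIDE_INT, and it assumes that >> on a negative value is an arithmetic shift (true for GCC, implementation-defined in ISO C). After shifting by 6, the remaining value is 0 or -1 exactly when the rest of the number is pure sign extension of bit 6 of the current byte, which is what the combined test checks after shifting by 7.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode WORK with the original combined test.  Returns byte count.  */
size_t sleb128_combined (long long work, unsigned char *out)
{
  size_t n = 0;
  int more;
  do
    {
      unsigned int byte = (unsigned int) (work & 0x7f);
      work >>= 7;
      more = !((work == 0 && (byte & 0x40) == 0)
               || (work == -1 && (byte & 0x40) != 0));
      if (more)
        byte |= 0x80;
      out[n++] = (unsigned char) byte;
    }
  while (more);
  return n;
}

/* Encode WORK with the shift split into 6 + 1 bits.  */
size_t sleb128_split (long long work, unsigned char *out)
{
  size_t n = 0;
  int more;
  do
    {
      unsigned int byte = (unsigned int) (work & 0x7f);
      /* If the remaining bits are sign-extended 0 or -1 we are done.  */
      work >>= 6;
      more = !(work == 0 || work == -1);
      if (more)
        {
          /* More bits to follow.  */
          work >>= 1;
          byte |= 0x80;
        }
      out[n++] = (unsigned char) byte;
    }
  while (more);
  return n;
}

/* Returns 1 if both variants emit the same bytes for VALUE.  */
int encodings_agree (long long value)
{
  unsigned char a[10], b[10];   /* 10 bytes cover any 64-bit value.  */
  size_t na = sleb128_combined (value, a);
  size_t nb = sleb128_split (value, b);
  assert (na >= 1 && na <= sizeof a);
  return na == nb && memcmp (a, b, na) == 0;
}
```

Values around the 7-bit boundary (63, 64, -64, -65) are the interesting inputs, since that is where the encoding switches from one byte to two.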

And hopefully gets streaming down in the profile somewhat.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied
to trunk.

Richard.

2013-06-12  Richard Biener  

* data-streamer.h (streamer_write_char_stream): CSE
obs->current_pointer.
* data-streamer-out.c (streamer_write_uhwi_stream): Inline
streamer_write_char_stream manually and optimize the resulting loop.
(streamer_write_hwi_stream): Likewise.

Index: gcc/data-streamer.h
===
*** gcc/data-streamer.h (revision 199935)
--- gcc/data-streamer.h (working copy)
*** streamer_write_char_stream (struct lto_o
*** 183,190 
  lto_append_block (obs);
  
/* Write the actual character.  */
!   *obs->current_pointer = c;
!   obs->current_pointer++;
obs->total_size++;
obs->left_in_block--;
  }
--- 183,191 
  lto_append_block (obs);
  
/* Write the actual character.  */
!   char *current_pointer = obs->current_pointer;
!   *(current_pointer++) = c;
!   obs->current_pointer = current_pointer;
obs->total_size++;
obs->left_in_block--;
  }
Index: gcc/data-streamer-out.c
===
*** gcc/data-streamer-out.c (revision 199935)
--- gcc/data-streamer-out.c (working copy)
*** void
*** 187,192 
--- 187,197 
  streamer_write_uhwi_stream (struct lto_output_stream *obs,
  unsigned HOST_WIDE_INT work)
  {
+   if (obs->left_in_block == 0)
+ lto_append_block (obs);
+   char *current_pointer = obs->current_pointer;
+   unsigned int left_in_block = obs->left_in_block;
+   unsigned int size = 0;
do
  {
unsigned int byte = (work & 0x7f);
*** streamer_write_uhwi_stream (struct lto_o
*** 195,203 
/* More bytes to follow.  */
byte |= 0x80;
  
!   streamer_write_char_stream (obs, byte);
  }
!   while (work != 0);
  }
  
  
--- 200,233 
/* More bytes to follow.  */
byte 

[PATCH, PR 57539] Fix refdesc remapping during inlining

2013-06-12 Thread Martin Jambor
Hi,

PR 57539 revealed two problems with remapping reference descriptors
during cloning of trees of inlined call graph nodes.  First, when
indirect inlining is involved, we happily remove the reference
descriptor itself by calling ipa_free_edge_args_substructures in
ipa_propagate_indirect_call_infos.  Second, the current remapping code
does not work because the global.inlined_to field of the destination
caller is not yet set.

The patch below fixes the first problem by not calling the freeing
function and the second one by making cgraph_clone_node set the
required field prior to calling any duplication hooks (which is the
only place where we have the pair of corresponding source and
destination edge at our disposal so the duplication/remapping has to
happen there).  I have also shortened the lists of corresponding
references and cleared up the search loop a little.

I'll post a testcase in a separate patch.  This one bootstrapped and
passed testsuite on x86_64-linux without any issues.  OK for trunk?

Thanks,

Martin


2013-06-10  Martin Jambor  

PR tree-optimization/57539
* cgraphclones.c (cgraph_clone_node): Add parameter new_inlined_to, set
global.inlined_to of the new node to it.  All callers changed.
* ipa-inline-transform.c (clone_inlined_nodes): New variable
inlining_into, pass it to cgraph_clone_node.
* ipa-prop.c (ipa_propagate_indirect_call_infos): Do not call
ipa_free_edge_args_substructures.
(ipa_edge_duplication_hook): Only add edges from inlined nodes to
rdesc linked list.  Do not assert rdesc edges have inlined caller.
Assert we have found an rdesc in the rdesc list.

Index: src/gcc/cgraph.h
===
--- src.orig/gcc/cgraph.h
+++ src/gcc/cgraph.h
@@ -707,7 +707,7 @@ struct cgraph_edge * cgraph_clone_edge (
unsigned, gcov_type, int, bool);
 struct cgraph_node * cgraph_clone_node (struct cgraph_node *, tree, gcov_type,
int, bool, vec,
-   bool);
+   bool, struct cgraph_node *);
 tree clone_function_name (tree decl, const char *);
 struct cgraph_node * cgraph_create_virtual_clone (struct cgraph_node *old_node,
  vec,
Index: src/gcc/cgraphclones.c
===
--- src.orig/gcc/cgraphclones.c
+++ src/gcc/cgraphclones.c
@@ -167,13 +167,19 @@ cgraph_clone_edge (struct cgraph_edge *e
function's profile to reflect the fact that part of execution is handled
by node.  
When CALL_DUPLICATOIN_HOOK is true, the ipa passes are acknowledged about
-   the new clone. Otherwise the caller is responsible for doing so later.  */
+   the new clone. Otherwise the caller is responsible for doing so later.
+
+   If the new node is being inlined into another one, NEW_INLINED_TO should be
+   the outline function the new one is (even indirectly) inlined to.  All hooks
+   will see this in node's global.inlined_to, when invoked.  Can be NULL if the
+   node is not inlined.  */
 
 struct cgraph_node *
 cgraph_clone_node (struct cgraph_node *n, tree decl, gcov_type count, int freq,
   bool update_original,
   vec redirect_callers,
-  bool call_duplication_hook)
+  bool call_duplication_hook,
+  struct cgraph_node *new_inlined_to)
 {
   struct cgraph_node *new_node = cgraph_create_empty_node ();
   struct cgraph_edge *e;
@@ -195,6 +201,7 @@ cgraph_clone_node (struct cgraph_node *n
   new_node->symbol.externally_visible = false;
   new_node->local.local = true;
   new_node->global = n->global;
+  new_node->global.inlined_to = new_inlined_to;
   new_node->rtl = n->rtl;
   new_node->count = count;
   new_node->frequency = n->frequency;
@@ -307,7 +314,7 @@ cgraph_create_virtual_clone (struct cgra
 
   new_node = cgraph_clone_node (old_node, new_decl, old_node->count,
CGRAPH_FREQ_BASE, false,
-   redirect_callers, false);
+   redirect_callers, false, NULL);
   /* Update the properties.
  Make clone visible only within this translation unit.  Make sure
  that is not weak also.
Index: src/gcc/ipa-inline-transform.c
===
--- src.orig/gcc/ipa-inline-transform.c
+++ src/gcc/ipa-inline-transform.c
@@ -132,6 +132,13 @@ void
 clone_inlined_nodes (struct cgraph_edge *e, bool duplicate,
 bool update_original, int *overall_size)
 {
+  struct cgraph_node *inlining_into;
+
+  if (e->caller->global.inlined_to)
+inlining_into = e->caller->global.inlined_to;
+  else
+inlining_into = e->caller;
+
   if (duplicate)
 {
   /* We may eliminate the need for out-of-line copy to be

Re: [PATCH, PR 57539] Testcase produced by multidelta and indent

2013-06-12 Thread Martin Jambor
Hi,

On Wed, Jun 12, 2013 at 01:59:29PM +0200, Martin Jambor wrote:
> 2013-06-10  Martin Jambor  
> 
>   PR tree-optimization/57539
>   * cgraphclones.c (cgraph_clone_node): Add parameter new_inlined_to, set
>   global.inlined_to of the new node to it.  All callers changed.
>   * ipa-inline-transform.c (clone_inlined_nodes): New variable
>   inlining_into, pass it to cgraph_clone_node.
>   * ipa-prop.c (ipa_propagate_indirect_call_infos): Do not call
>   ipa_free_edge_args_substructures.
>   (ipa_edge_duplication_hook): Only add edges from inlined nodes to
>   rdesc linked list.  Do not assert rdesc edges have inlined caller.
>   Assert we have found an rdesc in the rdesc list.

Creating a testcase for this bug from scratch is not particularly easy
or reliable because it requires inliner to build up a particular data
structure (and then inline it again), which depends on inlining order
which can easily change.  So I did not want to spend too much time on
it.

However, by running multidelta (several times), indent, multidelta and
indent again on the testcase supplied by the reporter I quickly came
up with the following beast.  The downside is that I have very little
understanding of what the testcase does, with all the potential problems
that may bring in the future.  Thus I'm not sure if we want such a
testcase in the testsuite at all.  Nevertheless, it currently tests for the bug, does
not warn (without any -W options) and I'll be fine with reverting it
should any problems arise.

So, is it also OK for the trunk?

Thanks,

Martin


2013-06-11  Martin Jambor  

testsuite/
PR tree-optimization/57539
* gcc.dg/ipa/pr57539.c: New test.

Index: src/gcc/testsuite/gcc.dg/ipa/pr57539.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/ipa/pr57539.c
@@ -0,0 +1,218 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+typedef long unsigned int size_t;
+typedef struct
+{
+}
+box;
+typedef struct
+{
+}
+textpara_t;
+typedef struct _dtlink_s Dtlink_t;
+typedef struct _dtdisc_s Dtdisc_t;
+typedef struct _dtmethod_s Dtmethod_t;
+typedef struct _dt_s Dt_t;
+typedef void *(*Dtmemory_f) (Dt_t *, void *, size_t, Dtdisc_t *);
+typedef void *(*Dtsearch_f) (Dt_t *, void *, int);
+typedef void *(*Dtmake_f) (Dt_t *, void *, Dtdisc_t *);
+typedef void (*Dtfree_f) (Dt_t *, void *, Dtdisc_t *);
+typedef int (*Dtcompar_f) (Dt_t *, void *, void *, Dtdisc_t *);
+typedef unsigned int (*Dthash_f) (Dt_t *, void *, Dtdisc_t *);
+typedef int (*Dtevent_f) (Dt_t *, int, void *, Dtdisc_t *);
+struct _dtlink_s
+{
+  Dtlink_t *right;
+};
+struct _dtdisc_s
+{
+  int key;
+  int size;
+  int link;
+  Dtmake_f makef;
+  Dtfree_f freef;
+  Dtcompar_f comparf;
+  Dthash_f hashf;
+  Dtmemory_f memoryf;
+  Dtevent_f eventf;
+};
+struct _dt_s
+{
+  Dtsearch_f searchf;
+};
+extern Dtmethod_t *Dtobag;
+extern Dt_t *dtopen (Dtdisc_t *, Dtmethod_t *);
+extern Dtlink_t *dtflatten (Dt_t *);
+typedef struct Agobj_s Agobj_t;
+typedef struct Agraph_s Agraph_t;
+typedef struct Agnode_s Agnode_t;
+typedef struct Agedge_s Agedge_t;
+typedef struct Agdesc_s Agdesc_t;
+typedef struct Agdisc_s Agdisc_t;
+typedef struct Agrec_s Agrec_t;
+struct Agobj_s
+{
+  Agrec_t *data;
+};
+struct Agdesc_s
+{
+};
+extern Agraph_t *agopen (char *name, Agdesc_t desc, Agdisc_t * disc);
+extern Agnode_t *agfstnode (Agraph_t * g);
+extern Agnode_t *agnxtnode (Agraph_t * g, Agnode_t * n);
+extern Agedge_t *agedge (Agraph_t * g, Agnode_t * t, Agnode_t * h, char *name,
+int createflag);
+extern Agedge_t *agfstout (Agraph_t * g, Agnode_t * n);
+extern Agedge_t *agnxtout (Agraph_t * g, Agedge_t * e);
+extern Agdesc_t Agdirected, Agstrictdirected, Agundirected,
+  Agstrictundirected;
+typedef struct Agraph_s graph_t;
+typedef struct Agnode_s node_t;
+typedef struct Agedge_s edge_t;
+typedef union inside_t
+{
+  unsigned short minlen;
+}
+Agedgeinfo_t;
+extern void *gmalloc (size_t);
+typedef enum
+{ AM_NONE, AM_VOR, AM_SCALE, AM_NSCALE, AM_SCALEXY, AM_PUSH, AM_PUSHPULL,
+AM_ORTHO, AM_ORTHO_YX, AM_ORTHOXY, AM_ORTHOYX, AM_PORTHO, AM_PORTHO_YX,
+AM_PORTHOXY, AM_PORTHOYX, AM_COMPRESS, AM_VPSC, AM_IPSEP, AM_PRISM }
+adjust_mode;
+typedef struct nitem
+{
+  Dtlink_t link;
+  int val;
+  node_t *cnode;
+  box bb;
+}
+nitem;
+typedef int (*distfn) (box *, box *);
+typedef int (*intersectfn) (nitem *, nitem *);
+static int
+cmpitem (Dt_t * d, int *p1, int *p2, Dtdisc_t * disc)
+{
+}
+static Dtdisc_t constr =
+  { __builtin_offsetof (nitem, val), sizeof (int), __builtin_offsetof (nitem,
+  link),
+((Dtmake_f) 0), ((Dtfree_f) 0), (Dtcompar_f) cmpitem, ((Dthash_f) 0), 
((Dtmemory_f) 0),
+((Dtevent_f) 0) };
+static int
+distX (box * b1, box * b2)
+{
+}
+
+static int
+intersectY0 (nitem * p, nitem * q)
+{
+}
+
+static int
+intersectY (nitem * p, nitem * q)
+{
+}
+
+static void
+mapGraphs (graph_t * g, graph_t *

[PATCH, 4.8, PR57358] Check if optimizing in parm_ref_data_preserved_p

2013-06-12 Thread Martin Jambor
Hi,

this is the simplest fix for the PR which happens because there is no
VDEF on a stmt if a particular function is not optimized.  I'd like to
fix the bug with it on the branch.  Bootstrapped and tested on
x86_64-linux.  OK?

Thanks,

Martin


2013-06-11  Martin Jambor  

PR tree-optimization/57358
* ipa-prop.c (parm_ref_data_preserved_p): Always return true when
not optimizing.

testsuite/
* gcc.dg/ipa/pr57358.c: New test.

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 53cd5ed..c62dc68 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -678,13 +678,19 @@ parm_ref_data_preserved_p (struct param_analysis_info 
*parm_ainfo,
   bool modified = false;
   ao_ref refd;
 
-  gcc_checking_assert (gimple_vuse (stmt));
   if (parm_ainfo && parm_ainfo->ref_modified)
 return false;
 
-  ao_ref_init (&refd, ref);
-  walk_aliased_vdefs (&refd, gimple_vuse (stmt), mark_modified, &modified,
- NULL);
+  if (optimize)
+{
+  gcc_checking_assert (gimple_vuse (stmt));
+  ao_ref_init (&refd, ref);
+  walk_aliased_vdefs (&refd, gimple_vuse (stmt), mark_modified, &modified,
+ NULL);
+}
+  else
+modified = true;
+
   if (parm_ainfo && modified)
 parm_ainfo->ref_modified = true;
   return !modified;
diff --git a/gcc/testsuite/gcc.dg/ipa/pr57358.c 
b/gcc/testsuite/gcc.dg/ipa/pr57358.c
new file mode 100644
index 000..c83396f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr57358.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct t { void (*func)(void*); };
+void test_func(struct t* a) __attribute__((optimize("O0")));
+void test_func(struct t* a)
+{
+  a->func(0);
+}


[gomp4] Remove some useless use omp_lib_kinds lines from omp_lib.f90

2013-06-12 Thread Jakub Jelinek
Hi!

I've noticed we have use omp_lib_kinds even in interfaces which really don't
need those, as they aren't using anything from omp_lib_kinds.

Fixed thusly, committed to gomp-4_0-branch.

2013-06-12  Jakub Jelinek  

* omp_lib.f90.in (omp_get_dynamic, omp_get_nested,
omp_in_parallel, omp_get_max_threads, omp_get_num_procs,
omp_get_num_threads, omp_get_thread_num, omp_get_thread_limit,
omp_set_max_active_levels, omp_get_max_active_levels,
omp_get_level, omp_get_ancestor_thread_num,
omp_get_team_size, omp_get_active_level, omp_in_final,
omp_get_cancellation, omp_get_default_device,
omp_get_num_devices, omp_get_num_teams, omp_get_team_num): Remove
useless use omp_lib_kinds.

--- libgomp/omp_lib.f90.in.jj   2013-04-10 12:34:26.0 +0200
+++ libgomp/omp_lib.f90.in  2013-06-12 14:13:46.094429932 +0200
@@ -129,21 +129,18 @@
 
 interface
   function omp_get_dynamic ()
-use omp_lib_kinds
 logical (4) :: omp_get_dynamic
   end function omp_get_dynamic
 end interface
 
 interface
   function omp_get_nested ()
-use omp_lib_kinds
 logical (4) :: omp_get_nested
   end function omp_get_nested
 end interface
 
 interface
   function omp_in_parallel ()
-use omp_lib_kinds
 logical (4) :: omp_in_parallel
   end function omp_in_parallel
 end interface
@@ -158,28 +155,24 @@
 
 interface
   function omp_get_max_threads ()
-use omp_lib_kinds
 integer (4) :: omp_get_max_threads
   end function omp_get_max_threads
 end interface
 
 interface
   function omp_get_num_procs ()
-use omp_lib_kinds
 integer (4) :: omp_get_num_procs
   end function omp_get_num_procs
 end interface
 
 interface
   function omp_get_num_threads ()
-use omp_lib_kinds
 integer (4) :: omp_get_num_threads
   end function omp_get_num_threads
 end interface
 
 interface
   function omp_get_thread_num ()
-use omp_lib_kinds
 integer (4) :: omp_get_thread_num
   end function omp_get_thread_num
 end interface
@@ -232,44 +225,37 @@
 
 interface
   function omp_get_thread_limit ()
-use omp_lib_kinds
 integer (4) :: omp_get_thread_limit
   end function omp_get_thread_limit
 end interface
 
 interface omp_set_max_active_levels
   subroutine omp_set_max_active_levels (max_levels)
-use omp_lib_kinds
 integer (4), intent (in) :: max_levels
   end subroutine omp_set_max_active_levels
   subroutine omp_set_max_active_levels_8 (max_levels)
-use omp_lib_kinds
 integer (8), intent (in) :: max_levels
   end subroutine omp_set_max_active_levels_8
 end interface
 
 interface
   function omp_get_max_active_levels ()
-use omp_lib_kinds
 integer (4) :: omp_get_max_active_levels
   end function omp_get_max_active_levels
 end interface
 
 interface
   function omp_get_level ()
-use omp_lib_kinds
 integer (4) :: omp_get_level
   end function omp_get_level
 end interface
 
 interface omp_get_ancestor_thread_num
   function omp_get_ancestor_thread_num (level)
-use omp_lib_kinds
 integer (4), intent (in) :: level
 integer (4) :: omp_get_ancestor_thread_num
   end function omp_get_ancestor_thread_num
   function omp_get_ancestor_thread_num_8 (level)
-use omp_lib_kinds
 integer (8), intent (in) :: level
 integer (4) :: omp_get_ancestor_thread_num_8
   end function omp_get_ancestor_thread_num_8
@@ -277,12 +263,10 @@
 
 interface omp_get_team_size
   function omp_get_team_size (level)
-use omp_lib_kinds
 integer (4), intent (in) :: level
 integer (4) :: omp_get_team_size
   end function omp_get_team_size
   function omp_get_team_size_8 (level)
-use omp_lib_kinds
 integer (8), intent (in) :: level
 integer (4) :: omp_get_team_size_8
   end function omp_get_team_size_8
@@ -290,21 +274,18 @@
 
 interface
   function omp_get_active_level ()
-use omp_lib_kinds
 integer (4) :: omp_get_active_level
   end function omp_get_active_level
 end interface
 
 interface
   function omp_in_final ()
-use omp_lib_kinds
 logical (4) :: omp_in_final
   end function omp_in_final
 end interface
 
 interface
   function omp_get_cancellation ()
-

[gomp4] Add omp_is_initial_device

2013-06-12 Thread Jakub Jelinek
Hi!

This adds another libgomp entry-point, the function is supposed to return
1/.true. if running on host and 0/.false. when running on the accelerator.
So far we always run on the host.
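
From the user's side the new entry point could be used roughly as follows. This is a sketch under assumptions rather than part of the patch: running_on_host is a made-up wrapper, and the fallback stub exists only so the snippet also builds without OpenMP 4.0 support (no -fopenmp, or an older omp.h).

```c
#include <assert.h>

#if defined (_OPENMP) && _OPENMP >= 201307
#include <omp.h>
#else
/* Fallback stub for builds without OpenMP 4.0: there is no
   accelerator, so we are always on the host.  */
static int omp_is_initial_device (void)
{
  return 1;
}
#endif

/* Hypothetical wrapper: returns 1 on the host, 0 on an accelerator.  */
int running_on_host (void)
{
  int on_host = omp_is_initial_device ();
  /* The description above promises a 0/1 result.  */
  assert (on_host == 0 || on_host == 1);
  return on_host;
}
```

With the implementation in this patch the wrapper returns 1 everywhere, since nothing is offloaded yet.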

Tested on x86_64-linux, committed to gomp-4_0-branch.

2013-06-12  Jakub Jelinek  

* fortran.c (omp_is_initial_device): Add ialias_redirect.
(omp_is_initial_device_): New function.
* omp_lib.f90.in (omp_is_initial_device): New interface.
* omp.h.in (omp_is_initial_device): New prototype.
* libgomp.map (omp_is_initial_device, omp_is_initial_device_):
Export @@OMP_4.0.
* env.c (omp_is_initial_device): New function.  Add ialias for it.
* omp_lib.h.in (omp_is_initial_device): New external.

--- libgomp/fortran.c.jj2013-04-10 12:05:38.0 +0200
+++ libgomp/fortran.c   2013-06-12 13:59:19.552731481 +0200
@@ -72,6 +72,7 @@ ialias_redirect (omp_get_default_device)
 ialias_redirect (omp_get_num_devices)
 ialias_redirect (omp_get_num_teams)
 ialias_redirect (omp_get_team_num)
+ialias_redirect (omp_is_initial_device)
 #endif
 
 #ifndef LIBGOMP_GNU_SYMBOL_VERSIONING
@@ -485,3 +486,9 @@ omp_get_team_num_ (void)
 {
   return omp_get_team_num ();
 }
+
+int32_t
+omp_is_initial_device_ (void)
+{
+  return omp_is_initial_device ();
+}
--- libgomp/omp_lib.f90.in.jj   2013-04-10 12:34:26.0 +0200
+++ libgomp/omp_lib.f90.in  2013-06-12 13:55:54.463119117 +0200
@@ -330,4 +330,10 @@
   end function omp_get_team_num
 end interface
 
+interface
+  function omp_is_initial_device ()
+logical (4) :: omp_is_initial_device
+  end function omp_is_initial_device
+end interface
+
   end module omp_lib
--- libgomp/omp.h.in.jj 2013-04-10 11:26:49.0 +0200
+++ libgomp/omp.h.in2013-06-12 13:57:10.357860035 +0200
@@ -118,6 +118,8 @@ extern int omp_get_num_devices (void) __
 extern int omp_get_num_teams (void) __GOMP_NOTHROW;
 extern int omp_get_team_num (void) __GOMP_NOTHROW;
 
+extern int omp_is_initial_device (void) __GOMP_NOTHROW;
+
 #ifdef __cplusplus
 }
 #endif
--- libgomp/libgomp.map.jj  2013-04-10 12:06:47.0 +0200
+++ libgomp/libgomp.map 2013-06-12 13:59:48.601251473 +0200
@@ -130,6 +130,8 @@ OMP_4.0 {
omp_get_num_teams_;
omp_get_team_num;
omp_get_team_num_;
+   omp_is_initial_device;
+   omp_is_initial_device_;
 } OMP_3.1;
 
 GOMP_1.0 {
--- libgomp/env.c.jj2013-04-05 17:08:01.0 +0200
+++ libgomp/env.c   2013-06-12 13:58:23.368662731 +0200
@@ -908,6 +908,12 @@ omp_get_team_num (void)
   return 0;
 }
 
+int
+omp_is_initial_device (void)
+{
+  return 1;
+}
+
 ialias (omp_set_dynamic)
 ialias (omp_set_nested)
 ialias (omp_set_num_threads)
@@ -926,3 +932,4 @@ ialias (omp_get_default_device)
 ialias (omp_get_num_devices)
 ialias (omp_get_num_teams)
 ialias (omp_get_team_num)
+ialias (omp_is_initial_device)
--- libgomp/omp_lib.h.in.jj 2013-04-10 13:02:18.0 +0200
+++ libgomp/omp_lib.h.in2013-06-12 13:56:23.717635386 +0200
@@ -92,3 +92,6 @@
   external omp_get_team_num
   integer(4) omp_get_default_device, omp_get_num_devices
   integer(4) omp_get_num_teams, omp_get_team_num
+
+  external omp_is_initial_device
+  logical(4) omp_is_initial_device

Jakub


[PATCH, trunk, PR57358] Avoid IPA-CP analysis if attribute optimize precludes it

2013-06-12 Thread Martin Jambor
Hi,

this is how I would like to fix the ICE when analyzing a function with
attribute optimize on trunk.  Inlining, the other user of the
analysis, is already smart enough not to analyze such functions, so
this teaches IPA-CP to do the same thing.  Consequently, functions
with attribute optimize(O0) or even optimize(fno-ipa-cp) will be
completely ignored by IPA-CP and friends, making them completely opaque
black boxes during the analysis.

It is a tiny bit more efficient than the simple fix for the branch
(but changes compiler behavior).  I assume it can also come in handy
during some debugging/analysis.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2013-06-11  Martin Jambor  

PR tree-optimization/57358
* ipa-prop.c (ipa_func_spec_opts_forbid_analysis_p): New function.
(ipa_compute_jump_functions_for_edge): Bail out if it returns true.
(ipa_analyze_params_uses): Generate pessimistic info when true.

testsuite
* gcc.dg/ipa/pr57358.c: New test.

Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -78,6 +78,21 @@ struct ipa_cst_ref_desc
 
 static alloc_pool ipa_refdesc_pool;
 
+/* Return true if DECL_FUNCTION_SPECIFIC_OPTIMIZATION of the decl associated
+   with NODE should prevent us from analyzing it for the purposes of IPA-CP.  
*/
+
+static bool
+ipa_func_spec_opts_forbid_analysis_p (struct cgraph_node *node)
+{
+  tree fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (node->symbol.decl);
+  struct cl_optimization *os;
+
+  if (!fs_opts)
+return false;
+  os = TREE_OPTIMIZATION (fs_opts);
+  return !os->x_optimize || !os->x_flag_ipa_cp;
+}
+
 /* Return index of the formal whose tree is PTREE in function which corresponds
to INFO.  */
 
@@ -1446,6 +1461,9 @@ ipa_compute_jump_functions_for_edge (str
 return;
   vec_safe_grow_cleared (args->jump_functions, arg_num);
 
+  if (ipa_func_spec_opts_forbid_analysis_p (cs->caller))
+return;
+
   for (n = 0; n < arg_num; n++)
 {
   struct ipa_jump_func *jfunc = ipa_get_ith_jump_func (args, n);
@@ -1936,6 +1954,17 @@ ipa_analyze_params_uses (struct cgraph_n
   if (ipa_get_param_count (info) == 0 || info->uses_analysis_done)
 return;
 
+  info->uses_analysis_done = 1;
+  if (ipa_func_spec_opts_forbid_analysis_p (node))
+{
+  for (i = 0; i < ipa_get_param_count (info); i++)
+   {
+ ipa_set_param_used (info, i, true);
+ ipa_set_controlled_uses (info, i, IPA_UNDESCRIBED_USE);
+   }
+  return;
+}
+
   for (i = 0; i < ipa_get_param_count (info); i++)
 {
   tree parm = ipa_get_param (info, i);
@@ -1992,8 +2021,6 @@ ipa_analyze_params_uses (struct cgraph_n
   visit_ref_for_mod_analysis,
   visit_ref_for_mod_analysis);
 }
-
-  info->uses_analysis_done = 1;
 }
 
 /* Free stuff in PARMS_AINFO, assume there are PARAM_COUNT parameters.  */
Index: src/gcc/testsuite/gcc.dg/ipa/pr57358.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/ipa/pr57358.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct t { void (*func)(void*); };
+void test_func(struct t* a) __attribute__((optimize("O0")));
+void test_func(struct t* a)
+{
+  a->func(0);
+}


[PATCH] Limit LTO output block-size

2013-06-12 Thread Richard Biener

This limits the block-size granularity we use for increasing the
output buffer for LTO to 2MB.  Previously it grew exponentially
without limit.  I've also increased the first block-size from 1024
to 4096 bytes.  Any comments on the particular limit?
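
To get a feeling for the numbers, the growth policy can be modelled in isolation. The sketch below reuses the two macro names from the patch, but next_block_size and blocks_needed are hypothetical helpers, and the model ignores the bytes each real block reserves for the pointer to the next block.

```c
#include <assert.h>

#define FIRST_BLOCK_SIZE 4096u
#define MAX_BLOCK_SIZE (2u * 1024u * 1024u)

/* Size of the block that follows one of CURRENT bytes: twice as big,
   but never more than MAX_BLOCK_SIZE.  */
unsigned int next_block_size (unsigned int current)
{
  unsigned int doubled;
  assert (current >= FIRST_BLOCK_SIZE && current <= MAX_BLOCK_SIZE);
  doubled = current * 2;
  return doubled < MAX_BLOCK_SIZE ? doubled : MAX_BLOCK_SIZE;
}

/* Number of blocks needed to hold TOTAL bytes under this policy
   (simplified: the whole block counts as payload).  */
unsigned int blocks_needed (unsigned long long total)
{
  unsigned int size = FIRST_BLOCK_SIZE;
  unsigned int count = 1;
  unsigned long long capacity = size;

  while (capacity < total)
    {
      size = next_block_size (size);
      capacity += size;
      count++;
    }
  return count;
}
```

In this model the cap is reached with the tenth block (4096 doubled nine times is 2MB); from then on every further block adds a constant 2MB.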

Not sure if we could even optimize away the buffer list in
favor of overcommitting memory using anon mmap (probably not
possible on 32-bit hosts).

LTO bootstrap / testing pending.

Thanks,
Richard.

2013-06-12  Richard Biener  

* lto-section-out.c (FIRST_BLOCK_SIZE): New define.
(MAX_BLOCK_SIZE): Likewise.
(lto_write_stream): Use them, cap maximum block-size at
MAX_BLOCK_SIZE.
(lto_append_block): Likewise.

Index: gcc/lto-section-out.c
===
*** gcc/lto-section-out.c   (revision 12)
--- gcc/lto-section-out.c   (working copy)
*** lto_end_section (void)
*** 152,164 
  }
  
  
  /* Write all of the chars in OBS to the assembler.  Recycle the blocks
 in obs as this is being done.  */
  
  void
  lto_write_stream (struct lto_output_stream *obs)
  {
!   unsigned int block_size = 1024;
struct lto_char_ptr_base *block;
struct lto_char_ptr_base *next_block;
if (!obs->first_block)
--- 152,170 
  }
  
  
+ /* We exponentially grow the size of the blocks as we need to make
+room for more data to be written.  Start with a single page and go up
+to 2MB pages for this.  */
+ #define FIRST_BLOCK_SIZE 4096
+ #define MAX_BLOCK_SIZE (2 * 1024 * 1024)
+ 
  /* Write all of the chars in OBS to the assembler.  Recycle the blocks
 in obs as this is being done.  */
  
  void
  lto_write_stream (struct lto_output_stream *obs)
  {
!   unsigned int block_size = FIRST_BLOCK_SIZE;
struct lto_char_ptr_base *block;
struct lto_char_ptr_base *next_block;
if (!obs->first_block)
*** lto_write_stream (struct lto_output_stre
*** 188,193 
--- 194,200 
else
lang_hooks.lto.append_data (base, num_chars, block);
block_size *= 2;
+   block_size = MIN (MAX_BLOCK_SIZE, block_size);
  }
  }
  
*** lto_append_block (struct lto_output_stre
*** 205,211 
  {
/* This is the first time the stream has been written
 into.  */
!   obs->block_size = 1024;
new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
obs->first_block = new_block;
  }
--- 212,218 
  {
/* This is the first time the stream has been written
 into.  */
!   obs->block_size = FIRST_BLOCK_SIZE;
new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
obs->first_block = new_block;
  }
*** lto_append_block (struct lto_output_stre
*** 215,220 
--- 222,228 
/* Get a new block that is twice as big as the last block
 and link it into the list.  */
obs->block_size *= 2;
+   obs->block_size = MIN (MAX_BLOCK_SIZE, obs->block_size);
new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
/* The first bytes of the block are reserved as a pointer to
 the next block.  Set the chain of the full block to the


[gomp4] Introduce thread_limit clause

2013-06-12 Thread Jakub Jelinek
Hi!

The num_threads clause on the #pragma omp teams construct has been replaced
with a new thread_limit clause.  Changed for C++ FE thusly:

2013-06-12  Jakub Jelinek  

* gimplify.c (gimplify_scan_omp_clauses): Handle
OMP_CLAUSE_THREAD_LIMIT.
* tree-pretty-print.c (dump_omp_clause): Likewise.
* tree.c (omp_clause_num_ops, omp_clause_code_name): Add entries for
OMP_CLAUSE_THREAD_LIMIT.
* tree.h (enum omp_clause_code): Add OMP_CLAUSE_THREAD_LIMIT.
(OMP_CLAUSE_THREAD_LIMIT_EXPR): Define.
cp/
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_THREAD_LIMIT.
* parser.c (cp_parser_omp_clause_name): Handle thread_limit clause.
(cp_parser_omp_clause_thread_limit): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
(OMP_TEAMS_CLAUSE_MASK): Replace PRAGMA_OMP_CLAUSE_NUM_THREADS
with PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_THREAD_LIMIT.
c-family/
* c-pragma.h (enum pragma_omp_clause): Add
PRAGMA_OMP_CLAUSE_THREAD_LIMIT.

--- gcc/gimplify.c.jj   2013-06-12 11:53:15.0 +0200
+++ gcc/gimplify.c  2013-06-12 12:16:51.123583143 +0200
@@ -6420,6 +6420,7 @@ gimplify_scan_omp_clauses (tree *list_p,
case OMP_CLAUSE_SCHEDULE:
case OMP_CLAUSE_NUM_THREADS:
case OMP_CLAUSE_NUM_TEAMS:
+   case OMP_CLAUSE_THREAD_LIMIT:
case OMP_CLAUSE_DIST_SCHEDULE:
case OMP_CLAUSE_DEVICE:
  if (gimplify_expr (&OMP_CLAUSE_OPERAND (c, 0), pre_p, NULL,
--- gcc/tree.c.jj   2013-06-12 11:53:15.0 +0200
+++ gcc/tree.c  2013-06-12 12:15:58.026440954 +0200
@@ -257,6 +257,7 @@ unsigned const char omp_clause_num_ops[]
   0, /* OMP_CLAUSE_INBRANCH  */
   0, /* OMP_CLAUSE_NOTINBRANCH  */
   1, /* OMP_CLAUSE_NUM_TEAMS  */
+  1, /* OMP_CLAUSE_THREAD_LIMIT  */
   0, /* OMP_CLAUSE_PROC_BIND  */
   1, /* OMP_CLAUSE_SAFELEN  */
   1, /* OMP_CLAUSE_SIMDLEN  */
@@ -298,6 +299,7 @@ const char * const omp_clause_code_name[
   "inbranch",
   "notinbranch",
   "num_teams",
+  "thread_limit",
   "proc_bind",
   "safelen",
   "simdlen",
@@ -11014,6 +11016,7 @@ walk_tree_1 (tree *tp, walk_tree_fn func
case OMP_CLAUSE_UNIFORM:
case OMP_CLAUSE_DEPEND:
case OMP_CLAUSE_NUM_TEAMS:
+   case OMP_CLAUSE_THREAD_LIMIT:
case OMP_CLAUSE_DEVICE:
case OMP_CLAUSE_DIST_SCHEDULE:
case OMP_CLAUSE_SAFELEN:
--- gcc/cp/semantics.c.jj   2013-06-04 20:55:56.0 +0200
+++ gcc/cp/semantics.c  2013-06-12 14:46:18.251419189 +0200
@@ -4779,6 +4779,25 @@ finish_omp_clauses (tree clauses)
}
  break;
 
+   case OMP_CLAUSE_THREAD_LIMIT:
+ t = OMP_CLAUSE_THREAD_LIMIT_EXPR (c);
+ if (t == error_mark_node)
+   remove = true;
+ else if (!type_dependent_expression_p (t)
+  && !INTEGRAL_TYPE_P (TREE_TYPE (t)))
+   {
+ error ("% expression must be integral");
+ remove = true;
+   }
+ else
+   {
+ t = mark_rvalue_use (t);
+ if (!processing_template_decl)
+   t = fold_build_cleanup_point_expr (TREE_TYPE (t), t);
+ OMP_CLAUSE_THREAD_LIMIT_EXPR (c) = t;
+   }
+ break;
+
case OMP_CLAUSE_DEVICE:
  t = OMP_CLAUSE_DEVICE_ID (c);
  if (t == error_mark_node)
--- gcc/cp/parser.c.jj  2013-06-04 20:55:56.0 +0200
+++ gcc/cp/parser.c 2013-06-12 12:22:01.239604269 +0200
@@ -26225,6 +26225,8 @@ cp_parser_omp_clause_name (cp_parser *pa
case 't':
  if (!strcmp ("taskgroup", p))
result = PRAGMA_OMP_CLAUSE_TASKGROUP;
+ else if (!strcmp ("thread_limit", p))
+   result = PRAGMA_OMP_CLAUSE_THREAD_LIMIT;
  else if (!strcmp ("to", p))
result = PRAGMA_OMP_CLAUSE_TO;
  break;
@@ -26892,6 +26900,36 @@ cp_parser_omp_clause_num_teams (cp_parse
 }
 
 /* OpenMP 4.0:
+   thread_limit ( expression ) */
+
+static tree
+cp_parser_omp_clause_thread_limit (cp_parser *parser, tree list,
+  location_t location)
+{
+  tree t, c;
+
+  if (!cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
+return list;
+
+  t = cp_parser_expression (parser, false, NULL);
+
+  if (t == error_mark_node
+  || !cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN))
+cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true,
+  /*or_comma=*/false,
+  /*consume_paren=*/true);
+
+  check_no_duplicate_clause (list, OMP_CLAUSE_THREAD_LIMIT,
+"thread_limit", location);
+
+  c = build_omp_clause (location, OMP_CLAUSE_THREAD_LIMIT);
+  OMP_CLAUSE_THREAD_LIMIT_EXPR (c) = t;
+  OMP_CLAUSE_CHAIN (c) = list;
+
+  return c;
+}
+
+/* OpenMP 4.0:
aligned ( variable-list )
aligned ( va

[Patch, Fortran] Print floating-point exception status after STOP/ERROR STOP

2013-06-12 Thread Tobias Burnus

David: Can you have a look at libgfortran/config/fpu-aix.h - Thanks!
Uros: Can you have a look at libgfortran/config/fpu-387.h - Thanks!


The attached patch causes gfortran-compiled programs to print warnings like

Note: The following floating-point status flag is signalling: 
IEEE_DIVIDE_BY_ZERO


when STOP / ERROR STOP is invoked. That's required by Fortran 2008 (8.4 
STOP and ERROR STOP statements):


"If any exception (14) is signaling on that image, the processor shall 
issue a warning indicating which exceptions are signaling; this warning 
shall be on the unit identified by the named constant ERROR UNIT 
(13.8.2.8)."



From the J3 discussion at 
http://mailman.j3-fortran.org/pipermail/j3/2013-June/006452.html
* sunf77 shows this message - and users complained - even though it didn't
report inexact exceptions
* Intel: There is an option to dis-/enable the warning (-assume
[no]fpe_summary; default: no warning)
* NAG: Never reports inexact. Only underflow handling has a compiler
option (as users complained; -no_underflow_warning)
* PGI reports all (denorm, underflow, inexact) by default (and seemingly
no compiler option exists)

* Cray: I couldn't find a compiler option to turn the warning on.

The patch below follows NAG by always printing the warning, but the 
underflow warning can be disabled. (It also always ignores denormalized 
status flags.)
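
To make the proposed policy concrete, here is a rough C sketch of the flag selection only: inexact is never listed and underflow is listed only when the option allows it. This is not the report_exception code from the patch; describe_fp_flags and all flag names other than IEEE_DIVIDE_BY_ZERO are my own placeholders, and the FE_* masks come from the standard <fenv.h>.

```c
#include <fenv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libgfortran): writes the names of the
   flags set in MASK into BUF, separated by spaces.  Inexact is never
   listed; underflow is listed only if UNDERFLOW_WARNING is nonzero.
   Returns the number of flags listed.  */
int
describe_fp_flags (int mask, int underflow_warning, char *buf, size_t size)
{
  static const struct { int flag; const char *name; } table[] = {
    { FE_INVALID,   "IEEE_INVALID" },
    { FE_DIVBYZERO, "IEEE_DIVIDE_BY_ZERO" },
    { FE_OVERFLOW,  "IEEE_OVERFLOW" },
    { FE_UNDERFLOW, "IEEE_UNDERFLOW" },
  };
  int count = 0;
  size_t i;

  if (size == 0)
    return 0;
  buf[0] = '\0';
  for (i = 0; i < sizeof table / sizeof table[0]; i++)
    {
      if (!(mask & table[i].flag))
        continue;
      if (table[i].flag == FE_UNDERFLOW && !underflow_warning)
        continue;
      if (count > 0)
        strncat (buf, " ", size - strlen (buf) - 1);
      strncat (buf, table[i].name, size - strlen (buf) - 1);
      count++;
    }
  return count;
}
```

A caller would presumably pass the result of fetestexcept (FE_ALL_EXCEPT) as MASK and print the "Note: ..." line only when the return value is nonzero.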


One surely could extend it to allow disabling the warning completely -
or to make it more fine-grained, like "none", "all" plus all single flags
(including underflow, denormal and inexact, where by default one leaves
out inexact).


Comments?


Test case:

real, volatile :: r
! Divide by zero:
r = 0
r = 1.0/r
! Underflow:
!r = tiny(r)
!r = r/100.
stop
end


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2013-06-12  Tobias Burnus  

	* gfortran.h (gfc_option_t): Add flag_underflow_warning.
	* gfortran.texi (_gfortran_set_options): Update.
	* invoke.texi (-funderflow-warning): Add doc.
	* lang.opt (fno-underflow-warning): Add flag.
	* options.c (gfc_init_options, gfc_handle_option): Handle it.
	* trans-decl.c (create_main_function): Update
	_gfortran_set_options call.

2013-06-12  Tobias Burnus  

	* libgfortran.h (compile_options_t): Add underflow_warning.
	(get_fpu_except_flags): New prototype.
	* runtime/compile_options.c (set_options, init_compile_options):
	Handle underflow_warning.
	* runtime/stop.c (report_exception): New function.
	(stop_numeric, stop_numeric_f08, stop_string, error_stop_string,
	error_stop_numeric): Call it.
	* config/fpu-387.h (get_fpu_except_flags): New function.
	* config/fpu-aix.h (get_fpu_except_flags): New function.
	* config/fpu-generic.h (get_fpu_except_flags): New function.
	* config/fpu-glibc.h (get_fpu_except_flags): New function.
	* config/fpu-glibc.h (get_fpu_except_flags): New function.
	* configure.ac: Check for fpxcp.h.
	* configure: Regenerate.
	* config.h.in: Regenerate.

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 14da0af..28b47ac 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2299,6 +2299,7 @@ typedef struct
   int flag_align_commons;
   int flag_protect_parens;
   int flag_realloc_lhs;
+  int flag_underflow_warning;
   int flag_aggressive_function_elimination;
   int flag_frontend_optimize;
 
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 4a31a77..d8ed562 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -2855,13 +2855,16 @@ Default: enabled.
 are (bitwise or-ed): GFC_RTCHECK_BOUNDS (1), GFC_RTCHECK_ARRAY_TEMPS (2),
 GFC_RTCHECK_RECURSION (4), GFC_RTCHECK_DO (16), GFC_RTCHECK_POINTER (32).
 Default: disabled.
+@item @var{option}[7] @tab Unused.
+@item @var{option}[8] @tab Show a warning when invoking @code{STOP} and
+@code{ERROR STOP} if a floating-point underflow occurred.
 @end multitable
 
 @item @emph{Example}:
 @smallexample
-  /* Use gfortran 4.8 default options.  */
-  static int options[] = @{68, 511, 0, 0, 1, 1, 0@};
-  _gfortran_set_options (7, &options);
+  /* Use gfortran 4.9 default options.  */
+  static int options[] = @{68, 511, 0, 0, 1, 1, 0, 0, 1@};
+  _gfortran_set_options (9, &options);
 @end smallexample
 @end table
 
diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index 12c200e..63a1ffb 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -1157,8 +1157,17 @@ negative in the @code{SIGN} intrinsic.  @option{-fno-sign-zero} does not
 print the negative sign of zero values (or values rounded to zero for I/O)
 and regards zero as positive number in the @code{SIGN} intrinsic for
 compatibility with Fortran 77. The default is @option{-fsign-zero}.
+
+@item -fno-underflow-warning
+@opindex @code{funderflow-warning}
+When @code{STOP} and @code{ERROR STOP} is invoked, a warning is printed to
+@code{ERROR_UNIT} if a floating-point status flag is set (other than inexact).
+When @option{-fno-underflow-warning} is set, no warning is shown if a
+floating-poi

[gomp4] Small OpenMP 4.0 post-RC2 tweaks

2013-06-12 Thread Jakub Jelinek
Hi!

1) reference types in map/to/from clauses are supposed to map/copy what
those references refer to, so the reference var itself doesn't need to be
addressable.  We'll need to remap the reference variable to a new reference
that will refer to the corresponding device object.
2) the spec now allows more than one to/from clauses, and the clauses don't
need to be first, restriction is that there must be at least one to/from
clause on #pragma omp target update
3) [ expression ] syntax is now freely interchangeable with
[ expression[opt] : expression[opt] ] syntax, a[3] in the clauses is
considered array element, while a[:][3][1:8] array section, but as both
array elements and array sections are allowed at the same spots, it is
easiest to parse [ expression ] the same as [ expression : 1 ].
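
Point 3 can be illustrated with a small standalone C sketch that works on the text between the brackets. It is only a toy model of the rule "[ expression ] is parsed like [ expression : 1 ]"; struct section and parse_subscript are made-up names, and the real parser of course builds trees rather than longs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one parsed subscript.  */
struct section
{
  int has_low, has_length;
  long low, length;
};

/* Parse the text between '[' and ']'.  Returns 0 on success and -1 on a
   syntax error.  */
int
parse_subscript (const char *text, struct section *out)
{
  const char *colon = strchr (text, ':');
  char *end;

  memset (out, 0, sizeof *out);
  if (colon == NULL)
    {
      /* Plain element such as a[3]: treated like the section a[3:1].  */
      out->low = strtol (text, &end, 10);
      if (end == text || *end != '\0')
        return -1;
      out->has_low = 1;
      out->length = 1;
      out->has_length = 1;
      return 0;
    }
  if (colon != text)
    {
      /* Optional lower bound before the colon.  */
      out->low = strtol (text, &end, 10);
      if (end != colon)
        return -1;
      out->has_low = 1;
    }
  if (colon[1] != '\0')
    {
      /* Optional length after the colon.  */
      out->length = strtol (colon + 1, &end, 10);
      if (end == colon + 1 || *end != '\0')
        return -1;
      out->has_length = 1;
    }
  return 0;
}
```

With this, "3" and "3:1" give the same low/length pair, while ":" leaves both parts absent.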

More tweaks to follow.

2013-06-12  Jakub Jelinek  

* semantics.c (finish_omp_clause): Don't mark references addressable.
For OMP_CLAUSE_{TO,FROM} detect same decl appearing more than once
in motion clauses.
* parser.c (cp_parser_omp_var_list_no_open): Handle [ expression ]
notation in array section specification.
(cp_parser_omp_all_clauses): Don't require to/from clauses to be
first.
(cp_parser_omp_target_update): Adjust diagnostics.
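
For readers not familiar with the bitmap idiom used for the "appears more than once" check, a minimal standalone sketch follows. It assumes a fixed upper bound on the ids and uses made-up names (uid_set, uid_set_add); GCC's own bitmap type is sparse and has a different interface.

```c
#include <limits.h>
#include <string.h>

#define MAX_UID 1024  /* arbitrary bound chosen for this sketch */

struct uid_set
{
  unsigned char bits[MAX_UID / CHAR_BIT];
};

void
uid_set_clear (struct uid_set *s)
{
  memset (s->bits, 0, sizeof s->bits);
}

/* Returns 1 if UID was newly recorded, 0 if it was already present (the
   duplicate case that would be diagnosed), -1 if UID is out of range.  */
int
uid_set_add (struct uid_set *s, unsigned uid)
{
  unsigned char mask;

  if (uid >= MAX_UID)
    return -1;
  mask = (unsigned char) (1u << (uid % CHAR_BIT));
  if (s->bits[uid / CHAR_BIT] & mask)
    return 0;
  s->bits[uid / CHAR_BIT] |= mask;
  return 1;
}
```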

--- gcc/cp/semantics.c.jj   2013-06-04 20:55:56.0 +0200
+++ gcc/cp/semantics.c  2013-06-12 14:46:18.251419189 +0200
@@ -4925,8 +4944,18 @@ finish_omp_clauses (tree clauses)
  remove = true;
}
  else if (!processing_template_decl
+  && TREE_CODE (TREE_TYPE (t)) != REFERENCE_TYPE
   && !cxx_mark_addressable (t))
remove = true;
+ else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
+   break;
+ else if (bitmap_bit_p (&generic_head, DECL_UID (t)))
+   {
+ error ("%qD appears more than once in motion clauses", t);
+ remove = true;
+   }
+ else
+   bitmap_set_bit (&generic_head, DECL_UID (t));
  break;
 
case OMP_CLAUSE_UNIFORM:
--- gcc/cp/parser.c.jj  2013-06-04 20:55:56.0 +0200
+++ gcc/cp/parser.c 2013-06-12 12:22:01.239604269 +0200
@@ -26330,13 +26332,19 @@ cp_parser_omp_var_list_no_open (cp_parse
  if (!colon)
parser->colon_corrects_to_scope_p
  = saved_colon_corrects_to_scope_p;
- /* Look for `:'.  */
- if (!cp_parser_require (parser, CPP_COLON, RT_COLON))
-   goto skip_comma;
- if (!cp_lexer_next_token_is (parser->lexer,
-  CPP_CLOSE_SQUARE))
-   length = cp_parser_expression (parser, /*cast_p=*/false,
-  NULL);
+ if (cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_SQUARE))
+   length = integer_one_node;
+ else
+   {
+ /* Look for `:'.  */
+ if (!cp_parser_require (parser, CPP_COLON, RT_COLON))
+   goto skip_comma;
+ if (!cp_lexer_next_token_is (parser->lexer,
+  CPP_CLOSE_SQUARE))
+   length = cp_parser_expression (parser,
+  /*cast_p=*/false,
+  NULL);
+   }
  /* Look for the closing `]'.  */
  if (!cp_parser_require (parser, CPP_CLOSE_SQUARE,
  RT_CLOSE_SQUARE))
@@ -27409,15 +27447,11 @@ cp_parser_omp_all_clauses (cp_parser *pa
  clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_TO,
clauses);
  c_name = "to";
- if (!first)
-   goto clause_not_first;
  break;
case PRAGMA_OMP_CLAUSE_FROM:
  clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_FROM,
clauses);
  c_name = "from";
- if (!first)
-   goto clause_not_first;
  break;
case PRAGMA_OMP_CLAUSE_UNIFORM:
  clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_UNIFORM,
@@ -29128,7 +29167,7 @@ cp_parser_omp_target_update (cp_parser *
   && find_omp_clause (clauses, OMP_CLAUSE_FROM) == NULL_TREE)
 {
   error_at (pragma_tok->location,
-   "%<#pragma omp target update must contain either "
+   "%<#pragma omp target update must contain at least one "
"% or % clauses");
   return false;
 }

Jakub


Re: *PING* / Re: [Patch, Fortran] Finalize nonallocatables with INTENT(out)

2013-06-12 Thread Tobias Burnus

Thanks Dominique and Andreas for reporting this issue.

Dominique Dhumieres wrote:

The test gfortran.dg/finalize_10.f90 fails in 32 bit mode [...]
The following patch fixes it
[...]
  
I have tried to weaken the test by not using any target and using a regexp
of the kind "(int|long)", but I did not succeed.


Seemingly, scan-tree-dump-times does not work with regular expressions. I
have replaced it with scan-tree-dump + regular expression.


Committed as 23.

Tobias
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 22)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2013-06-12  Tobias Burnus  
+	Dominique d'Humieres  
+
+	* gfortran.dg/finalize_10.f90: Update dg-dump.
+
 2013-06-12  Jakub Jelinek  
 
 	PR target/56564
Index: gcc/testsuite/gfortran.dg/finalize_10.f90
===
--- gcc/testsuite/gfortran.dg/finalize_10.f90	(revision 22)
+++ gcc/testsuite/gfortran.dg/finalize_10.f90	(working copy)
@@ -26,7 +26,7 @@
 
 ! Finalize CLASS + set default init
 ! { dg-final { scan-tree-dump-times "y->_vptr->_final \\(&desc.\[0-9\]+, y->_vptr->_size, 0\\);" 1 "original" } }
-! { dg-final { scan-tree-dump-times "__builtin_memcpy \\(\\(void .\\) y->_data, \\(void .\\) y->_vptr->_def_init, \\(unsigned long\\) y->_vptr->_size\\);" 1 "original" } }
+! { dg-final { scan-tree-dump   "__builtin_memcpy \\(\\(void .\\) y->_data, \\(void .\\) y->_vptr->_def_init, \\(unsigned (long|int)\\) y->_vptr->_size\\);" "original" } }
 ! { dg-final { scan-tree-dump-times "x->_vptr->_final \\(&x->_data, x->_vptr->_size, 0\\);" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "x->_vptr->_copy \\(x->_vptr->_def_init, &x->_data\\);" 1 "original" } }
 


Re: *PING* / Re: [Patch, Fortran] Finalize nonallocatables with INTENT(out)

2013-06-12 Thread Tobias Burnus

Tobias Burnus wrote:

Dominique Dhumieres wrote:
I have tried to weaken the test by not using any target and using a regexp
of the kind "(int|long)", but I did not succeed.


Oops, I missed that Dominique's and Andreas' 32-bit dumps are different 
("unsigned int" vs. "character(kind=4)"). Thus, the new pattern accepts 
either version. Committed as 26.
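
As a quick sanity check of the alternation itself, the same pattern can be tried outside DejaGnu. The sketch below uses POSIX extended regular expressions from <regex.h>, which is an assumption on my side: the testsuite uses Tcl regexps, but for this simple alternation the two should agree. matches_size_cast is a made-up name.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if LINE contains the cast in one of the three accepted
   spellings, 0 if not, and -1 if the pattern fails to compile.  */
int
matches_size_cast (const char *line)
{
  static const char pattern[] =
    "\\((unsigned long|unsigned int|character\\(kind=4\\))\\) y->_vptr->_size";
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```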


Tobias
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(revision 23)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,7 +1,11 @@
 2013-06-12  Tobias Burnus  
+
+	* gfortran.dg/finalize_10.f90: Update scan-tree-dump.
+
+2013-06-12  Tobias Burnus  
 	Dominique d'Humieres  
 
-	* gfortran.dg/finalize_10.f90: Update dg-dump.
+	* gfortran.dg/finalize_10.f90: Update scan-tree-dump.
 
 2013-06-12  Jakub Jelinek  
 
Index: gcc/testsuite/gfortran.dg/finalize_10.f90
===
--- gcc/testsuite/gfortran.dg/finalize_10.f90	(revision 23)
+++ gcc/testsuite/gfortran.dg/finalize_10.f90	(working copy)
@@ -26,7 +26,7 @@
 
 ! Finalize CLASS + set default init
 ! { dg-final { scan-tree-dump-times "y->_vptr->_final \\(&desc.\[0-9\]+, y->_vptr->_size, 0\\);" 1 "original" } }
-! { dg-final { scan-tree-dump   "__builtin_memcpy \\(\\(void .\\) y->_data, \\(void .\\) y->_vptr->_def_init, \\(unsigned (long|int)\\) y->_vptr->_size\\);" "original" } }
+! { dg-final { scan-tree-dump   "__builtin_memcpy \\(\\(void .\\) y->_data, \\(void .\\) y->_vptr->_def_init, \\((unsigned long|unsigned int|character\\(kind=4\\))\\) y->_vptr->_size\\);" "original" } }
 ! { dg-final { scan-tree-dump-times "x->_vptr->_final \\(&x->_data, x->_vptr->_size, 0\\);" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "x->_vptr->_copy \\(x->_vptr->_def_init, &x->_data\\);" 1 "original" } }
 


Re: [C++ Patch] PR 42021

2013-06-12 Thread Jason Merrill

OK, but please add a comment.

Jason


Re: [RFC] Implement Undefined Behavior Sanitizer (take 2)

2013-06-12 Thread Marek Polacek
On Tue, Jun 11, 2013 at 10:44:12PM +0200, Jakub Jelinek wrote:
> There is another thing to solve BTW, op0 and/or op1 might have side-effects,
> if you are going to evaluate them more than once, they need to be surrounded
> into cp_save_expr resp. c_save_expr.

There's that unpleasant thing that cp_save_expr is declared in
cp/cp-tree.h, but we don't want to include cp/*.h or c/*.h files
in c-family/c-ubsan.c.  Should I use save_expr from tree.c instead?
I seem to recall that that isn't the best thing to do...

Marek


Re: [RFC] Implement Undefined Behavior Sanitizer (take 2)

2013-06-12 Thread Jakub Jelinek
On Wed, Jun 12, 2013 at 03:48:24PM +0200, Marek Polacek wrote:
> On Tue, Jun 11, 2013 at 10:44:12PM +0200, Jakub Jelinek wrote:
> > There is another thing to solve BTW, op0 and/or op1 might have side-effects,
> > if you are going to evaluate them more than once, they need to be surrounded
> > into cp_save_expr resp. c_save_expr.
> 
> There's that unpleasant thing that cp_save_expr is declared in
> cp/cp-tree.h, but we don't want to include cp/*.h or c/*.h files
> in c-family/c-ubsan.c.  Should I use save_expr from tree.c instead?
> I seem to recall that that isn't the best thing to do...

No, you really need to use the cp_save_expr/c_save_expr, especially for
C it e.g. fully folds etc.  You want to call that in
cp_build_binary_op etc., also because you want both the instrument_expr
itself, but also the original binary expression to use the SAVE_EXPRs if
they are created.
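
The underlying problem is easier to see at the source level than on trees. The following C sketch is only an analogy, not GCC code: the two locals play the role of the SAVE_EXPRs, so that the check and the division both use the saved values and each operand is evaluated once. checked_div, counted_operand and evaluations are made-up names.

```c
#include <limits.h>

/* Counts how often an operand has been evaluated.  */
int evaluations;

/* Operand with a side effect: yields 8 on odd calls and 2 on even calls.  */
int
counted_operand (void)
{
  static const int values[] = { 8, 2 };
  return values[evaluations++ % 2];
}

/* Evaluates each operand exactly once and reuses the saved values for both
   the check and the division.  Sets *OK to 0 and returns 0 when the
   division would be undefined.  */
int
checked_div (int (*lhs) (void), int (*rhs) (void), int *ok)
{
  int a = lhs ();   /* saved once, like a SAVE_EXPR */
  int b = rhs ();

  *ok = b != 0 && !(a == INT_MIN && b == -1);
  return *ok ? a / b : 0;
}
```

If the check re-evaluated lhs () and rhs () instead of using a and b, the counter would end up at 4 and the division would see different values than the check did.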

Jakub


Re: SPE detection broken on Linux (bits/predefs.h: No such file or directory)

2013-06-12 Thread Matthias Klose
On 12.06.2013 13:05, Olivier Hainque wrote:
> 
> On Jun 11, 2013, at 16:50 , David Edelsohn  wrote:
>>> I solved this in gcc/config/rs6000/t-linux by replacing the line
>>>
>>> MULTIARCH_DIRNAME = powerpc-linux-gnuspe$(if $(findstring
>>> rs6000/e500-double.h, $(tm_file_list)),,v1)
>>>
>>> with
>>>
>>> MULTIARCH_DIRNAME = powerpc-linux-gnuspe$(if $(findstring
>>> 8548,$(with_cpu)),,v1)
>>
>> Olivier was the person who removed e500-double.h and added 8548
>> support.  I would like to hear his and Eric's comment since they seem
>> to be doing the most work on e500 at the moment.
> 
>  The suggested update is in line with this part of the
>  change we did at the time:
[...]
>  so looks correct to me. Thanks!

committed the following patch on behalf of Roland to trunk.

  Matthias

2013-06-12  Roland Stigge 

* config/rs6000/t-linux (MULTIARCH_DIRNAME): Fix SPE version detection.

Index: config/rs6000/t-linux
===
--- config/rs6000/t-linux   (revision 28)
+++ config/rs6000/t-linux   (working copy)
@@ -2,7 +2,7 @@
 # or soft-float.
 ifeq (,$(filter $(with_cpu),$(SOFT_FLOAT_CPUS))$(findstring 
soft,$(with_float)))
 ifneq (,$(findstring spe,$(target)))
-MULTIARCH_DIRNAME = powerpc-linux-gnuspe$(if $(findstring 
rs6000/e500-double.h, $(tm_file_list)),,v1)
+MULTIARCH_DIRNAME = powerpc-linux-gnuspe$(if $(findstring 
8548,$(with_cpu)),,v1)
 else
 MULTIARCH_DIRNAME = powerpc-linux-gnu
 endif


Re: [Patch, Fortran] Print floating-point exception status after STOP/ERROR STOP

2013-06-12 Thread Uros Bizjak
On Wed, Jun 12, 2013 at 3:05 PM, Tobias Burnus  wrote:
> David: Can you have a look at libgfortran/config/fpu-aix.h - Thanks!
> Uros: Can you have a look at libgfortran/config/fpu-387.h - Thanks!
>

+  unsigned short cw;
+
+  __asm__ ("fnstsw %0" : "=a" (cw));

__asm__ __volatile__ ("fnstsw\t%0" : "=a" (cw));

fnstsw uses processor state (x87 status word) that is hidden to gcc,
so it needs to be __volatile__.

+  if (has_sse())
+{
+  unsigned int cw_sse;
+  __asm__ ("stmxcsr %0" : "=m" (*&cw_sse));

also __asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw_sse));

%v will conditionally emit "v" prefix for TARGET_AVX.

+  cw |= cw_sse;
+}

Looks OK otherwise.
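
For illustration, a standalone variant of the two reads with the volatile qualifier could look like the sketch below. It is restricted to x86-64 with a GNU-compatible compiler and returns 0 elsewhere; read_fp_status is a made-up name and this is not the fpu-387.h code from the patch. The %v prefix is left out here, since I am not certain it is accepted outside GCC's own sources.

```c
/* Returns the pending x87 and SSE exception flags (bits 0-5 of the x87
   status word and of MXCSR), or 0 on targets this sketch does not cover.  */
unsigned int
read_fp_status (void)
{
#if defined (__x86_64__) && defined (__GNUC__)
  unsigned short sw;
  unsigned int mxcsr;

  /* volatile: both reads depend on processor state the compiler cannot
     see, so it must not delete them or reuse an earlier result.  */
  __asm__ __volatile__ ("fnstsw\t%0" : "=a" (sw));
  __asm__ __volatile__ ("stmxcsr\t%0" : "=m" (mxcsr));
  return (sw | mxcsr) & 0x3f;
#else
  return 0;
#endif
}
```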

Thanks,
Uros.


Re: [C++ Patch] PR 42021

2013-06-12 Thread Paolo Carlini

On 06/12/2013 03:38 PM, Jason Merrill wrote:

OK, but please add a comment.

Thanks. I added this:

  // cp_parser_lookup_name has the same diagnostic,
  // thus make sure to emit it almost once.

Paolo.


[patch] set MULTIARCH_DIRNAME for multilib architectures

2013-06-12 Thread Matthias Klose
[CCing port maintainers]

Currently the MULTIARCH_DIRNAME is not correctly set for the x86_64-kfreebsd-gnu
target, and is not set at all for architectures which do have multilib
configurations by default.  This patch makes sure that MULTIARCH_DIRNAME is
always set to the default multilib configuration for these multilib targets.

I am using this macro in a local patch which installs the host specific C++
headers to /usr/include/$(MULTIARCH_DIRNAME)/c++/4.x.y instead of
$(gcc_gxx_include_dir)/$(target_noncanonical).
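
In case the make expressions in the patch are hard to read: for sparc the rule is just "append 64 if the target name contains 64". Below is the same selection written in C, purely as an illustration and assuming multiarch is enabled (if_multiarch then expands to its argument); sparc_multiarch_dirname is a made-up name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical C rendering of
     $(call if_multiarch,sparc$(if $(findstring 64,$(target)),64)-linux-gnu)
   Writes the directory name into BUF and returns snprintf's result.  */
int
sparc_multiarch_dirname (const char *target, char *buf, size_t size)
{
  const char *suffix = strstr (target, "64") != NULL ? "64" : "";

  return snprintf (buf, size, "sparc%s-linux-gnu", suffix);
}
```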

Ok for the trunk?

  Matthias
2013-06-12  Matthias Klose  

* config/i386/t-linux64: Set MULTIARCH_DIRNAME.
* config/i386/t-kfreebsd: Set MULTIARCH_DIRNAME.
* config.gcc (i[34567]86-*-linux* | x86_64-*-linux*): Prepend
i386/t-linux to $tmake_file.
* config/mips/t-linux64: Set MULTIARCH_DIRNAME.
* config/rs6000/t-linux64: Set MULTIARCH_DIRNAME.
* config/s390/t-linux64: Set MULTIARCH_DIRNAME.
* config/sparc/t-linux64: Set MULTIARCH_DIRNAME.

Index: config/sparc/t-linux64
===
--- config/sparc/t-linux64  (revision 200012)
+++ config/sparc/t-linux64  (working copy)
@@ -27,3 +27,5 @@
 MULTILIB_DIRNAMES = 64 32
 MULTILIB_OSDIRNAMES = ../lib64$(call if_multiarch,:sparc64-linux-gnu)
 MULTILIB_OSDIRNAMES += $(if $(wildcard $(shell echo 
$(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call 
if_multiarch,:sparc-linux-gnu)
+
+MULTIARCH_DIRNAME = $(call if_multiarch,sparc$(if $(findstring 
64,$(target)),64)-linux-gnu)
Index: config/mips/t-linux64
===
--- config/mips/t-linux64   (revision 200012)
+++ config/mips/t-linux64   (working copy)
@@ -24,3 +24,13 @@
../lib32$(call 
if_multiarch,:mips64$(MIPS_EL)-linux-gnuabin32$(MIPS_SOFT)) \
../lib$(call if_multiarch,:mips$(MIPS_EL)-linux-gnu$(MIPS_SOFT)) \
../lib64$(call 
if_multiarch,:mips64$(MIPS_EL)-linux-gnuabi64$(MIPS_SOFT))
+
+ifneq (,$(findstring abin32,$(target)))
+MULTIARCH_DIRNAME = $(call 
if_multiarch,mips64$(MIPS_EL)-linux-gnuabin32$(MIPS_SOFT))
+else
+ifneq (,$(findstring abi64,$(target)))
+MULTIARCH_DIRNAME = $(call 
if_multiarch,mips64$(MIPS_EL)-linux-gnuabi64$(MIPS_SOFT))
+else
+MULTIARCH_DIRNAME = $(call if_multiarch,mips$(MIPS_EL)-linux-gnu$(MIPS_SOFT))
+endif
+endif
Index: config/rs6000/t-linux64
===
--- config/rs6000/t-linux64 (revision 200012)
+++ config/rs6000/t-linux64 (working copy)
@@ -30,3 +30,5 @@
 MULTILIB_EXTRA_OPTS = fPIC
 MULTILIB_OSDIRNAMES= ../lib64$(call if_multiarch,:powerpc64-linux-gnu)
 MULTILIB_OSDIRNAMES+= $(if $(wildcard $(shell echo 
$(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call 
if_multiarch,:powerpc-linux-gnu)
+
+MULTIARCH_DIRNAME = $(call if_multiarch,powerpc$(if $(findstring 
64,$(target)),64)-linux-gnu)
Index: config/s390/t-linux64
===
--- config/s390/t-linux64   (revision 200012)
+++ config/s390/t-linux64   (working copy)
@@ -9,3 +9,5 @@
 MULTILIB_DIRNAMES = 64 32
 MULTILIB_OSDIRNAMES = ../lib64$(call if_multiarch,:s390x-linux-gnu)
 MULTILIB_OSDIRNAMES += $(if $(wildcard $(shell echo 
$(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call 
if_multiarch,:s390-linux-gnu)
+
+MULTIARCH_DIRNAME = $(call if_multiarch,s390$(if $(findstring 
s390x,$(target)),x)-linux-gnu)
Index: config/i386/t-kfreebsd
===
--- config/i386/t-kfreebsd  (revision 200012)
+++ config/i386/t-kfreebsd  (working copy)
@@ -1,5 +1,9 @@
-MULTIARCH_DIRNAME = $(call if_multiarch,i386-kfreebsd-gnu)
+ifeq (,$(MULTIARCH_DIRNAME))
+  MULTIARCH_DIRNAME = $(call if_multiarch,i386-kfreebsd-gnu)
+endif
 
 # MULTILIB_OSDIRNAMES are set in t-linux64.
 KFREEBSD_OS = $(filter kfreebsd%, $(word 3, $(subst -, ,$(target
 MULTILIB_OSDIRNAMES := $(filter-out mx32=%,$(subst 
linux,$(KFREEBSD_OS),$(MULTILIB_OSDIRNAMES)))
+
+MULTIARCH_DIRNAME := $(subst linux,$(KFREEBSD_OS),$(MULTIARCH_DIRNAME))
Index: config/i386/t-linux64
===
--- config/i386/t-linux64   (revision 200012)
+++ config/i386/t-linux64   (working copy)
@@ -36,3 +36,13 @@
 MULTILIB_OSDIRNAMES = m64=../lib64$(call if_multiarch,:x86_64-linux-gnu)
 MULTILIB_OSDIRNAMES+= m32=$(if $(wildcard $(shell echo 
$(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call 
if_multiarch,:i386-linux-gnu)
 MULTILIB_OSDIRNAMES+= mx32=../libx32$(call if_multiarch,:x86_64-linux-gnux32)
+
+ifneq (,$(findstring x86_64,$(target)))
+  ifneq (,$(findstring biarchx32.h,$(tm_include_list)))
+  MULTIARCH_DIRNAME = $(call if_multiarch,x86_64-linux-gnux32)
+  else
+  MULTIARCH_DIRNAME = $(call if_multiarch,x86_64-linux-gnu)
+  endif
+else
+  MULTIARCH_DIRNAME = $(call if_multiarch,i38

Re: [C++ Patch] PR 42021

2013-06-12 Thread Paolo Carlini

On 06/12/2013 04:05 PM, Paolo Carlini wrote:

On 06/12/2013 03:38 PM, Jason Merrill wrote:

OK, but please add a comment.

Thanks. I added this:

  // cp_parser_lookup_name has the same diagnostic,
  // thus make sure to emit it almost once.

As mentioned by Marc off-line, 'at most' is definitely better ;)

Paolo.


Fix verifier for duplicated decls during streaming

2013-06-12 Thread Jan Hubicka
Hi,
the verifier needs an update to tolerate duplicated nodes for a decl during
the LTO streaming state.

Bootstrapped/regtested x86_64-linux, committed.

Honza

* cgraph.c (verify_edge_corresponds_to_fndecl): Be lax about
decl hash when in streaming stage.
* lto-symtab.c (lto_symtab_merge_symbols): Likewise.
* cgraph.h (cgraph_state): Add CGRAPH_LTO_STREAMING.

* lto.c (read_cgraph_and_symbols): Set cgraph into streaming state.
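
The idea, reduced to a standalone C sketch with made-up names (graph_state, hash_check_fails): the hash-table lookup is only treated as an error outside the streaming state, because duplicates are expected while streaming.

```c
#include <stddef.h>

/* Simplified stand-in for the cgraph states; names are hypothetical.  */
enum graph_state
{
  GRAPH_PARSING,
  GRAPH_CONSTRUCTION,
  GRAPH_LTO_STREAMING,
  GRAPH_IPA
};

/* Returns 1 if the node fails the "found in decl hashtable" check and 0
   otherwise.  During streaming the check is skipped entirely.  */
int
hash_check_fails (enum graph_state state, const void *hashed_node)
{
  if (state == GRAPH_LTO_STREAMING)
    return 0;
  return hashed_node == NULL;
}
```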

Index: cgraph.c
===
--- cgraph.c(revision 11)
+++ cgraph.c(working copy)
@@ -2291,6 +2291,8 @@ verify_edge_corresponds_to_fndecl (struc
 
   if (!decl || e->callee->global.inlined_to)
 return false;
+  if (cgraph_state == CGRAPH_LTO_STREAMING)
+return false;
   node = cgraph_get_node (decl);
 
   /* We do not know if a node from a different partition is an alias or what it
Index: cgraph.h
===
--- cgraph.h(revision 11)
+++ cgraph.h(working copy)
@@ -551,6 +551,8 @@ enum cgraph_state
   CGRAPH_STATE_PARSING,
   /* Callgraph is being constructed.  It is safe to add new functions.  */
   CGRAPH_STATE_CONSTRUCTION,
+  /* Callgraph is being at LTO time.  */
+  CGRAPH_LTO_STREAMING,
   /* Callgraph is built and IPA passes are being run.  */
   CGRAPH_STATE_IPA,
   /* Callgraph is built and all functions are transformed to SSA form.  */
Index: lto/lto.c
===
--- lto/lto.c   (revision 11)
+++ lto/lto.c   (working copy)
@@ -2891,6 +2891,7 @@ read_cgraph_and_symbols (unsigned nfiles
   /* True, since the plugin splits the archives.  */
   gcc_assert (num_objects == nfiles);
 }
+  cgraph_state = CGRAPH_LTO_STREAMING;
 
   tree_with_vars = htab_create_ggc (101, htab_hash_pointer, htab_eq_pointer,
NULL);
Index: lto-symtab.c
===
--- lto-symtab.c(revision 11)
+++ lto-symtab.c(working copy)
@@ -587,7 +587,7 @@ lto_symtab_merge_symbols (void)
 also re-populate the hash translating decls into symtab nodes*/
   FOR_EACH_SYMBOL (node)
{
- cgraph_node *cnode;
+ cgraph_node *cnode, *cnode2;
  if (!node->symbol.analyzed && node->symbol.alias_target)
{
  symtab_node tgt = symtab_node_for_asm (node->symbol.alias_target);
@@ -596,10 +596,17 @@ lto_symtab_merge_symbols (void)
symtab_resolve_alias (node, tgt);
}
  node->symbol.aux = NULL;
+ 
  if (!(cnode = dyn_cast <cgraph_node> (node))
  || !cnode->clone_of
  || cnode->clone_of->symbol.decl != cnode->symbol.decl)
-   symtab_insert_node_to_hashtable ((symtab_node)node);
+   {
+ if (cnode && DECL_BUILT_IN (node->symbol.decl)
+ && (cnode2 = cgraph_get_node (node->symbol.decl))
+ && cnode2 != cnode)
+   lto_cgraph_replace_node (cnode2, cnode);
+ symtab_insert_node_to_hashtable ((symtab_node)node);
+   }
}
 }
 }
Index: symtab.c
===
--- symtab.c(revision 11)
+++ symtab.c(working copy)
@@ -647,11 +647,14 @@ verify_symtab_base (symtab_node node)
   error_found = true;
 }

-  hashed_node = symtab_get_node (node->symbol.decl);
-  if (!hashed_node)
+  if (cgraph_state != CGRAPH_LTO_STREAMING)
 {
-  error ("node not found in symtab decl hashtable");
-  error_found = true;
+  hashed_node = symtab_get_node (node->symbol.decl);
+  if (!hashed_node)
+   {
+ error ("node not found in symtab decl hashtable");
+ error_found = true;
+   }
 }
   if (assembler_name_hash)
 {


[Patch ARM] Fix some testsuite fallout with DATA_ABI_ALIGNMENT changes.

2013-06-12 Thread Ramana Radhakrishnan

Hi,

This fixes up some of the testsuite fallout on ARM from the recent
DATA_ABI_ALIGNMENT changes.


Applied to trunk.

regards
Ramana


2013-06-12  Ramana Radhakrishnan  

* gcc.target/arm/unaligned-memcpy-4.c (src, dst): Initialize
to ensure alignment.
* gcc.target/arm/unaligned-memcpy-3.c (src): Likewise.

diff --git a/gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c 
b/gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c
index 9e2d164..d0b09bd 100644
--- a/gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c
+++ b/gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c
@@ -4,7 +4,7 @@
 
 #include 
 
-char src[16];
+char src[16] = {0};
 
 void aligned_src (char *dest)
 {
diff --git a/gcc/testsuite/gcc.target/arm/unaligned-memcpy-4.c 
b/gcc/testsuite/gcc.target/arm/unaligned-memcpy-4.c
index 4708c51..830e22e 100644
--- a/gcc/testsuite/gcc.target/arm/unaligned-memcpy-4.c
+++ b/gcc/testsuite/gcc.target/arm/unaligned-memcpy-4.c
@@ -4,8 +4,8 @@
 
 #include 
 
-char src[16];
-char dest[16];
+char src[16] = { 0 };
+char dest[16] = { 0 };
 
 void aligned_both (void)
 {

[RS6000] IBM long double little-endian

2013-06-12 Thread Alan Modra
FLOAT_WORDS_BIG_ENDIAN doesn't work out too well for IBM extended
double when little-endian, because we're thinking to keep the large
magnitude double first.  See the comment below on
LONG_DOUBLE_LARGE_FIRST.

This patch fixes all occurrences of FLOAT_WORDS_BIG_ENDIAN in the
rs6000 backend (all of them are dealing with long doubles), and adds
an expander that results in us avoiding all current code in builtins.c
and optabs.c that uses FLOAT_WORDS_BIG_ENDIAN.  Bootstrapped etc.
powerpc64-linux.  signbittf2 is written to use the 64-bit shift when
available as this lets optimisers know the state of the high 32-bits,
and avoid a sign/zero extend if the SImode result is later extended to
DImode.
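
At the C level the layout argument boils down to the sketch below: the value is modelled as a two-element double array whose element 0 is the large-magnitude part, so the sign comes from that element whatever the byte order. This assumes IEEE binary64 doubles and that integers and doubles share the same byte order; ibm_long_double_signbit is a made-up name, not something from the patch.

```c
#include <stdint.h>
#include <string.h>

/* PAIR[0] is the large-magnitude double regardless of endianness, so the
   sign of the whole value is the top bit of that element.  */
int
ibm_long_double_signbit (const double pair[2])
{
  uint64_t bits;

  memcpy (&bits, &pair[0], sizeof bits);
  /* Same 64-bit shift by 63 that the expander uses on 64-bit targets.  */
  return (int) (bits >> 63);
}
```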

* config/rs6000/rs6000.h (LONG_DOUBLE_LARGE_FIRST): Define.
* config/rs6000/rs6000.md (signbittf2): New insn.
(extenddftf2_internal): Use LONG_DOUBLE_LARGE_FIRST.
(abstf2_internal, cmptf_internal2): Likewise.
* config/rs6000/spe.md (spe_abstf2_cmp, spe_abstf2_tst): Likewise.

Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 199948)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -715,6 +715,11 @@ extern unsigned char rs6000_recip_bits[];
instructions for them.  Might as well be consistent with bits and bytes.  */
 #define WORDS_BIG_ENDIAN 1
 
+/* This says that for the IBM long double the larger magnitude double
+   comes first.  It's really a two element double array, and arrays
+   don't index differently between little- and big-endian.  */
+#define LONG_DOUBLE_LARGE_FIRST 1
+
 #define MAX_BITS_PER_WORD 64
 
 /* Width of a word, in units (bytes).  */
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 199948)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -5178,6 +5178,41 @@
   "frsqrtes %0,%1"
   [(set_attr "type" "fp")])
 
+;; This expander is here to avoid FLOAT_WORDS_BIGENDIAN tests in
+;; builtins.c and optabs.c that are not correct for IBM long double
+;; when little-endian.
+(define_expand "signbittf2"
+  [(set (match_dup 2)
+   (float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "")))
+   (set (match_dup 3)
+   (subreg:DI (match_dup 2) 0))
+   (set (match_dup 4)
+   (match_dup 5))
+   (set (match_operand:SI 0 "gpc_reg_operand" "")
+   (match_dup 6))]
+  "!TARGET_IEEEQUAD
+   && TARGET_HARD_FLOAT
+   && (TARGET_FPRS || TARGET_E500_DOUBLE)
+   && TARGET_LONG_DOUBLE_128"
+{
+  operands[2] = gen_reg_rtx (DFmode);
+  operands[3] = gen_reg_rtx (DImode);
+  if (TARGET_POWERPC64)
+{
+  operands[4] = gen_reg_rtx (DImode);
+  operands[5] = gen_rtx_LSHIFTRT (DImode, operands[3], GEN_INT (63));
+  operands[6] = gen_rtx_SUBREG (SImode, operands[4],
+   WORDS_BIG_ENDIAN ? 4 : 0);
+}
+  else
+{
+  operands[4] = gen_reg_rtx (SImode);
+  operands[5] = gen_rtx_SUBREG (SImode, operands[3],
+   WORDS_BIG_ENDIAN ? 0 : 4);
+  operands[6] = gen_rtx_LSHIFTRT (SImode, operands[4], GEN_INT (31));
+}
+})
+
 (define_expand "copysign3"
   [(set (match_dup 3)
 (abs:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "")))
@@ -9260,8 +9295,8 @@
   "&& reload_completed"
   [(pc)]
 {
-  const int lo_word = FLOAT_WORDS_BIG_ENDIAN ? GET_MODE_SIZE (DFmode) : 0;
-  const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
+  const int lo_word = LONG_DOUBLE_LARGE_FIRST ? GET_MODE_SIZE (DFmode) : 0;
+  const int hi_word = LONG_DOUBLE_LARGE_FIRST ? 0 : GET_MODE_SIZE (DFmode);
   emit_move_insn (simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word),
  operands[1]);
   emit_move_insn (simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word),
@@ -9490,8 +9525,8 @@
&& TARGET_LONG_DOUBLE_128"
   "
 {
-  const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
-  const int lo_word = FLOAT_WORDS_BIG_ENDIAN ? GET_MODE_SIZE (DFmode) : 0;
+  const int hi_word = LONG_DOUBLE_LARGE_FIRST ? 0 : GET_MODE_SIZE (DFmode);
+  const int lo_word = LONG_DOUBLE_LARGE_FIRST ? GET_MODE_SIZE (DFmode) : 0;
   operands[3] = gen_reg_rtx (DFmode);
   operands[4] = gen_reg_rtx (CCFPmode);
   operands[5] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word);
@@ -12879,8 +12914,8 @@
(match_dup 13)]
 {
   REAL_VALUE_TYPE rv;
-  const int lo_word = FLOAT_WORDS_BIG_ENDIAN ? GET_MODE_SIZE (DFmode) : 0;
-  const int hi_word = FLOAT_WORDS_BIG_ENDIAN ? 0 : GET_MODE_SIZE (DFmode);
+  const int lo_word = LONG_DOUBLE_LARGE_FIRST ? GET_MODE_SIZE (DFmode) : 0;
+  const int hi_word = LONG_DOUBLE_LARGE_FIRST ? 0 : GET_MODE_SIZE (DFmode);
 
   operands[5] = simplify_gen_subreg (DFmode, operands[1], TFmode, hi_word);
   operands[6] = simplify_gen_subreg (DFmode, operands[1], TFmode, lo_word);
Index: gcc/config/rs6000/spe.md
=

Re: [RS6000] IBM long double little-endian

2013-06-12 Thread David Edelsohn
On Wed, Jun 12, 2013 at 10:53 AM, Alan Modra  wrote:

> * config/rs6000/rs6000.h (LONG_DOUBLE_LARGE_FIRST): Define.
> * config/rs6000/rs6000.md (signbittf2): New insn.
> (extenddftf2_internal): Use LONG_DOUBLE_LARGE_FIRST.
> (abstf2_internal, cmptf_internal2): Likewise.
> * config/rs6000/spe.md (spe_abstf2_cmp, spe_abstf2_tst): Likewise.


Why create and define LONG_DOUBLE_LARGE_FIRST if it always is "1"
(True)?  To retain flexibility?

Thanks, David


RFC [MIPS, RS6000] Mangling of IBM long double template literals

2013-06-12 Thread Alan Modra
As noted in the comment below, IBM long double is really an array of
two doubles.  In little-endian mode this means the words of each
double should be reversed in write_real_cst, but the large magnitude
double remains the first element of the array.

This patch specially treats IBM long double so that mangling for
template literal args of a given long double value is the same for
both little and big endian.  Bootstrapped etc. powerpc64-linux.

This is of course an ABI change for any existing little-endian users
of IBM long double literals in templates.  On powerpc, I think we can
safely say there are no such users.  However it does look like MIPS
also uses a variant of IBM long double, and I'm less certain there.

* mangle.c (write_real_cst): Specially treat IBM long double.

Index: gcc/cp/mangle.c
===
--- gcc/cp/mangle.c (revision 199975)
+++ gcc/cp/mangle.c (working copy)
@@ -1591,28 +1591,35 @@ write_real_cst (const tree value)
 {
   if (abi_version_at_least (2))
 {
+  const struct real_format *fmt;
   long target_real[4];  /* largest supported float */
   char buffer[9];   /* eight hex digits in a 32-bit number */
-  int i, limit, dir;
+  int i, limit, dir, twid;
 
   tree type = TREE_TYPE (value);
   int words = GET_MODE_BITSIZE (TYPE_MODE (type)) / 32;
 
-  real_to_target (target_real, &TREE_REAL_CST (value),
- TYPE_MODE (type));
+  fmt = REAL_MODE_FORMAT (TYPE_MODE (type));
+  real_to_target_fmt (target_real, &TREE_REAL_CST (value), fmt);
 
   /* The value in target_real is in the target word order,
 so we must write it out backward if that happens to be
 little-endian.  write_number cannot be used, it will
 produce uppercase.  */
   if (FLOAT_WORDS_BIG_ENDIAN)
-   i = 0, limit = words, dir = 1;
+   i = 0, limit = words, dir = 1, twid = 0;
+  else if (fmt->pnan < fmt->p)
+   /* This is an IBM extended double format made up of two IEEE
+  doubles.  When little-endian, the doubles are in
+  little-endian word order, but the array order stays the
+  same.  */
+   i = 0, limit = words, dir = 1, twid = 1;
   else
-   i = words - 1, limit = -1, dir = -1;
+   i = words - 1, limit = -1, dir = -1, twid = 0;
 
   for (; i != limit; i += dir)
{
- sprintf (buffer, "%08lx", (unsigned long) target_real[i]);
+ sprintf (buffer, "%08lx", (unsigned long) target_real[i ^ twid]);
  write_chars (buffer, 8);
}
 }

-- 
Alan Modra
Australia Development Lab, IBM
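
As a reading aid for the patch above, here is a minimal standalone sketch of the word ordering it selects.  It is not the GCC code: the names word_layout and mangle_words are hypothetical, and the caller is assumed to know which layout applies (the patch derives that from the real_format).  The only point is that indexing with i ^ twid swaps the two 32-bit words inside each double while keeping the order of the doubles, so a little-endian IBM long double yields the same hex string as the big-endian one.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical layout selector; GCC decides this from the target macros
   and the real_format instead.  */
enum word_layout { WORDS_BE, WORDS_LE, WORDS_IBM_LE };

/* Write WORDS 32-bit words of TARGET_REAL to OUT as lowercase hex, most
   significant word first.  OUT must hold 8 * WORDS + 1 characters.  */
static void
mangle_words (const unsigned long *target_real, int words,
              enum word_layout layout, char *out)
{
  int i, limit, dir, twid;

  assert (words > 0 && (layout != WORDS_IBM_LE || words % 2 == 0));

  if (layout == WORDS_BE)
    i = 0, limit = words, dir = 1, twid = 0;
  else if (layout == WORDS_IBM_LE)
    /* Array order is kept; the two words of each double are swapped.  */
    i = 0, limit = words, dir = 1, twid = 1;
  else
    i = words - 1, limit = -1, dir = -1, twid = 0;

  for (; i != limit; i += dir, out += 8)
    sprintf (out, "%08lx", target_real[i ^ twid] & 0xffffffffUL);
}
```

Feeding the same value in the three layouts (big-endian words, plain little-endian words, and IBM little-endian words) should give one and the same string, which is the ABI property the patch is after.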


Re: [RFC] Implement Undefined Behavior Sanitizer (take 2)

2013-06-12 Thread Marek Polacek
On Wed, Jun 12, 2013 at 03:52:08PM +0200, Jakub Jelinek wrote:
> No, you really need to use the cp_save_expr/c_save_expr, especially for
> C it e.g. fully folds etc.  You want to call that in
> cp_build_binary_op etc., also because you want both the instrument_expr
> itself, but also the original binary expression to use the SAVE_EXPRs if
> they are created.

I see.  Here's a somewhat tweaked version; it uses
c_save_expr/cp_save_expr and contains a few fixes suggested by Marc.
How does it look now?  Jason, does the cp/typeck.c part look sane?
Thanks.

2013-06-12  Marek Polacek  

* Makefile.in: Add ubsan.c.
* common.opt: Add -fsanitize=undefined option.
* doc/invoke.texi: Document the new flag.
* sanitizer.def (DEF_SANITIZER_BUILTIN): Define.
* builtin-attrs.def (ATTR_COLD): Define.
* asan.c (initialize_sanitizer_builtins): Build
BT_FN_VOID_PTR_PTR_PTR.
* builtins.def (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW,
BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS): Define.

c-family/
* c-ubsan.c: New file.
* c-ubsan.h: New file.

cp/
* typeck.c (cp_build_binary_op): Add division by zero and shift
instrumentation.

c/
* c-typeck.c (build_binary_op): Add division by zero and shift
instrumentation.

--- gcc/c-family/c-ubsan.c.mp   2013-06-11 19:51:55.555492466 +0200
+++ gcc/c-family/c-ubsan.c  2013-06-12 17:05:20.800370083 +0200
@@ -0,0 +1,127 @@
+/* UndefinedBehaviorSanitizer, undefined behavior detector.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Marek Polacek 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "c-family/c-common.h"
+#include "c-family/c-ubsan.h"
+
+/* Instrument division by zero and INT_MIN / -1.  If not instrumenting,
+   return NULL_TREE.  */
+
+tree
+ubsan_instrument_division (location_t loc, tree op0, tree op1)
+{
+  tree t, tt;
+  tree type = TREE_TYPE (op0);
+
+  /* At this point both operands should have the same type,
+ because they are already converted to RESULT_TYPE.  */
+  gcc_assert (type == TREE_TYPE (op1));
+
+  if (TREE_CODE (type) != INTEGER_TYPE)
+return NULL_TREE;
+
+  /* If we *know* that the divisor is not -1 or 0, we don't have to
+ instrument this expression.
+ ??? We could use decl_constant_value to cover up more cases.  */
+  if (TREE_CODE (op1) == INTEGER_CST
+  && integer_nonzerop (op1)
+  && !integer_minus_onep (op1))
+return NULL_TREE;
+
+  t = fold_build2 (EQ_EXPR, boolean_type_node,
+   op1, build_int_cst (type, 0));
+
+  /* We check INT_MIN / -1 only for signed types.  */
+  if (!TYPE_UNSIGNED (type))
+{
+  tree x;
+  tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
+   build_int_cst (type, -1));
+  x = fold_build2 (EQ_EXPR, boolean_type_node, op0,
+  TYPE_MIN_VALUE (type));
+  x = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, x, tt);
+  t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, t, x);
+}
+  tt = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW);
+  tt = build_call_expr_loc (loc, tt, 0);
+  t = fold_build3 (COND_EXPR, void_type_node, t, tt, void_zero_node);
+
+  return t;
+}
+
+/* Instrument left and right shifts.  If not instrumenting, return
+   NULL_TREE.  */
+
+tree
+ubsan_instrument_shift (location_t loc, enum tree_code code,
+   tree op0, tree op1)
+{
+  tree t, tt = NULL_TREE;
+  tree op1_utype = unsigned_type_for (TREE_TYPE (op1));
+  HOST_WIDE_INT op0_prec = TYPE_PRECISION (TREE_TYPE (op0));
+  tree uprecm1 = build_int_cst (op1_utype, op0_prec - 1);
+  tree precm1 = build_int_cst (TREE_TYPE (op1), op0_prec - 1);
+
+  t = fold_convert_loc (loc, op1_utype, op1);
+  t = fold_build2 (GT_EXPR, boolean_type_node, t, uprecm1);
+
+  /* For signed x << y, in C99/C11, the following:
+ (unsigned) x >> (precm1 - y)
+ if non-zero, is undefined.  */
+  if (code == LSHIFT_EXPR
+  && !TYPE_UNSIGNED (TREE_TYPE (op0))
+  && flag_isoc99)
+{
+  tree x = fold_build2 (MINUS_EXPR, integer_type_node, precm1, op1);
+  tt = fold_convert_loc (loc, unsigned_type_for (TREE_TYPE (op0)), op0);
+  tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
+  tt = fold_build2 (NE_EXPR, boole
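
The shift condition that the (truncated) patch above builds as trees can be read as the following scalar predicate.  This is only a sketch under stated assumptions: shift_is_undefined and INT_PREC are hypothetical names, the operands are plain int, int has no padding bits, and the left-shift rule is the C99/C11 one that the patch guards with flag_isoc99.

```c
#include <assert.h>
#include <limits.h>

/* Precision of int, assuming no padding bits.  */
#define INT_PREC ((int) (sizeof (int) * CHAR_BIT))

/* Return nonzero if x << y (IS_LEFT_SHIFT) or x >> y would be flagged.  */
static int
shift_is_undefined (int x, int y, int is_left_shift)
{
  const int precm1 = INT_PREC - 1;

  /* Converting Y to unsigned folds "y < 0" and "y > precm1" into a
     single comparison.  */
  if ((unsigned int) y > (unsigned int) precm1)
    return 1;

  /* Signed x << y in C99/C11: undefined if any set bit of X would reach
     or pass the sign bit, i.e. if (unsigned) x >> (precm1 - y) is
     non-zero.  A negative X always trips this.  */
  if (is_left_shift && ((unsigned int) x >> (precm1 - y)) != 0)
    return 1;

  return 0;
}
```

Note that a right shift of a negative value is not flagged here; it is implementation-defined rather than undefined.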

Re: [RS6000] IBM long double little-endian

2013-06-12 Thread Alan Modra
On Wed, Jun 12, 2013 at 11:09:03AM -0400, David Edelsohn wrote:
> On Wed, Jun 12, 2013 at 10:53 AM, Alan Modra  wrote:
> 
> > * config/rs6000/rs6000.h (LONG_DOUBLE_LARGE_FIRST): Define.
> > * config/rs6000/rs6000.md (signbittf2): New insn.
> > (extenddftf2_internal): Use LONG_DOUBLE_LARGE_FIRST.
> > (abstf2_internal, cmptf_internal2): Likewise.
> > * config/rs6000/spe.md (spe_abstf2_cmp, spe_abstf2_tst): Likewise.
> 
> 
> Why create and define LONG_DOUBLE_LARGE_FIRST if it always is "1"
> (True)?  To retain flexibility?

Yes.  We don't have anything like an ABI fixed in stone at the moment.

-- 
Alan Modra
Australia Development Lab, IBM
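
For readers following the hi_word/lo_word changes, a small host-side sketch of what LONG_DOUBLE_LARGE_FIRST selects.  Everything here is illustrative: struct ibm_long_double, hi_offset, lo_offset and ibm_signbit are hypothetical names, the macro is hard-wired to 1 as discussed above, and double is assumed to be a 64-bit IEEE type whose byte order matches the host's integer byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed value, per the discussion in this thread.  */
#define LONG_DOUBLE_LARGE_FIRST 1

/* IBM long double modelled as an array of two doubles.  */
struct ibm_long_double { double part[2]; };

/* Byte offsets of the large-magnitude and small-magnitude doubles,
   mirroring the hi_word/lo_word computation in the patch.  */
static size_t
hi_offset (void)
{
  return LONG_DOUBLE_LARGE_FIRST ? 0 : sizeof (double);
}

static size_t
lo_offset (void)
{
  return LONG_DOUBLE_LARGE_FIRST ? sizeof (double) : 0;
}

/* The sign of the pair is the sign of the large-magnitude double.  */
static int
ibm_signbit (const struct ibm_long_double *x)
{
  double hi;
  uint64_t bits;

  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&hi, (const char *) x + hi_offset (), sizeof hi);
  memcpy (&bits, &hi, sizeof bits);
  return (int) (bits >> 63);
}
```

Keeping the offsets behind one macro means a later ABI decision only has to change the macro, not every pattern that splits a TFmode value.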


[RFC] Inconsistency in ordering vector widening operations on big-endian targets?

2013-06-12 Thread Tejas Belagod

Hi,

This test case:

void foo(long long *d, int *f)
{
  int i;
  for (i=0; i< 16; i++)
  {
d[i] = f[i];
  }
}

when vectorized for big-endian mode, generates this sequence of widening 
operations:

  ...
  _33 = (void *) ivtmp.22_25;
  vect__11.5_39 = MEM[base: _33, offset: 0B];
  vect__12.6_40 = [vec_unpack_hi_expr] vect__11.5_39;
  vect__12.6_41 = [vec_unpack_lo_expr] vect__11.5_39;
  _29 = (void *) ivtmp.25_26;
  MEM[base: _29, offset: 0B] = vect__12.6_40;
  _5 = _29 + 16;
  MEM[base: _5, offset: 0B] = vect__12.6_41;
  ...

I tried this on two targets configured for big-endian (aarch64 and powerpc).

From the IR above, it seems that the result of widening the high part 
(vect__12.6_40) is being stored at offset 0 from _29 and the result of widening the 
low part goes into *(_29 + 16). Shouldn't this be the other way around?


The source of this seems to be code in 
tree-vect-stmts.c:supportable_widening_operation() that swaps the tree codes for 
high and low part widening for big-endian targets.


if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
{
  enum tree_code ctmp = c1;
  c1 = c2;
  c2 = ctmp;
}

During vectorization of the scalar widening operation, it is transformed into 
two vector operations - one for widening high part and one for widening low part 
and these are stored as a linked list in STMT_VINFO(stmt) of a scalar gimple 
statement. What's interesting here is the order in which they are stored:


  scalar_gimple_stmt.vect_info->vectorized_stmt

points to this list:

  [vec_unpack_hi_expr] vect__11.5_39->[vec_unpack_lo_expr] vect__11.5_39

What happens when vectorizing the store of the widened results is that the 
single store is split into two stores based on the algorithm of ncopies = 
VF/nunits where VF is the vectorization factor and nunits is the number of units 
of the bigger data type. For the first of the stores, when


  vec_oprnd = vect_get_vec_def_for_operand (op, next_stmt, NULL);

is called, vect_get_def returns the stmt_vinfo which is

   [vec_unpack_hi_expr] vect__11.5_39

which gets stored in *(_29 + 0) and for the second store,

   vec_oprnd = vect_get_vec_def_for_stmt_copy (dt, op);

is called which returns the stmt_related_vinfo() which is the second part of the 
vectorized widened operation.


[vec_unpack_lo_expr] vect__11.5_39

Isn't this an inconsistency in ordering stores of high and low parts of a 
widened vector operation? Is there something I'm missing?


Thanks,
Tejas Belagod.
ARM.
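
One possible reading of the ordering, as a toy model in plain C.  This is a sketch, not a statement of what the vectorizer guarantees: it ASSUMES that on a big-endian target the "hi" half of a vector is the half holding the lowest-addressed elements (and "lo" plays that role on little-endian), which the message above does not confirm.  The names unpack_half, half_for_first_elements, unpack and widen_and_store are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum unpack_half { UNPACK_LO, UNPACK_HI };

/* ASSUMPTION: which half covers the lowest-addressed elements.  */
static enum unpack_half
half_for_first_elements (int bytes_big_endian)
{
  return bytes_big_endian ? UNPACK_HI : UNPACK_LO;
}

/* Widen the N/2 elements of SRC selected by HALF into DST.  */
static void
unpack (const int *src, size_t n, enum unpack_half half,
        int bytes_big_endian, long long *dst)
{
  size_t start
    = half == half_for_first_elements (bytes_big_endian) ? 0 : n / 2;
  size_t k;

  for (k = 0; k < n / 2; k++)
    dst[k] = src[start + k];
}

/* Emit the two widened halves in the order c1, c2, after the same
   big-endian swap as in supportable_widening_operation.  */
static void
widen_and_store (const int *src, size_t n, int bytes_big_endian,
                 long long *dst)
{
  enum unpack_half c1 = UNPACK_LO, c2 = UNPACK_HI;

  assert (n % 2 == 0);
  if (bytes_big_endian)
    {
      enum unpack_half ctmp = c1;
      c1 = c2;
      c2 = ctmp;
    }
  unpack (src, n, c1, bytes_big_endian, dst);         /* offset 0 */
  unpack (src, n, c2, bytes_big_endian, dst + n / 2); /* next vector */
}
```

Under that assumption, storing the "hi" result first on big-endian keeps d[i] = f[i] element for element; if "hi" instead meant the highest-addressed elements, the same swap would reverse the two halves, which is the inconsistency being asked about.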



Re: [RFC] Implement Undefined Behavior Sanitizer (take 2)

2013-06-12 Thread Jakub Jelinek
On Wed, Jun 12, 2013 at 05:17:45PM +0200, Marek Polacek wrote:
> @@ -3867,6 +3868,7 @@ cp_build_binary_op (location_t location,
>tree final_type = 0;
>  
>tree result;
> +  tree orig_type = NULL;
>  
>/* Nonzero if this is an operation like MIN or MAX which can
>   safely be computed in short if both args are promoted shorts.
> @@ -3891,6 +3893,15 @@ cp_build_binary_op (location_t location,
>op0 = orig_op0;
>op1 = orig_op1;
>  
> +  /* Remember whether we're doing / or %.  */
> +  bool doing_div_or_mod = false;
> +
> +  /* Remember whether we're doing << or >>.  */
> +  bool doing_shift = false;
> +
> +  /* Tree holding instrumentation expression.  */
> +  tree instrument_expr = NULL;
> +
>if (code == TRUTH_AND_EXPR || code == TRUTH_ANDIF_EXPR
>|| code == TRUTH_OR_EXPR || code == TRUTH_ORIF_EXPR
>|| code == TRUTH_XOR_EXPR)
> @@ -4070,8 +4081,12 @@ cp_build_binary_op (location_t location,
>   {
> enum tree_code tcode0 = code0, tcode1 = code1;
> tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
> +   cop1 = maybe_constant_value (cop1);
>  
> -   warn_for_div_by_zero (location, maybe_constant_value (cop1));
> +   if (!processing_template_decl && tcode0 == INTEGER_TYPE)
> + doing_div_or_mod = true;

Either the !processing_template_decl here is unneeded, or
if you'd check it (and perhaps flag_ubsan too) in this part of code,
then you wouldn't need to check it later.

> @@ -4832,6 +4853,31 @@ cp_build_binary_op (location_t location,
>if (build_type == NULL_TREE)
>  build_type = result_type;
>  
> +  if (flag_ubsan && !processing_template_decl)

But, I'd certainly avoid doing the cp_save_expr/maybe_constant_value
etc. for all the binary operations you don't want to instrument
(thus check (doing_div_or_mod || doing_shift) also).
- 
> +{
> +  /* OP0 and/or OP1 might have side-effects.  */
> +  op0 = cp_save_expr (op0);
> +  op1 = cp_save_expr (op1);
> +  op0 = maybe_constant_value (fold_non_dependent_expr_sfinae (op0,
> +   tf_none));
> +  op1 = maybe_constant_value (fold_non_dependent_expr_sfinae (op1,
> +   tf_none));
> +  if (doing_div_or_mod)
> + {
> +   /* For diagnostics we want to use the promoted types without
> +  shorten_binary_op.  So convert the arguments to the
> +  original result_type.  */
> +   if (orig_type != NULL && result_type != orig_type)
> + {
> +   op0 = cp_convert (orig_type, op0, complain);
> +   op1 = cp_convert (orig_type, op1, complain);

And you don't want to change op0/op1, have your own tree vars, assign
op{0,1} to them and change here if result_type is not orig_type,
then pass those vars to ubsan_instrument_division.

Jakub
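
For reference, the condition that ubsan_instrument_division assembles from EQ_EXPR, TRUTH_AND_EXPR and TRUTH_OR_EXPR in the patch under review corresponds to the scalar check below.  This is a sketch for signed int only; division_is_undefined and checked_div are hypothetical helper names, not part of the patch or of any sanitizer runtime.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero if op0 / op1 (or op0 % op1) has undefined behavior.  */
static int
division_is_undefined (int op0, int op1)
{
  /* Division by zero, checked for every integer type.  */
  if (op1 == 0)
    return 1;

  /* INT_MIN / -1 overflows; only relevant for signed types.  */
  if (op0 == INT_MIN && op1 == -1)
    return 1;

  return 0;
}

/* Divide only when it is safe; *OK reports whether the division ran.  */
static int
checked_div (int op0, int op1, int *ok)
{
  *ok = !division_is_undefined (op0, op1);
  return *ok ? op0 / op1 : 0;
}
```

Because the check reads both operands before the division does, the instrumented expression has to evaluate each operand once, which is why the review insists on wrapping them in SAVE_EXPRs.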


Re: [RS6000] IBM long double little-endian

2013-06-12 Thread David Edelsohn
What is your model for the way that the RTL and C statements in the
new signbittf2 pattern interact?

The C preparation statements are executed before the RTL code
generated from the RTL template.  In the patch, it seems that the new
pattern is assuming that it can rely on some results produced by the
RTL template.  Maybe this seems to work because data dependence
reorders the statements when compiled with optimization.

Thanks, David


Re: Remove self-assignments

2013-06-12 Thread Jeff Law

On 06/12/13 02:03, Richard Biener wrote:
> DSE looks like the right place to me as we are removing a store.  Yes,
> DCE removes a limited set of stores as well, but the way we remove this kind
> of store makes it much more suited to DSE.
>
> As of possibly trapping/throwing stores, we do not bother to preserve those
> (even with -fnon-call-exceptions).

A bit of a surprise to hear that.  I don't mind much though, as 
-fnon-call-exceptions isn't something I find terribly useful.


Consider my "objections" withdrawn.
Jeff


Re: Remove self-assignments

2013-06-12 Thread Jeff Law

On 06/11/13 13:30, Marc Glisse wrote:
>> I'd be curious how often this triggers in GCC itself as well.
>
> Do you know a convenient way to test that?

I usually put in some kind of debugging printfs during early development 
which I can then grep for in build logs.  Not very sexy, but effective 
to see if a particular transformation is triggering outside contrived 
testcases.


jeff



Re: [RFC] Implement Undefined Behavior Sanitizer (take 2)

2013-06-12 Thread Marek Polacek
On Wed, Jun 12, 2013 at 05:29:21PM +0200, Jakub Jelinek wrote:
> > @@ -4070,8 +4081,12 @@ cp_build_binary_op (location_t location,
> > {
> >   enum tree_code tcode0 = code0, tcode1 = code1;
> >   tree cop1 = fold_non_dependent_expr_sfinae (op1, tf_none);
> > + cop1 = maybe_constant_value (cop1);
> >  
> > - warn_for_div_by_zero (location, maybe_constant_value (cop1));
> > + if (!processing_template_decl && tcode0 == INTEGER_TYPE)
> > +   doing_div_or_mod = true;
> 
> Either the !processing_template_decl here is unneeded, or
> if you'd check it (and perhaps flag_ubsan too) in this part of code,
> then you wouldn't need to check it later.

Fixed.

> > @@ -4832,6 +4853,31 @@ cp_build_binary_op (location_t location,
> >if (build_type == NULL_TREE)
> >  build_type = result_type;
> >  
> > +  if (flag_ubsan && !processing_template_decl)
> 
> But, I'd certainly avoid doing the cp_save_expr/maybe_constant_value
> etc. for all the binary operations you don't want to instrument
> (thus check (doing_div_or_mod || doing_shift) also).

Of course.  Fixed.

> > +{
> > +  /* OP0 and/or OP1 might have side-effects.  */
> > +  op0 = cp_save_expr (op0);
> > +  op1 = cp_save_expr (op1);
> > +  op0 = maybe_constant_value (fold_non_dependent_expr_sfinae (op0,
> > + tf_none));
> > +  op1 = maybe_constant_value (fold_non_dependent_expr_sfinae (op1,
> > + tf_none));
> > +  if (doing_div_or_mod)
> > +   {
> > + /* For diagnostics we want to use the promoted types without
> > +shorten_binary_op.  So convert the arguments to the
> > +original result_type.  */
> > + if (orig_type != NULL && result_type != orig_type)
> > +   {
> > + op0 = cp_convert (orig_type, op0, complain);
> > + op1 = cp_convert (orig_type, op1, complain);
> 
> And you don't want to change op0/op1, have your own tree vars, assign
> op{0,1} to them and change here if result_type is not orig_type,
> then pass those vars to ubsan_instrument_division.

Like this?

2013-06-12  Marek Polacek  

* Makefile.in: Add ubsan.c.
* common.opt: Add -fsanitize=undefined option.
* doc/invoke.texi: Document the new flag.
* sanitizer.def (DEF_SANITIZER_BUILTIN): Define.
* builtin-attrs.def (ATTR_COLD): Define.
* asan.c (initialize_sanitizer_builtins): Build
BT_FN_VOID_PTR_PTR_PTR.
* builtins.def (BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW,
BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS): Define.

c-family/
* c-ubsan.c: New file.
* c-ubsan.h: New file.

cp/
* typeck.c (cp_build_binary_op): Add division by zero and shift
instrumentation.

c/
* c-typeck.c (build_binary_op): Add division by zero and shift
instrumentation.

--- gcc/c-family/c-ubsan.c.mp   2013-06-11 19:51:55.555492466 +0200
+++ gcc/c-family/c-ubsan.c  2013-06-12 17:05:20.800370083 +0200
@@ -0,0 +1,127 @@
+/* UndefinedBehaviorSanitizer, undefined behavior detector.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+   Contributed by Marek Polacek 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "c-family/c-common.h"
+#include "c-family/c-ubsan.h"
+
+/* Instrument division by zero and INT_MIN / -1.  If not instrumenting,
+   return NULL_TREE.  */
+
+tree
+ubsan_instrument_division (location_t loc, tree op0, tree op1)
+{
+  tree t, tt;
+  tree type = TREE_TYPE (op0);
+
+  /* At this point both operands should have the same type,
+ because they are already converted to RESULT_TYPE.  */
+  gcc_assert (type == TREE_TYPE (op1));
+
+  if (TREE_CODE (type) != INTEGER_TYPE)
+return NULL_TREE;
+
+  /* If we *know* that the divisor is not -1 or 0, we don't have to
+ instrument this expression.
+ ??? We could use decl_constant_value to cover up more cases.  */
+  if (TREE_CODE (op1) == INTEGER_CST
+  && integer_nonzerop (op1)
+  && !integer_minus_onep (op1))
+return NULL_TREE;
+
+  t = fold_build2 (EQ_EXPR, boolean_type_node,
+   op1, build_int_cst (type, 0));
+
+  /* We check INT_MIN / -1 only for signed types.  */
+  if (!TYPE_UNSIGNED (type))
+{
+  tree x;
+  tt

Re: [Patch, Fortran] Print floating-point exception status after STOP/ERROR STOP

2013-06-12 Thread David Edelsohn
On Wed, Jun 12, 2013 at 9:05 AM, Tobias Burnus  wrote:
> David: Can you have a look at libgfortran/config/fpu-aix.h - Thanks!
> Uros: Can you have a look at libgfortran/config/fpu-387.h - Thanks!

The AIX bits look correct.  Thanks very much for investigating AIX support.

Thanks, David


Re: Remove self-assignments

2013-06-12 Thread Jakub Jelinek
On Wed, Jun 12, 2013 at 09:42:55AM -0600, Jeff Law wrote:
> On 06/12/13 02:03, Richard Biener wrote:
> >DSE looks like the right place to me as we are removing a store.  Yes,
> >DCE removes a limited set of stores as well, but the way we remove this kind
> >of store makes it much more suited to DSE.
> >
> >As of possibly trapping/throwing stores, we do not bother to preserve those
> >(even with -fnon-call-exceptions).
> A bit of a surprise to hear that.  I don't mind much though, as
> -fnon-call-exceptions isn't something I find terribly useful.

Well, a segfault accessing invalid pointer is undefined behavior, so it is
fine to optimize it away.

Jakub


Re: [Patch, Fortran] Print floating-point exception status after STOP/ERROR STOP

2013-06-12 Thread Tobias Burnus

Updated version:
* Uros' suggestions are incorporated
* Changed from -f(no-)underflow-warning to 
-ffpe-summary=[none,all,underflow,...]


Tobias Burnus wrote:

David: Can you have a look at libgfortran/config/fpu-aix.h - Thanks!

The attached patch causes gfortran-compiled programs to print warnings 
like
Note: The following floating-point exception are signalling: 
IEEE_DIVIDE_BY_ZERO
when STOP / ERROR STOP is invoked. That's required by Fortran 2008 
(8.4 STOP and ERROR STOP statements):


"If any exception (14) is signaling on that image, the processor shall 
issue a warning indicating which exceptions are signaling; this 
warning shall be on the unit identified by the named constant ERROR 
UNIT (13.8.2.8)."


One surely could extend it to allow to completely disable the warning 
- or to make it more fine grained like "none", "all" plus all single 
flags (including underflow, denormal and inexact, where by default one 
leaves out inexact).


Thinking about it, I think that's the better solution: It makes 
(optionally) inexact available and also allows to fully disable the 
feature. I am sure that there are users who would like to have that 
choice. Hence, I update the argument handling and libgfortran's stop.c.


Additions from the J3 list:
* IBM's "XLF compiler has an option to report fp exceptions including 
underflow and inexact.  It is default OFF."

(which matches ifort)


Built and regtested on x86-64-gnu-linux.
OK for the trunk?


Tobias

PS: I filed PR 57598 to track the warning handling for coarrays.

2013-06-12  Tobias Burnus  

	* gfortran.h (gfc_option_t): Add fpe_summary.
	* gfortran.texi (_gfortran_set_options): Update.
	* invoke.texi (-ffpe-summary): Add doc.
	* lang.opt (ffpe-summary): Add flag.
	* options.c (gfc_init_options, gfc_handle_option): Handle it.
	(gfc_handle_fpe_option): Renamed from gfc_handle_fpe_trap_option,
	also handle fpe_summary.
	* trans-decl.c (create_main_function): Update
	_gfortran_set_options call.

2013-06-12  Tobias Burnus  

	* libgfortran.h (compile_options_t) Add fpe_summary.
	(get_fpu_except_flags): New prototype.
	* runtime/compile_options.c (set_options, init_compile_options):
	Handle fpe_summary.
	* runtime/stop.c (report_exception): New function.
	(stop_numeric, stop_numeric_f08, stop_string, error_stop_string,
	error_stop_numeric): Call it.
	* config/fpu-387.h (get_fpu_except_flags): New function.
	* config/fpu-aix.h (get_fpu_except_flags): New function.
	* config/fpu-generic.h (get_fpu_except_flags): New function.
	* config/fpu-glibc.h (get_fpu_except_flags): New function.
	* config/fpu-glibc.h (get_fpu_except_flags): New function.
	* configure.ac: Check for fpxcp.h.
	* configure: Regenerate.
	* config.h.in: Regenerate.

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 14da0af..c11ffdd 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2303,6 +2303,7 @@ typedef struct
   int flag_frontend_optimize;
 
   int fpe;
+  int fpe_summary;
   int rtcheck;
   gfc_fcoarray coarray;
 
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 4a31a77..3f594f2 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -2855,13 +2855,21 @@ Default: enabled.
 are (bitwise or-ed): GFC_RTCHECK_BOUNDS (1), GFC_RTCHECK_ARRAY_TEMPS (2),
 GFC_RTCHECK_RECURSION (4), GFC_RTCHECK_DO (16), GFC_RTCHECK_POINTER (32).
 Default: disabled.
+@item @var{option}[7] @tab Unused.
+@item @var{option}[8] @tab Show a warning when invoking @code{STOP} and
+@code{ERROR STOP} if a floating-point exception occurred. Possible values
+are (bitwise or-ed) @code{GFC_FPE_INVALID} (1), @code{GFC_FPE_DENORMAL} (2),
+@code{GFC_FPE_ZERO} (4), @code{GFC_FPE_OVERFLOW} (8),
+@code{GFC_FPE_UNDERFLOW} (16), @code{GFC_FPE_INEXACT} (32). Default:
+@code{GFC_FPE_INVALID | GFC_FPE_DENORMAL | GFC_FPE_ZERO | GFC_FPE_OVERFLOW
+| GFC_FPE_UNDERFLOW}.
 @end multitable
 
 @item @emph{Example}:
 @smallexample
-  /* Use gfortran 4.8 default options.  */
-  static int options[] = @{68, 511, 0, 0, 1, 1, 0@};
-  _gfortran_set_options (7, &options);
+  /* Use gfortran 4.9 default options.  */
+  static int options[] = @{68, 511, 0, 0, 1, 1, 0, 0, 31@};
+  _gfortran_set_options (9, &options);
 @end smallexample
 @end table
 
diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index 12c200e..b541204 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -1021,6 +1021,17 @@ be uninteresting in practice.
 
 By default no exception traps are enabled.
 
+@item -ffpe-summary=@var{list}
+@opindex @code{ffpe-summary=}@var{list}
+Specify a list of floating-point exceptions, whose flag status is printed
+to @code{ERROR_UNIT} when invoking @code{STOP} and @code{ERROR STOP}.
+@var{list} can be either @samp{none}, @samp{all} or a comma-separated list
+of the following exceptions: @samp{invalid}, @samp{zero}, @samp{overflow},
+@samp{underflow}, @samp{inexact} and @samp{denormal}. (See
+@option{-ffpe-trap} for a description of the exceptions.)
+
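
To make the documented values concrete, here is a small sketch of how a -ffpe-summary list could map onto the bit mask passed as option[8].  The flag values and the default are taken from the gfortran.texi hunk above; the parser itself (fpe_flag_for_name, parse_fpe_summary, and the -1 error convention) is hypothetical and is not the code in options.c.

```c
#include <assert.h>
#include <string.h>

/* Values as documented in the gfortran.texi hunk above.  */
enum
{
  GFC_FPE_INVALID = 1,
  GFC_FPE_DENORMAL = 2,
  GFC_FPE_ZERO = 4,
  GFC_FPE_OVERFLOW = 8,
  GFC_FPE_UNDERFLOW = 16,
  GFC_FPE_INEXACT = 32
};

#define FPE_SUMMARY_ALL						\
  (GFC_FPE_INVALID | GFC_FPE_DENORMAL | GFC_FPE_ZERO		\
   | GFC_FPE_OVERFLOW | GFC_FPE_UNDERFLOW | GFC_FPE_INEXACT)
/* Documented default: everything except inexact.  */
#define FPE_SUMMARY_DEFAULT (FPE_SUMMARY_ALL & ~GFC_FPE_INEXACT)

/* Hypothetical helper: flag for one exception name, 0 if unknown.  */
static int
fpe_flag_for_name (const char *name, size_t len)
{
  static const struct { const char *name; int flag; } table[] = {
    { "invalid", GFC_FPE_INVALID }, { "denormal", GFC_FPE_DENORMAL },
    { "zero", GFC_FPE_ZERO }, { "overflow", GFC_FPE_OVERFLOW },
    { "underflow", GFC_FPE_UNDERFLOW }, { "inexact", GFC_FPE_INEXACT }
  };
  size_t k;

  for (k = 0; k < sizeof table / sizeof table[0]; k++)
    if (strlen (table[k].name) == len
	&& memcmp (table[k].name, name, len) == 0)
      return table[k].flag;
  return 0;
}

/* Hypothetical parser: "none", "all" or a comma-separated list.
   Returns the mask, or -1 for an unknown name.  */
static int
parse_fpe_summary (const char *list)
{
  int mask = 0;

  if (strcmp (list, "none") == 0)
    return 0;
  if (strcmp (list, "all") == 0)
    return FPE_SUMMARY_ALL;

  while (*list)
    {
      size_t len = strcspn (list, ",");
      int flag = fpe_flag_for_name (list, len);

      if (flag == 0)
	return -1;
      mask |= flag;
      list += len;
      if (*list == ',')
	list++;
    }
  return mask;
}
```

The default works out to 31, which matches the last element of the options[] array in the updated example.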
+

Re: [c++-concepts] code review

2013-06-12 Thread Gabriel Dos Reis
On Mon, Jun 10, 2013 at 2:30 PM, Jason Merrill  wrote:
> On 06/08/2013 09:34 AM, Andrew Sutton wrote:
>>
>> I think I previously put constraint_info in lang_decl_min, right next
>> to template_info no less. It was easy to manage there, and initialized
>> as part of build_template_decl. But this obviously doesn't work for
>> partial specializations unless they get template_decls.
>
>
> Right.  And we would want it to be specific to template_decls, not all decls
> that use lang_decl_min.

yes, exactly my feedback on the original implementation.

I am still surprised though that we don't generate TEMPLATE_DECLs for
partial instantiations (since they are still morally templates).

-- Gaby


Re: C++ PATCH to warn about undefined functions in anonymous namespace

2013-06-12 Thread Gabriel Dos Reis
On Mon, Jun 10, 2013 at 2:37 PM, Jason Merrill  wrote:
> Since members of the anonymous namespace can't be defined in another
> translation unit, we should treat them like statics for diagnostic purposes.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.

Thank you!

-- Gaby


Re: [Patch, Fortran] Print floating-point exception status after STOP/ERROR STOP

2013-06-12 Thread Jakub Jelinek
On Wed, Jun 12, 2013 at 03:05:30PM +0200, Tobias Burnus wrote:
> --- a/libgfortran/runtime/stop.c
> +++ b/libgfortran/runtime/stop.c
> @@ -32,6 +32,37 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  #endif
>  
>  
> +/* Fortran 2008 demands: If any exception (14) is signaling on that image, 
> the
> +   processor shall issue a warning indicating which exceptions are signaling;
> +   this warning shall be on the unit identified by the named constant
> +   ERROR_UNIT (13.8.2.8).  In line with other compilers, we do not report
> +   inexact - and we optionally ignore underflow, cf. thread starting at
> +   http://mailman.j3-fortran.org/pipermail/j3/2013-June/006452.html.  */
> +
> +static void
> +report_exception (void)
> +{
> +  int set_excepts = get_fpu_except_flags ();
> +  if (!set_excepts)
> +return;

If you want to mask some exceptions based on compile_options (and yes,
I think it is highly desirable to have some compile option to disable any
printout on STOP with no arguments), then you want to mask them before
testing if (!set_excepts).  Otherwise, if say underflow is the only
reported exception, you'd print
Note: The following floating-point status flag is signalling:
(and no exceptions).  So
if (!compile_options.underflow_warning)
  set_excepts &= ~GFC_FPE_UNDERFLOW;
if (!set_excepts)
  return;

would be IMHO better.

Jakub
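
The difference between the two orderings can be seen in a small model.  This is only a sketch: the three helpers are hypothetical, the flag value comes from the gfortran documentation for GFC_FPE_UNDERFLOW, and a single mask argument stands in for whatever compile option ends up controlling the report.

```c
#include <assert.h>

enum { GFC_FPE_ZERO = 4, GFC_FPE_UNDERFLOW = 16 };

/* Flags that would actually be listed after the header.  */
static int
flags_listed (int set_excepts, int mask)
{
  return set_excepts & mask;
}

/* Ordering in the patch as posted: test first, mask later while
   printing.  Returns nonzero if the "Note:" header gets printed.  */
static int
header_printed_test_first (int set_excepts, int mask)
{
  (void) mask;			/* not consulted before the early return */
  return set_excepts != 0;
}

/* Suggested ordering: mask first, then test.  */
static int
header_printed_mask_first (int set_excepts, int mask)
{
  set_excepts &= mask;
  return set_excepts != 0;
}
```

With only underflow raised and underflow masked out, the first variant prints a header followed by nothing, while the second stays silent.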


[C++ Patch] PR 38958

2013-06-12 Thread Paolo Carlini

Hi,

this bug is a sort of follow-up to PR 10416, which I fixed some time ago 
and which was about avoiding -Wunused warnings for class types with 
destructors with side-effects.


In this issue the reporter notes that we don't handle references in the 
same way; thus, considering the testcase, we do not warn for:


Lock lock = AcquireLock();

and we do for:

const Lock& lock = AcquireLock();

whereas the destructor is involved in both cases in a similar way, etc.

Thus I changed the code in poplevel to see through references for 
-Wunused-variable. Tested x86_64-linux.


Thanks,
Paolo.


/cp
2013-06-12  Paolo Carlini  

PR c++/38958
* decl.c (poplevel): For -Wunused-variable see through references.

/testsuite
2013-06-12  Paolo Carlini  

PR c++/38958
* g++.dg/warn/Wunused-var-20.C: New.
Index: cp/decl.c
===
--- cp/decl.c   (revision 200012)
+++ cp/decl.c   (working copy)
@@ -622,18 +622,22 @@ poplevel (int keep, int reverse, int functionbody)
   push_local_binding where the list of decls returned by
   getdecls is built.  */
decl = TREE_CODE (d) == TREE_LIST ? TREE_VALUE (d) : d;
+   tree type = TREE_TYPE (decl);
+   if (VAR_P (decl) && ! TREE_USED (decl))
+ // For -Wunused-variable see through references (PR 38958).
+ type = non_reference (type);
if (VAR_P (decl)
&& (! TREE_USED (decl) || !DECL_READ_P (decl))
&& ! DECL_IN_SYSTEM_HEADER (decl)
&& DECL_NAME (decl) && ! DECL_ARTIFICIAL (decl)
-   && TREE_TYPE (decl) != error_mark_node
-   && (!CLASS_TYPE_P (TREE_TYPE (decl))
-   || !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl))))
+   && type != error_mark_node
+   && (!CLASS_TYPE_P (type)
+   || !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type)))
  {
if (! TREE_USED (decl))
  warning (OPT_Wunused_variable, "unused variable %q+D", decl);
else if (DECL_CONTEXT (decl) == current_function_decl
-&& TREE_CODE (TREE_TYPE (decl)) != REFERENCE_TYPE
+&& TREE_CODE (type) != REFERENCE_TYPE
 && errorcount == unused_but_set_errorcount)
  {
warning (OPT_Wunused_but_set_variable,
Index: testsuite/g++.dg/warn/Wunused-var-20.C
===
--- testsuite/g++.dg/warn/Wunused-var-20.C  (revision 0)
+++ testsuite/g++.dg/warn/Wunused-var-20.C  (working copy)
@@ -0,0 +1,19 @@
+// PR c++/38958
+// { dg-options "-Wunused" }
+
+volatile int g;
+
+struct Lock
+{
+  ~Lock() { g = 0; }
+};
+
+Lock AcquireLock() { return Lock(); }
+
+int main()
+{
+  const Lock& lock = AcquireLock();
+  g = 1;
+  g = 2;
+  g = 3;
+}


Re: Improve uncprop and coalescing

2013-06-12 Thread Jeff Law


On 06/07/13 03:14, Richard Biener wrote:


+/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for
+   coalescing together, false otherwise.
+
+   This must stay consistent with the code in tree-ssa-live.c which
+   sets up base values in the var map.  */
+
+bool
+gimple_can_coalesce_p (tree name1, tree name2)
+{
+  /* First check the SSA_NAME's associated DECL.  We only want to
+ coalesce if they have the same DECL or both have no associated DECL.  */
+  if (SSA_NAME_VAR (name1) != SSA_NAME_VAR (name2))
+return false;
+
+  /* Now check the types.  If the types are the same, then we should
+ try to coalesce V1 and V2.  */
+  tree t1 = TREE_TYPE (name1);
+  tree t2 = TREE_TYPE (name2);
+  if (t1 == t2)
+return true;
+
+  /* If the types are not the same, check for a canonical type match.  This
+ (for example) allows coalescing when the types are fundamentally the
+ same, but just have different names.  */
+  if (TYPE_CANONICAL (t1) && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2))


Please use types_compatible_p (t1, t2) here, that's the correct API to use
here.


+return true;
+
+  return false;
+}
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 83a52a0..a624d00 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -111,8 +111,12 @@ var_map_base_init (var_map map)
as it restricts the sets we compute conflicts for.
Using TREE_TYPE to generate sets is the easies as
type equivalency also holds for SSA names with the same
-  underlying decl.  */
-   m->base.from = TREE_TYPE (var);
+  underlying decl.
+
+  Check gimple_can_coalesce_p when changing this code.  */
+   m->base.from = (TYPE_CANONICAL (TREE_TYPE (var))
+   ? TYPE_CANONICAL (TREE_TYPE (var))
+   : TREE_TYPE (var));


eh, but it's made complicated here ... so above do


if (TYPE_CANONICAL (t1) && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
 && types_compatible_p (t1, t2))

because looking at useless_type_conversion_p it looks like pointer types
with different address-spaces may have the same canonical type.  A comment
on why we check both, referring to var_map_base_init, should also be added.

Reading this again after a night of sleep, it appears you're agreeing 
that we can't just use types_compatible_p to drive what objects are put 
on the coalesce list.  The only code change you're asking for is to make 
sure we properly reject pointer types with different address spaces 
(which can be done via types_compatible_p).



Right?

jeff
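
The combined check discussed in this message can be sketched outside of GCC as follows. This is only an illustration under stated assumptions: `Type`, `compatible_p` and `can_coalesce_p` are hypothetical stand-ins, and the stand-in for types_compatible_p compares nothing but an address-space number, which is far less than the real predicate does.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a GCC type node (not the tree API).
struct Type
{
  const Type *canonical;  // stand-in for TYPE_CANONICAL, may be null
  int address_space;      // stand-in for a pointer type's address space
};

// Stand-in for types_compatible_p: here it only compares address spaces.
bool compatible_p (const Type &a, const Type &b)
{
  return a.address_space == b.address_space;
}

// Shape of the check proposed in the review: identical types coalesce;
// otherwise require a shared canonical type AND compatibility.
bool can_coalesce_p (const Type &t1, const Type &t2)
{
  if (&t1 == &t2)
    return true;
  return t1.canonical != nullptr
         && t1.canonical == t2.canonical
         && compatible_p (t1, t2);
}
```

With these stand-ins, two types that share a canonical type but differ in address space are rejected, which is the case the extra compatibility test is meant to catch.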




Re: [PATCH] Cilk Plus Array Notation for C++

2013-06-12 Thread Aldy Hernandez

[Jason/Richard: there are some things below I could use your feedback on.]

Hi Balaji.

Overall, a lot of the stuff in cp-array-notation.c looks familiar from 
the C front-end changes.  Can't you reuse a lot of it?


Otherwise, here are some minor nits...


+  /* If the function call is builtin array notation function then we do not
+need to do any type conversion.  */
+  if (flag_enable_cilkplus && fn && TREE_CODE (fn) == FUNCTION_DECL
+ && DECL_NAME (fn) && IDENTIFIER_POINTER (DECL_NAME (fn))
+ && !strncmp (IDENTIFIER_POINTER (DECL_NAME (fn)), "__sec_reduce", 12))
+   val = arg;


Don't we have BUILT_IN_CILKPLUS_SEC_REDUCE* now?  So you shouldn't need 
to poke at the actual identifier.  And even so, won't the above strncmp 
match __sec_reducegarbage?
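
To make the concern above concrete, here is a small standalone sketch showing that a 12-character prefix comparison accepts any identifier that merely starts with "__sec_reduce". The helper names are hypothetical, and the list of known names is a short illustrative subset, not the full set of builtins.

```cpp
#include <cassert>
#include <cstring>

// Prefix test as written in the quoted hunk: only 12 characters are compared.
bool has_sec_reduce_prefix (const char *name)
{
  assert (name != nullptr);
  return std::strncmp (name, "__sec_reduce", 12) == 0;
}

// Hypothetical exact-match alternative over an illustrative subset of names.
bool is_known_reduce_name (const char *name)
{
  static const char *const known[] = {
    "__sec_reduce", "__sec_reduce_add", "__sec_reduce_mul"
  };
  for (const char *k : known)
    if (std::strcmp (name, k) == 0)
      return true;
  return false;
}
```

The prefix test accepts "__sec_reducegarbage", while the exact-match version rejects it. Inside GCC, comparing the builtin function code would avoid string matching altogether, as suggested above.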



+/* This function parses Cilk Plus array notations.  The starting index is
+   passed in INIT_INDEX and the array name is passed in ARRAY_VALUE.  The
+   return value of this function is a tree node called VALUE_TREE of type
+   ARRAY_NOTATION_REF.  If some error occurred it returns error_mark_node.  */
+


It looks like a NULL in INIT_INDEX is a specially handled case.  Perhaps 
you should document that INIT_INDEX can be null and what it means. 
Also, you don't need to document what internal variable name you are 
using as a return value (VALUE_TREE).  Perhaps instead of "The return 
value..." you could write "This function returns the ARRAY_NOTATION_REF 
node." or something like it.



+case ARRAY_NOTATION_REF:
+  {
+   tree start_index, length, stride;
+   op1 = tsubst_non_call_postfix_expression (ARRAY_NOTATION_ARRAY (t),
+ args, complain, in_decl);
+   start_index = RECUR (ARRAY_NOTATION_START (t));
+   length = RECUR (ARRAY_NOTATION_LENGTH (t));
+   stride = RECUR (ARRAY_NOTATION_STRIDE (t));
+
+   /* We do type-checking here for templatized array notation triplets.  */
+   if (!TREE_TYPE (start_index)
+   || !INTEGRAL_TYPE_P (TREE_TYPE (start_index)))
+ {
+   error_at (loc, "start-index of array notation triplet is not an "
+ "integer");
+   RETURN (error_mark_node);
+ }
+   if (!TREE_TYPE (length) || !INTEGRAL_TYPE_P (TREE_TYPE (length)))
+ {
+   error_at (loc, "length of array notation triplet is not an "
+ "integer");
+   RETURN (error_mark_node);
+ }
+   if (!TREE_TYPE (stride) || !INTEGRAL_TYPE_P (TREE_TYPE (stride)))
+ {
+   error_at (loc, "stride of array notation triplet is not an "
+ "integer");
+   RETURN (error_mark_node);
+ }
+   if (TREE_CODE (TREE_TYPE (op1)) == FUNCTION_TYPE)
+ {
+   error_at (loc, "array notations cannot be used with function type");
+   RETURN (error_mark_node);
+ }
+   RETURN (build_array_notation_ref (EXPR_LOCATION (t), op1, start_index,
+ length, stride, TREE_TYPE (op1)));
+  }



You do all this type checking here, but aren't you doing the same type 
checking in build_array_notation_ref() which you're going to call 
anyway?  It looks like there is some code duplication going on.


Also, I see you have a build_array_notation_ref() in 
cp/cp-array-notation.c and also in c/c-array-notation.c.  Can you not 
implement one function that handles both C and C++, or at the very least 
reuse some of the common things?


You are missing a ChangeLog entry for the above snippet.


+  /* If the return expr. has a builtin array notation function, then its
+OK.  */
+  if (rank >= 1)
+   {
+ error_at (input_location, "array notation expression cannot be "
+   "used as a return value");
+ return error_mark_node;
+   }


The comment doesn't seem to match the code, or am I missing something?


+  /* If find_rank returns false,  then it should have reported an error,


Extra whitespace.


+  if (rank > 1)
+   {
+ error_at (loc, "rank of the array%'s index is greater than 1");
+ return error_mark_node;
+   }


No corresponding test.


+  /* If we are dealing with built-in array notation function then we don't need
+ to convert them. They will be broken up into modify exprs in future,
+ during which all these checks will be done.  */


Line too long, please wrap.

There are various lines throughout your patch that are pretty long (both 
in code and in ChangeLog entries).  I don't know what the official GNU 
guidelines say, but what I usually see as prior art in the GCC code base 
is something along the lines of wrapping around column 72.  Perhaps 
someone can pontificate on this, but lines reaching the 78-80 columns 
look pretty darn long to me.



diff --git gcc/testsuite/c-c++-common/cilk-plus/AN/sec_implicit_ex.c 
gcc/testsuite/c-c++-common/cilk-plus/AN/sec_implicit_

Re: [c++-concepts] code review

2013-06-12 Thread Jason Merrill

On 06/12/2013 11:53 AM, Gabriel Dos Reis wrote:

I am still surprised though that we don't generate TEMPLATE_DECLs for
partial instantiations (since they are still morally templates.)


Yes, we should.

Jason



Re: [C++ Path] PR 38958

2013-06-12 Thread Jason Merrill

On 06/12/2013 11:58 AM, Paolo Carlini wrote:

-&& TREE_CODE (TREE_TYPE (decl)) != REFERENCE_TYPE
+&& TREE_CODE (type) != REFERENCE_TYPE


This change is wrong; we specifically want to suppress the 
unused-but-set warning for reference variables.  Drop it and the patch 
is OK.


Jason



[PATCH][ARM][testsuite] Add 'dg-require-effective-target sync_*' to some atomic tests

2013-06-12 Thread Meador Inge
Hi All,

This patch adds the needed 'dg-require-effective-target sync_*' lines to some
of the atomic tests so that they will not be run if the appropriate atomic
support is not available.

OK for trunk?

2013-06-12  Meador Inge  

* gcc.dg/atomic-flag.c: Add dg-require-effective-target sync_*.
* g++.dg/simulate-thread/atomics-2.C: Likewise.
* g++.dg/simulate-thread/atomics-1.C: Likewise.

Index: gcc/testsuite/gcc.dg/atomic-flag.c
===
--- gcc/testsuite/gcc.dg/atomic-flag.c	(revision 199961)
+++ gcc/testsuite/gcc.dg/atomic-flag.c	(working copy)
@@ -1,5 +1,6 @@
 /* Test __atomic routines for existence and execution.  */
 /* { dg-do run } */
+/* { dg-require-effective-target sync_char_short } */
 
 /* Test that __atomic_test_and_set and __atomic_clear builtins execute.  */
 
Index: gcc/testsuite/g++.dg/simulate-thread/atomics-2.C
===
--- gcc/testsuite/g++.dg/simulate-thread/atomics-2.C	(revision 199961)
+++ gcc/testsuite/g++.dg/simulate-thread/atomics-2.C	(working copy)
@@ -1,6 +1,7 @@
 /* { dg-do link } */
 /* { dg-options "-std=c++0x" } */
 /* { dg-final { simulate-thread } } */
+/* { dg-require-effective-target sync_int_long } */
 
 using namespace std;
 
Index: gcc/testsuite/g++.dg/simulate-thread/atomics-1.C
===
--- gcc/testsuite/g++.dg/simulate-thread/atomics-1.C	(revision 199961)
+++ gcc/testsuite/g++.dg/simulate-thread/atomics-1.C	(working copy)
@@ -1,6 +1,8 @@
 /* { dg-do link } */
 /* { dg-options "-std=c++0x" } */
 /* { dg-final { simulate-thread } } */
+/* { dg-require-effective-target sync_char_short } */
+/* { dg-require-effective-target sync_int_long } */
 
 /* Test that atomic int and atomic char work properly.  */
 


RE: [PATCH] Cilk Plus Array Notation for C++

2013-06-12 Thread Iyer, Balaji V
Hi Aldy,
Below are my responses to a couple of the things you pointed out.

Thanks,

Balaji V. Iyer.

> -Original Message-
> From: Aldy Hernandez [mailto:al...@redhat.com]
> Sent: Wednesday, June 12, 2013 12:34 PM
> To: Iyer, Balaji V
> Cc: gcc-patches@gcc.gnu.org; Jason Merrill (ja...@redhat.com);
> r...@redhat.com
> Subject: Re: [PATCH] Cilk Plus Array Notation for C++
> 
> [Jason/Richard: there are some things below I could use your feedback on.]
> 
> Hi Balaji.
> 
> Overall, a lot of the stuff in cp-array-notation.c looks familiar from the C
> front-end changes.  Can't you reuse a lot of it?

I looked into combining much of the functionality. What stopped me was 
templates and the extra tree codes. For example, IF_STMT, FOR_STMT, MODOP_EXPR, 
etc. are not available in C but are in C++, so I had to add additional checks 
for those. Also, if we are processing templates we have to create different 
kinds of trees (e.g. MODOP_EXPR instead of MODIFY_EXPR).

One way to do it is to break out the places where I am using C++-specific code 
and add a language hook to handle those. I tried doing that a while back; the 
whole thing looked a lot messier, and I imagine it would be hard to debug in 
the future (...at least for me). This layout looked better organized to me, 
even though some code is repeated. Also, the function names are repeated 
because they do similar things in C and C++; the only difference is that the 
bodies of the functions differ.

> 
> > +case ARRAY_NOTATION_REF:
> > +  {
> > +   tree start_index, length, stride;
> > +   op1 = tsubst_non_call_postfix_expression (ARRAY_NOTATION_ARRAY (t),
> > + args, complain, in_decl);
> > +   start_index = RECUR (ARRAY_NOTATION_START (t));
> > +   length = RECUR (ARRAY_NOTATION_LENGTH (t));
> > +   stride = RECUR (ARRAY_NOTATION_STRIDE (t));
> > +
> > +   /* We do type-checking here for templatized array notation triplets.  */
> > +   if (!TREE_TYPE (start_index)
> > +   || !INTEGRAL_TYPE_P (TREE_TYPE (start_index)))
> > + {
> > +   error_at (loc, "start-index of array notation triplet is not an "
> > + "integer");
> > +   RETURN (error_mark_node);
> > + }
> > +   if (!TREE_TYPE (length) || !INTEGRAL_TYPE_P (TREE_TYPE (length)))
> > + {
> > +   error_at (loc, "length of array notation triplet is not an "
> > + "integer");
> > +   RETURN (error_mark_node);
> > + }
> > +   if (!TREE_TYPE (stride) || !INTEGRAL_TYPE_P (TREE_TYPE (stride)))
> > + {
> > +   error_at (loc, "stride of array notation triplet is not an "
> > + "integer");
> > +   RETURN (error_mark_node);
> > + }
> > +   if (TREE_CODE (TREE_TYPE (op1)) == FUNCTION_TYPE)
> > + {
> > +   error_at (loc, "array notations cannot be used with function type");
> > +   RETURN (error_mark_node);
> > + }
> > +   RETURN (build_array_notation_ref (EXPR_LOCATION (t), op1, start_index,
> > + length, stride, TREE_TYPE (op1)));
> > +  }
> 
> 
> You do all this type checking here, but aren't you doing the same type
> checking in build_array_notation_ref() which you're going to call anyway?  It
> looks like there is some code duplication going on.

The reason we do this second type check here is that we don't know what the 
types will be at parse time. For example, in:

T x,y,z
A[x:y:z]

x, y, z could be floats, which should be flagged as an error, but if x, y, z 
are ints, then it's OK. We don't have this information until we reach this spot 
in pt.c.
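
The point above, that the operand types of a triplet inside a template are only known at instantiation, can be illustrated in plain C++ without array notation. This is a sketch only; `triplet_types_ok` and `triplet_ok_for` are hypothetical names and do not correspond to anything in the patch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: are all three triplet operand types integral?
template <typename Start, typename Length, typename Stride>
constexpr bool triplet_types_ok ()
{
  return std::is_integral<Start>::value
         && std::is_integral<Length>::value
         && std::is_integral<Stride>::value;
}

// Inside a template, T is unknown until instantiation, so the answer
// can only be computed once a concrete T has been substituted.
template <typename T>
constexpr bool triplet_ok_for ()
{
  return triplet_types_ok<T, T, T> ();
}
```

Instantiating with `int` yields true and with `float` yields false, which mirrors why the check has to be repeated during template substitution rather than only at parse time.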

> 
> Also, I see you have a build_array_notation_ref() in cp/cp-array-notation.c
> and also in c/c-array-notation.c.  Can you not implement one function that
> handles both C and C++, or at the very least reuse some of the common things?

I looked into that also, but templates got in the way.

> 
> You are missing a ChangeLog entry for the above snippet.

That I will put in.


> 
> > +  XDELETEVEC (compare_expr);
> > +  XDELETEVEC (expr_incr);
> > +  XDELETEVEC (ind_init);
> > +  XDELETEVEC (array_var);
> > +
> > +  for (ii = 0; ii < list_size; ii++)
> > +{
> > +  XDELETEVEC (count_down[ii]);
> > +  XDELETEVEC (array_value[ii]);
> > +  XDELETEVEC (array_stride[ii]);
> > +  XDELETEVEC (array_length[ii]);
> > +  XDELETEVEC (array_start[ii]);
> > +  XDELETEVEC (array_ops[ii]);
> > +  XDELETEVEC (array_vector[ii]);
> > +}
> > +
> > +  XDELETEVEC (count_down);
> > +  XDELETEVEC (array_value);
> > +  XDELETEVEC (array_stride);
> > +  XDELETEVEC (array_length);
> > +  XDELETEVEC (array_start);
> > +  XDELETEVEC (array_ops);
> > +  XDELETEVEC (array_vector);
> 
> I see a lot of this business going on.  Perhaps one of the core
> maintainers can comment, but I would rather use an obstack, and avoid
> having to keep track of all these little buckets-- which seems rather
> error prone, and then free the o

Re: [gomp4] Some progress on #pragma omp simd

2013-06-12 Thread Richard Henderson
On 04/27/2013 11:17 AM, Jakub Jelinek wrote:
> where simd_uid would be some say integer constant, unique to the simd loop
> (at least unique within the same function, and perhaps inlining/LTO would
> need to remap).

If all we need is uniqueness, then perhaps an otherwise unused decl would do?
We're already prepared to remap those during inlining/LTO...

> treat arrays indexed by __builtin_omp.simd_lane (simd_uid) (dot in the name
> just to make it impossible to be used by users) (or marked with some special
> hidden attribute or something)

I see

/* This file specifies a list of internal "functions".  These functions
   differ from built-in functions in that they have no linkage and cannot
   be called directly by the user.  They represent operations that are only
   synthesised by GCC itself.

and think that may be more applicable than adding dots to regular builtins.


r~


[PATCH] for for c/PR57541

2013-06-12 Thread Iyer, Balaji V
Hello Everyone,
Attached, please find a patch that fixes the issues in C/PR57541. The 
issue reported was that the parameters passed into the builtin array notation 
reduction functions were not checked correctly. This patch should fix that 
issue. It is tested on x86 and x86_64 and it seems to pass all the tests. I 
have also included a test case. 

Here are the ChangeLog entries:

gcc/c/ChangeLog
2013-06-12  Balaji V. Iyer  

* c-array-notation.c (fix_builtin_array_notation_fn): Added a call to
valid_no_reduce_fn_params_p and valid_reduce_fn_params_p.
(build_array_notation_expr): Added a check for capturing the return
value (i.e. void) of __sec_reduce_mutating function.


gcc/c-family/ChangeLog:
2013-06-12  Balaji V. Iyer  

* array-notation-common.c (valid_reduce_fn_params_p): New function.
(valid_no_reduce_fn_params_p): Likewise.
* c-common.h (valid_reduce_fn_params_p): Added a new prototype.
(valid_no_reduce_fn_params_p): Likewise.

gcc/testsuite/ChangeLog
2013-06-12  Balaji V. Iyer  

PR c/57541
* c-c++-common/cilk-plus/AN/pr57541-2.c: New test.
* c-c++-common/cilk-plus/AN/rank_mismatch2.c: Fixed a bug by replacing
a comma with an operation.


Thanks,

Balaji V. Iyer.
diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
old mode 100644
new mode 100755
index 3d8f68f..52a58a6
Binary files a/gcc/c-family/ChangeLog and b/gcc/c-family/ChangeLog differ
diff --git a/gcc/c-family/array-notation-common.c b/gcc/c-family/array-notation-common.c
old mode 100644
new mode 100755
index 489b67c..6c1c7e2
--- a/gcc/c-family/array-notation-common.c
+++ b/gcc/c-family/array-notation-common.c
@@ -560,3 +560,125 @@ find_correct_array_notation_type (tree op)
 } 
   return return_type;
 }
+
+/* Returns true if the function call in BUILTIN_FN (of type CALL_EXPR) if the
+   number of parameters for the array notation reduction functions are
+   correct.  */
+
+bool
+valid_no_reduce_fn_params_p (tree builtin_fn)
+{
+  switch (is_cilkplus_reduce_builtin (CALL_EXPR_FN (builtin_fn)))
+{
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ADD:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MUL:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_ZERO:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_ZERO:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MAX:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MIN:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MIN_IND:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MAX_IND:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_NONZERO:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_NONZERO:
+  if (call_expr_nargs (builtin_fn) != 1)
+   {
+ error_at (EXPR_LOCATION (builtin_fn),
+   "builtin function %qE can only have one argument",
+   CALL_EXPR_FN (builtin_fn));
+ return false;
+   }
+  break;
+case BUILT_IN_CILKPLUS_SEC_REDUCE:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MUTATING:
+  if (call_expr_nargs (builtin_fn) != 3)
+   {
+ error_at (EXPR_LOCATION (builtin_fn),
+   "builtin function %qE must have 3 arguments",
+   CALL_EXPR_FN (builtin_fn));
+ return false;
+   }
+  break;
+default:
+  /* If it is not a builtin function, then no reason for us do any checking
+here.  */
+  return true;
+}
+  return true;
+}
+
+/* Returns true if the parameters of BUILTIN_FN (array notation builtin
+   function): IDENTITY_VALUE and FUNC_PARM are valid.  */
+
+bool
+valid_reduce_fn_params_p (tree builtin_fn, tree identity_value, tree func_parm)
+{
+  switch (is_cilkplus_reduce_builtin (CALL_EXPR_FN (builtin_fn)))
+{
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ADD:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MUL:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_ZERO:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_ZERO:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MAX:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MIN:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MIN_IND:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_MAX_IND:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_NONZERO:
+case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_NONZERO:
+  func_parm = CALL_EXPR_ARG (builtin_fn, 0);
+  if (!contains_array_notation_expr (func_parm))
+   {
+ error_at (EXPR_LOCATION (builtin_fn),
+   "builtin function %qE must contain one argument with "
+   "array notations", CALL_EXPR_FN (builtin_fn));
+ return false;
+   }
+  if (TREE_CODE (func_parm) == CALL_EXPR
+ && is_cilkplus_reduce_builtin (CALL_EXPR_FN (func_parm)) !=
+ BUILT_IN_NONE)
+   {
+ error_at (EXPR_LOCATION (builtin_fn),
+   "builtin functions cannot be used as arguments for %qE",
+   CALL_EXPR_FN (builtin_fn));
+ return false;
+   }
+  return true;
+case BUILT_IN_CILKPLUS_SEC_REDUCE:
+  if (!contains_array_notation_expr (func_parm))
+   {
+ error_at (EXPR_L

Re: [C++ Path] PR 38958

2013-06-12 Thread Paolo Carlini


Hi,

Jason Merrill wrote:

>On 06/12/2013 11:58 AM, Paolo Carlini wrote:
>> - && TREE_CODE (TREE_TYPE (decl)) != REFERENCE_TYPE
>> + && TREE_CODE (type) != REFERENCE_TYPE
>
>This change is wrong; we specifically want to suppress the
>unused-but-set warning for reference variables.  Drop it and the patch
>is OK.

I understand the general issue, but I'm not sure we can't use 'type' here: we 
don't even consider emitting the unused-but-set warning when TREE_USED (decl) 
is false and in this case I don't call non_reference at the outset. I think ;)

Paolo



Re: [gomp4] Some progress on #pragma omp simd

2013-06-12 Thread Jakub Jelinek
On Wed, Jun 12, 2013 at 10:21:53AM -0700, Richard Henderson wrote:
> On 04/27/2013 11:17 AM, Jakub Jelinek wrote:
> > where simd_uid would be some say integer constant, unique to the simd loop
> > (at least unique within the same function, and perhaps inlining/LTO would
> > need to remap).
> 
> If all we need is uniqueness, then perhaps an otherwise unused decl would do?
> We're already prepared to remap those during inlining/LTO...

So the built-ins would take address of this decl, something else?
Then there is the _simduid_ clause (also can hold address of the decl), and
after lowering it lives only in loop structure (so perhaps
remove_unused_locals would need to mark the decls referenced from loop
structure as used?).

> > treat arrays indexed by __builtin_omp.simd_lane (simd_uid) (dot in the name
> > just to make it impossible to be used by users) (or marked with some special
> > hidden attribute or something)
> 
> I see
> 
> /* This file specifies a list of internal "functions".  These functions
>differ from built-in functions in that they have no linkage and cannot
>be called directly by the user.  They represent operations that are only
>synthesised by GCC itself.
> 
> and think that may be more applicable than adding dots to regular builtins.

I can certainly try to use an internal function instead of a builtin with a dot
in its name; I will report later whether it is possible and how many changes it
would need.

Jakub


Re: [gomp4] Some progress on #pragma omp simd

2013-06-12 Thread Richard Henderson
On 06/12/2013 10:30 AM, Jakub Jelinek wrote:
> So the built-ins would take address of this decl, something else?

Perhaps address, perhaps just referenced uninitialized?

> Then there is the _simduid_ clause (also can hold address of the decl), and
> after lowering it lives only in loop structure (so perhaps
> remove_unused_locals would need to mark the decls referenced from loop
> structure as used?).

But that simd_uid clause refers to the same decl as the builtins, so the
builtins should keep the decl around, at least until they themselves are
transformed.  At which point the decl is no longer needed, no?

Indeed, I am really hoping that the decl vanishes completely before rtl.


r~


Re: [C++ Path] PR 38958

2013-06-12 Thread Paolo Carlini


Humpf,

>I understand the general issue, but I'm not sure we can't use 'type'
>here: we don't even consider emitting the unused-but-set warning when
>TREE_USED (decl) is false and in this case I don't call non_reference
>at the outset.

I meant *only* in this case I call non_reference, you see my point.

Rewording: as the one line comment says, I only call non_reference at the 
outset when I know that either we'll end up producing the normal 
unused-variable warning or nothing at all.

Paolo



Re: [PATCH] Cilk Plus Array Notation for C++

2013-06-12 Thread Aldy Hernandez



Overall, a lot of the stuff in cp-array-notation.c looks familiar
from the C front-end changes.  Can't you reuse a lot of it?


I looked into combining much of the functionality. What stopped me
was templates and the extra tree codes. For example, IF_STMT,
FOR_STMT, MODOP_EXPR, etc. are not available in C but are in C++, so
I had to add additional checks for those. Also, if we are
processing templates we have to create different kinds of trees (e.g.
MODOP_EXPR instead of MODIFY_EXPR).


I see.



One way to do it is to break out the places where I am using
C++-specific code and add a language hook to handle those. I tried doing
that a while back; the whole thing looked a lot messier, and I
imagine it would be hard to debug in the future (...at least for me).
This layout looked better organized to me, even though some code is repeated.


That's what I had in mind, but if you tried it and it looks worse, I 
guess I can live with it.



You do all this type checking here, but aren't you doing the same
type checking in build_array_notation_ref() which you're going to
call anyway?  It looks like there is some code duplication going
on.


The reason why we do this second type checking here is because we
don't know what they could be when we are parsing it. For example,
in:


Couldn't you abstract the type checking out into a helper function 
shared by both routines?
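
One possible shape for such a shared helper is sketched below. It is an illustration under stated assumptions, not a proposal for the actual GCC interface: `Operand` and `check_triplet` are hypothetical, and only the diagnostic texts are taken from the quoted hunk.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for what the front end knows about one operand.
struct Operand
{
  bool has_type;     // stand-in for TREE_TYPE (x) being non-null
  bool is_integral;  // stand-in for INTEGRAL_TYPE_P (TREE_TYPE (x))
};

// Returns nullptr if the triplet is acceptable, otherwise the diagnostic
// text.  Both the parser path and the template-substitution path could
// call one routine like this instead of repeating the three tests.
const char *check_triplet (const Operand &start, const Operand &length,
                           const Operand &stride)
{
  if (!start.has_type || !start.is_integral)
    return "start-index of array notation triplet is not an integer";
  if (!length.has_type || !length.is_integral)
    return "length of array notation triplet is not an integer";
  if (!stride.has_type || !stride.is_integral)
    return "stride of array notation triplet is not an integer";
  return nullptr;
}
```

Each caller would then report the returned message at its own location and return error_mark_node, keeping the per-language code limited to how the error is emitted.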



Also, I see you have a build_array_notation_ref() in
cp/cp-array-notation.c and also in c/c-array-notation.c.  Can you
not implement one function that handles both C and C++, or at the
very least reuse some of the common things?


I looked into that also, but templates got in the way.


Ughh... ok, I'll let Jason deal with this then.


+  XDELETEVEC (compare_expr);
+  XDELETEVEC (expr_incr);
+  XDELETEVEC (ind_init);
+  XDELETEVEC (array_var);
+
+  for (ii = 0; ii < list_size; ii++)
+{
+  XDELETEVEC (count_down[ii]);
+  XDELETEVEC (array_value[ii]);
+  XDELETEVEC (array_stride[ii]);
+  XDELETEVEC (array_length[ii]);
+  XDELETEVEC (array_start[ii]);
+  XDELETEVEC (array_ops[ii]);
+  XDELETEVEC (array_vector[ii]);
+}
+
+  XDELETEVEC (count_down);
+  XDELETEVEC (array_value);
+  XDELETEVEC (array_stride);
+  XDELETEVEC (array_length);
+  XDELETEVEC (array_start);
+  XDELETEVEC (array_ops);
+  XDELETEVEC (array_vector);


I see a lot of this business going on.  Perhaps one of the core
maintainers can comment, but I would rather use an obstack, and
avoid having to keep track of all these little buckets-- which
seems rather error prone, and then free the obstack all in one
swoop.  But I'll defer to Richard or Jason.



They are temporary variables that are used to store information
necessary for expansion. To me, dynamic arrays seem to be the most
straight-forward way to do it. Changing them would involve pretty
much rewriting the whole thing and thus maybe breaking the stability.
So, if it is not a huge issue, I would like to keep the dynamic
arrays. They are not being used anywhere else just inside the
function.



This is not huge, so don't worry, but XNEWVEC is just a wrapper to 
xmalloc (see include/libiberty.h).  You could do the exact thing with 
XOBNEWVEC and save yourself all the XDELETEVECs, with one obstack_free().
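
The idea of allocating many temporary arrays and releasing them with one call can be sketched as below. Note that this deliberately does not use the obstack API: `Arena`, `alloc_vec` and `release_all` are hypothetical names for a minimal standard C++ stand-in, meant only to show the ownership pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Minimal arena used as a stand-in for an obstack (not libiberty's API).
class Arena
{
public:
  // Allocate N value-initialized objects of a trivially destructible type.
  template <typename T>
  T *alloc_vec (std::size_t n)
  {
    static_assert (std::is_trivially_destructible<T>::value,
                   "Arena never runs destructors");
    std::unique_ptr<unsigned char[]> block (new unsigned char[sizeof (T) * n]);
    T *first = reinterpret_cast<T *> (block.get ());
    for (std::size_t i = 0; i < n; ++i)
      new (first + i) T ();  // value-initialize each element in place
    blocks_.push_back (std::move (block));
    return first;
  }

  std::size_t block_count () const { return blocks_.size (); }

  // One call frees everything handed out so far.
  void release_all () { blocks_.clear (); }

private:
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```

With this pattern the per-array and per-element cleanup calls collapse into a single `release_all ()` at the end of the function, which is the bookkeeping saving described above.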


Re: [C++ Path] PR 38958

2013-06-12 Thread Jason Merrill

On 06/12/2013 01:37 PM, Paolo Carlini wrote:

Rewording: as the one line comment says, I only call non_reference at the 
outset when I know that either we'll end up producing the normal 
unused-variable warning or nothing at all.


Oh, I see.  But that's a rather subtle difference; better to have 'type' 
mean something consistent and leave the unused-but-set code checking 
TREE_TYPE (decl).


Jason



Re: [PATCH] DATA_ALIGNMENT vs. DATA_ABI_ALIGNMENT (PR target/56564)

2013-06-12 Thread Edmar Wienskoski
On e500v2 (SPE) hardware, if the address of a vector (double-word) load or
store is not double-word aligned, the instruction will trap.

So this alignment is not optional.
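
As a small illustration of what the requirement means at the source level, the sketch below checks whether an address is a multiple of 8 bytes. It assumes a double word is 8 bytes (two 32-bit words); `double_word_aligned_p` and `Vec64` are hypothetical names, and nothing here is specific to SPE.

```cpp
#include <cassert>
#include <cstdint>

// True if P is a multiple of 8 bytes (assumed double-word size).
bool double_word_aligned_p (const void *p)
{
  return reinterpret_cast<std::uintptr_t> (p) % 8 == 0;
}

// A 64-bit payload whose 8-byte alignment is requested explicitly.
struct alignas (8) Vec64
{
  unsigned char bytes[8];
};
```

An object of type `Vec64` always passes the check, while an address one byte into it does not; on hardware that traps on misaligned accesses, only the former would be safe to use with a double-word load or store.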

Edmar


On Fri, Jun 7, 2013 at 3:43 PM, Richard Henderson  wrote:
> On 06/07/2013 12:25 PM, Jakub Jelinek wrote:
>> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls
>> for optimization purposes beyond ABI mandated levels.  It is fine to emit
>> the vars aligned as much as we want for optimization purposes, but if we
>> can't be sure that references to that decl bind to the definition we
>> increased the alignment on (e.g. common variables, or -fpic code without
>> hidden visibility, weak vars etc.), we can't assume that alignment.
>
> When the linker merges common blocks, it chooses both maximum size and maximum
> alignment.  Thus for any common block for which we can prove the block must
> reside in the module (any executable, or hidden common in shared object), we
> can go ahead and use the increased alignment.
>
> It's only in shared objects with non-hidden common blocks that we have a
> problem, since in that case we may resolve the common block via copy reloc to
> a memory block in another module.
>
> So while decl_binds_to_current_def_p is a good starting point, I think we can
> do a little better with common blocks.  Which ought to take care of those
> vectorization regressions you mention.
>
>> @@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out
>>align = MAX_OFILE_ALIGNMENT;
>>  }
>>
>> -  /* On some machines, it is good to increase alignment sometimes.  */
>> -  if (! DECL_USER_ALIGN (decl))
>> +  /* On some machines, it is good to increase alignment sometimes.
>> + But as DECL_ALIGN is used both for actually emitting the variable
>> + and for code accessing the variable as guaranteed alignment, we
>> + can only increase the alignment if it is a performance optimization
>> + if the references to it must bind to the current definition.  */
>> +  if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl))
>>  {
>>  #ifdef DATA_ALIGNMENT
>>unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align);
>> @@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out
>>   }
>>  #endif
>>  }
>> +#ifdef DATA_ABI_ALIGNMENT
>> +  else if (! DECL_USER_ALIGN (decl))
>> +{
>> +  unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), align);
>> +  /* For backwards compatibility, don't assume the ABI alignment for
>> +  TLS variables.  */
>> +  if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD)
>> + align = data_align;
>> +}
>> +#endif
>
> This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is
> defined, but DATA_ALIGNMENT isn't.  And while I realize you documented it, I
> don't like the restriction that D_A /must/ return something larger than D_A_A.
>  All that means is that in complex cases D_A will have to call D_A_A itself.
>
> I would think that it would be better to rearrange as
>
>   if (!D_U_A)
> {
>   #ifdef D_A_A
>   align = ...
>   #endif
>   #ifdef D_A
>   if (d_b_t_c_d_p)
> align = ...
>   #endif
> }
>
> Why the special case for TLS?  If we really want that special case surely that
> test should go into D_A_A itself, and not here in generic code.
>
>> Bootstrapped/regtested on x86_64-linux and i686-linux.  No idea about other
>> targets, I've kept them all using DATA_ALIGNMENT, which is considered
>> optimization increase only now, if there is some ABI mandated alignment
>> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as
>> well as DATA_ALIGNMENT.
>
> I've had a brief look over the instances of D_A within the tree atm.  Most of
> them carry the cut-n-paste comment "for the same reasons".  These I believe
> never intended an ABI change, and were really only interested in optimization.
>
> But these I think require a good hard look to see if they really intended an
> ABI alignment:
>
> c6x     comment explicitly mentions abi
> cris    compiler options for alignment -- systemwide or local?
> mmix    comment mentions GETA instruction
> s390    comment mentions LARL instruction
> rs6000  SPE and E500 portion of the alignment non-optional?
>
> Relevant port maintainers CCed.
>
>
> r~


Re: Remove self-assignments

2013-06-12 Thread Marc Glisse

On Wed, 12 Jun 2013, Richard Biener wrote:


On Wed, Jun 12, 2013 at 10:47 AM, Marc Glisse  wrote:


Essentially never. I tried with the fold_stmt version of the patch, and
libstdc++-v3/src/c++98/concept-inst.cc is the only file where it triggers.
Note that the case:
b=*a
*a=b
is already handled by FRE which removes *a=b (copyprop later removes the
dead b=*a and release_ssa removes the unused variable b), it is only *a=*a
that wasn't handled. Now that I look at it, it is a bit surprising that:

struct A {int i;};
void f(A*a,A*b){ A c=*b; *a=c; }
void g(A*a,A*b){ *a=*b; }

gives 2 different .optimized gimple:

  c$i_5 = MEM[(const struct A &)b_2(D)];
  MEM[(struct A *)a_3(D)] = c$i_5;

and:

  *a_2(D) = MEM[(const struct A &)b_3(D)];

Aren't they equivalent? And if so, which form should be preferred?


Well, the first is optimized by SRA to copy element-wise and thus the
loads/stores have is_gimple_reg_type () which require separate loads/stores.
The second is an aggregate copy where we cannot generate SSA temporaries
for the result of the load (!is_gimple_reg_type ()) and thus we are required
to have a single statement.

One of my pending GIMPLE re-org tasks is to always separate loads and
stores and allow SSA names of aggregate type, thus we'd have

tem_1 = MEM[(const struct A &)b_3(D)];
*a_2(D) = tem_1;

even for the 2nd case.  That solves the fact that we are missing an
aggregate copy propagation pass quite nicely.

Yes, you have to watch for not creating (too many) overlapping life-ranges
as out-of-SSA won't be able to assign the temporary aggregate SSA names
to registers but possibly has to allocate costly stack space for them.

Setting SSA_NAME_OCCURS_IN_ABNORMAL_PHI on them solves this
issue in a hacky way.

On my list since about 5 years ... ;)


I was going to ask if I should wait for it (it makes my patch 
unnecessary), but I guess that answers it. Since Jeff seems ok, I 
committed it at r200034.


--
Marc Glisse


RE: [PATCH] Cilk Plus Array Notation for C++

2013-06-12 Thread Iyer, Balaji V


> -----Original Message-----
> From: Aldy Hernandez [mailto:al...@redhat.com]
> Sent: Wednesday, June 12, 2013 1:40 PM
> To: Iyer, Balaji V
> Cc: gcc-patches@gcc.gnu.org; Jason Merrill (ja...@redhat.com);
> r...@redhat.com
> Subject: Re: [PATCH] Cilk Plus Array Notation for C++
> 
> 
> >> Overall, a lot of the stuff in cp-array-notation.c looks familiar
> >> from the C front- end changes.  Can't you reuse a lot of it?
> >
> > I looked into trying to combine much of the functionality. The issue that
> > prohibited me was templates and extra trees. For example, IF_STMT,
> > FOR_STMT, MODOP_EXPR, etc. are not available in C but are in C++. So, I
> > had to add this additional check for those. Also, if we are processing
> > templates we have to create different kinds of trees (e.g. MODOP_EXPR
> > instead of MODIFY_EXPR).
> 
> I see.
> 
> >
> > One way to do it is to break up the places where I am using C++
> > specific code and add a language hook to handle those. I tried doing
> > that a while back and the whole thing looked a lot messier, and I would
> > imagine it would be hard to debug in future (...at least for me).
> > This looked organized to me, even though some code looks repeated.
> 
> That's what I had in mind, but if you tried it and it looks worse, I guess I
> can live with it.
> 
> >> You do all this type checking here, but aren't you doing the same
> >> type checking in build_array_notation_ref() which you're going to
> >> call anyway?  It looks like there is some code duplication going on.
> >
> > The reason why we do this second type checking here is because we
> > don't know what they could be when we are parsing it. For example,
> > in:
> 
> Couldn't you abstract the type checking out into a helper function shared by
> both routines?


Yes, I could do that. I will fix it in the upcoming Array Notation for C++
patch.

Thanks,

Balaji V. Iyer.


Re: RFC [MIPS, RS6000] Mangling of IBM long double template literals

2013-06-12 Thread Joseph S. Myers
On Thu, 13 Jun 2013, Alan Modra wrote:

> This is of course an ABI change for any existing little-endian users
> of IBM long double literals in templates.  On powerpc, I think we can
> safely say there are no such users.  However it does look like MIPS
> also uses a variant of IBM long double, and I'm less certain there.

That variant was used for IRIX, for which support was removed over a year 
ago.  It would be good to remove the code that I noted in 
 was thereby made 
obsolete.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [patch] set MULTIARCH_DIRNAME for multilib architectures

2013-06-12 Thread Richard Sandiford
Matthias Klose  writes:
> Index: config/mips/t-linux64
> ===
> --- config/mips/t-linux64 (revision 200012)
> +++ config/mips/t-linux64 (working copy)
> @@ -24,3 +24,13 @@
>   ../lib32$(call 
> if_multiarch,:mips64$(MIPS_EL)-linux-gnuabin32$(MIPS_SOFT)) \
>   ../lib$(call if_multiarch,:mips$(MIPS_EL)-linux-gnu$(MIPS_SOFT)) \
>   ../lib64$(call 
> if_multiarch,:mips64$(MIPS_EL)-linux-gnuabi64$(MIPS_SOFT))
> +
> +ifneq (,$(findstring abin32,$(target)))
> +MULTIARCH_DIRNAME = $(call 
> if_multiarch,mips64$(MIPS_EL)-linux-gnuabin32$(MIPS_SOFT))
> +else
> +ifneq (,$(findstring abi64,$(target)))
> +MULTIARCH_DIRNAME = $(call 
> if_multiarch,mips64$(MIPS_EL)-linux-gnuabi64$(MIPS_SOFT))
> +else
> +MULTIARCH_DIRNAME = $(call if_multiarch,mips$(MIPS_EL)-linux-gnu$(MIPS_SOFT))
> +endif
> +endif

findstring seems a bit fragile for a full triple.  I think it would
be better to have something similar to the current MIPS_SOFT definition:

MIPS_SOFT = $(if $(strip $(filter MASK_SOFT_FLOAT_ABI, $(target_cpu_default)) $(filter soft, $(with_float))),soft)

but for ABIs.  It could then also take with_abi into account.
Maybe something like:

MIPS_ABI = $(or $(with_abi), \
$(if $(filter MIPS_ABI_DEFAULT=ABI_N32, \
  $(target_cpu_default)), n32), \
o32)

(completely untested).

Thanks,
Richard


Re: [C++ Path] PR 38958

2013-06-12 Thread Paolo Carlini

Hi,

On 06/12/2013 07:46 PM, Jason Merrill wrote:

On 06/12/2013 01:37 PM, Paolo Carlini wrote:
Rewording: as the one line comment says, I only call non_reference at 
the outset when I know that either we'll end up producing the normal 
unused-variable warning or nothing at all.
Oh, I see.  But that's a rather subtle difference; better to have 
'type' mean something consistent and leave the unused-but-set code 
checking TREE_TYPE (decl).
Yeah. Earlier today I had something similar in my tree. I'm finishing 
testing the below then.


Thanks,
Paolo.
Index: cp/decl.c
===
--- cp/decl.c   (revision 200012)
+++ cp/decl.c   (working copy)
@@ -622,17 +622,20 @@ poplevel (int keep, int reverse, int functionbody)
   push_local_binding where the list of decls returned by
   getdecls is built.  */
decl = TREE_CODE (d) == TREE_LIST ? TREE_VALUE (d) : d;
+   // See through references for improved -Wunused-variable (PR 38958).
+   tree type = non_reference (TREE_TYPE (decl));
if (VAR_P (decl)
&& (! TREE_USED (decl) || !DECL_READ_P (decl))
&& ! DECL_IN_SYSTEM_HEADER (decl)
&& DECL_NAME (decl) && ! DECL_ARTIFICIAL (decl)
-   && TREE_TYPE (decl) != error_mark_node
-   && (!CLASS_TYPE_P (TREE_TYPE (decl))
-   || !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl))))
+   && type != error_mark_node
+   && (!CLASS_TYPE_P (type)
+   || !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (type)))
  {
if (! TREE_USED (decl))
  warning (OPT_Wunused_variable, "unused variable %q+D", decl);
else if (DECL_CONTEXT (decl) == current_function_decl
+// For -Wunused-but-set-variable leave references alone.
 && TREE_CODE (TREE_TYPE (decl)) != REFERENCE_TYPE
 && errorcount == unused_but_set_errorcount)
  {
Index: testsuite/g++.dg/warn/Wunused-var-20.C
===
--- testsuite/g++.dg/warn/Wunused-var-20.C  (revision 0)
+++ testsuite/g++.dg/warn/Wunused-var-20.C  (working copy)
@@ -0,0 +1,19 @@
+// PR c++/38958
+// { dg-options "-Wunused" }
+
+volatile int g;
+
+struct Lock
+{
+  ~Lock() { g = 0; }
+};
+
+Lock AcquireLock() { return Lock(); }
+
+int main()
+{
+  const Lock& lock = AcquireLock();
+  g = 1;
+  g = 2;
+  g = 3;
+}
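
The testcase depends on lifetime extension: binding the temporary returned by `AcquireLock()` to a `const Lock&` keeps the object alive until the end of the enclosing scope, so the destructor does useful work even though the reference is never named again. A minimal sketch of that behaviour follows; the `trace` log and the `run` driver are hypothetical additions for illustration and are not part of the patch.

```cpp
#include <string>

std::string trace;  // hypothetical event log, not part of the testcase

struct Lock {
  ~Lock() { trace += "unlock;"; }
};

Lock AcquireLock() {
  trace += "lock;";
  return Lock();
}

// Returns the sequence of events recorded while the reference is in scope.
std::string run() {
  trace.clear();
  {
    const Lock& lock = AcquireLock();  // the temporary lives as long as 'lock'
    trace += "work;";
  }  // the destructor of the bound temporary runs here, after the work
  return trace;
}
```

The point of the sketch is only the ordering: the work is recorded before the final unlock, which is why warning about `lock` as an unused variable would be misleading.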


[google/gcc-4_8] Port patches to fix ICEs when using -fdebug-types-section.

2013-06-12 Thread Cary Coutant
I've ported the following four patches from the google/gcc-4_7 branch.
(I'll also push these to trunk shortly.)

Bootstrapped, tested, and committed at r200036.

-cary


gcc:

2012-08-09   Cary Coutant  

Backport of pending upstream patch.

* dwarf2out.c (clone_as_declaration): Copy DW_AT_abstract_origin
attribute.
(generate_skeleton_bottom_up): Remove DW_AT_object_pointer attribute
from original DIE.
(clone_tree_hash): Rename to ...
(clone_tree_partial): ... this; change callers.  Copy
DW_TAG_subprogram DIEs as declarations.


2012-08-22   Cary Coutant  

* dwarf2out.c (should_move_die_to_comdat): A type definition
can contain a subprogram definition, but don't move it to a
comdat unit.


2012-08-29   Cary Coutant  

* dwarf2out.c (clone_tree_partial): Remove.
(copy_decls_walk): Don't copy children of a declaration
into a type unit.


2012-08-31   Cary Coutant  

* dwarf2out.c (clone_tree_partial): Restore.
(copy_decls_walk): Call clone_tree_partial to copy children
of non-declaration DIEs.

gcc/testsuite:

2012-08-09   Cary Coutant  

Backport of pending upstream patch.

* g++.dg/debug/dwarf2/dwarf4-nested.C: New test case.
* g++.dg/debug/dwarf2/dwarf4-typedef.C: Add
-fdebug-types-section flag.


Re: [patch, mips] Micromips delay slot fix

2013-06-12 Thread Richard Sandiford
Richard Sandiford  writes:
> "Moore, Catherine"  writes:
>> I'm testing a slightly different patch from the one that Steve posted:
>>
>> Index: mips.md
>> ===
>> --- mips.md (revision 199648)
>> +++ mips.md (working copy)
>> @@ -703,8 +703,13 @@
>>
>>  ;; Is it a single instruction?
>>  (define_attr "single_insn" "no,yes"
>> -  (symbol_ref "(get_attr_length (insn) == (TARGET_MIPS16 ? 2 : 4)
>> -   ? SINGLE_INSN_YES : SINGLE_INSN_NO)"))
>> +  (if_then_else (ior (and (match_test "TARGET_MIPS16")
>> + (match_test "get_attr_length (insn) == 2"))
>> +(and (eq_attr "compression" "micromips,all")
>> + (match_test "TARGET_MICROMIPS"))
>> +(match_test "get_attr_length (insn) == 4"))
>> +   (const_string "yes")
>> +   (const_string "no")))
>
> 4 isn't OK for MIPS16 though.  There's also the problem that Maciej
> pointed out: a length of 4 doesn't imply a single insn on microMIPS.
> E.g. an unsplit doubleword move to or from the accumulator registers
> is a pair of 2-byte microMIPS instructions, so although its overall
> length is 4, it isn't a single insn.  The original code has the same
> problem.  In practice, the split should have happened by dbr_schedule
> time, but it seems bad practice to rely on that.
>
> (FWIW, the MIPS16 definition comes from the historical attitude that
> extended instructions count as 2 instructions.  The *_insns functions
> also follow this counting.)
>
> I'm going to try redefining the length attribute after:
>
> ;; "Ghost" instructions occupy no space.
> (eq_attr "type" "ghost")
> (const_int 0)
>
> in terms of an "insn_count" attribute.  This will conservatively
> count 4 for each microMIPS instruction in an unsplit multi-instruction
> sequence, just as we do now.  Any attempt to change that should be
> a separate patch anyway.
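
To make the counting problem concrete, here is a toy C++ model contrasting a length-based single-instruction test with a count-based one. It is not GCC code: `ToyInsn`, `old_single_insn`, `new_single_insn` and `micromips_unsplit_move` are invented names, and the real definitions are insn attributes in mips.md.

```cpp
// Toy model only: GCC expresses these as insn attributes in mips.md.
struct ToyInsn {
  int insn_count;    // number of machine instructions the pattern emits
  int length_bytes;  // total encoded size in bytes
};

// Length-based test in the style of the old definition:
// 2 bytes counts as "single" for MIPS16, 4 bytes otherwise.
bool old_single_insn(const ToyInsn& insn, bool mips16) {
  return insn.length_bytes == (mips16 ? 2 : 4);
}

// Count-based test: independent of how large each instruction is.
bool new_single_insn(const ToyInsn& insn) {
  return insn.insn_count == 1;
}

// An unsplit doubleword accumulator move on microMIPS, modelled as
// two 2-byte instructions with an overall length of 4 bytes.
ToyInsn micromips_unsplit_move() {
  return ToyInsn{2, 4};
}
```

In this model the length-based test accepts the unsplit move as a single instruction, while the count-based test rejects it.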

Here's what I checked in after testing on mips64-linux-gnu and
mipsisa32-sde-elf.

Thanks,
Richard


gcc/
* config/mips/mips.md (extended_mips16): Include GOT and constant-pool
loads.
(insn_count): New attribute, with most cases extracted from...
(length): ...here.  Redefine most cases in terms of insn_count.
(single_insn): Delete.
(can_delay): Use insn_count to check for single instructions.
(*mul3_r4300, mul3_r4000, *mul_acc_si, *mul_acc_si_r3900)
(*msac_using_macc, *mul_sub_si, mulsidi3_32bit_r4000)
(mulsidi3_64bit_r4000, muldi3_highpart_internal)
(mulsi3_highpart_split, muldi3_highpart_internal)
(mulditi3_r4000, *div3, *recip3, divmod4)
(udivmod4, sqrt2, *rsqrta, *rsqrtb)
(fix_truncdfsi2_macro, fix_truncsfsi2_macro, *lea_high64)
(*lea64, cprestore_, clear_hazard_, )
(casesi_internal_mips16_, *tls_get_tp__split)
(tls_get_tp_mips16, *tls_get_tp_mips16_call_): Use "insn_count"
rather than "length".
(tls_get_tp_): Likewise.  Remove redundant "no_delay" attribute.
* config/mips/mips-ps-3d.md (mips_c_cond_4s, mips_cabs_cond_4s):
Use "insn_count" rather than "length".
* config/mips/mips-dsp.md
(mips_lx_ext_)
(mips_lx_, *mips_lwx__ext): Remove
length attributes.

gcc/testsuite/
* gcc.target/mips/umips-branch-1.c, gcc.target/mips/umips-branch-2.c:
New tests.

Index: gcc/config/mips/mips.md
===
--- gcc/config/mips/mips.md 2013-06-12 19:39:40.596358495 +0100
+++ gcc/config/mips/mips.md 2013-06-12 19:53:23.139737233 +0100
@@ -407,8 +407,12 @@ (define_attr "cnv_mode" "unknown,I2S,I2D
 
 ;; Is this an extended instruction in mips16 mode?
 (define_attr "extended_mips16" "no,yes"
-  (if_then_else (ior (eq_attr "move_type" "sll0")
-(eq_attr "jal" "direct"))
+  (if_then_else (ior ;; In general, constant-pool loads are extended
+;; instructions.  We don't yet optimize for 16-bit
+;; PC-relative references.
+(eq_attr "move_type" "sll0,loadpool")
+(eq_attr "jal" "direct")
+(eq_attr "got" "load"))
(const_string "yes")
(const_string "no")))
 
@@ -421,14 +425,89 @@ (define_attr "enabled" "no,yes"
  (match_test "TARGET_MICROMIPS")))
(const_string "yes")
(const_string "no")))
-  
-;; Length of instruction in bytes.
-(define_attr "length" ""
-   (cond [(and (eq_attr "extended_mips16" "yes")
-  (match_test "TARGET_MIPS16"))
- (const_int 4)
 
- (and (eq_attr "compression" "micromips,all")
+;; The number of individual instructions that a non-branch pattern generates,
+;; using units of BASE_INSN_LENGTH.
+(define_attr "insn_count" ""
+  (cond [;; "Ghost" instructions

Re: Aw: Re: [PATCH] Basic support for MIPS r5900

2013-06-12 Thread Richard Sandiford
"Jürgen Urban"  writes:
>> > How much other changes will be currently accepted here? There is other
>> > stuff which I want to prepare and submit here, e.g.:
>> > 1. disable use of dmult and ddiv (ABI n32).
>> > 2. use trunc.w.s instead of cvt.w.s (to get single float working for
>> > normal range calculations; i.e. calculating without inf or nan).
>> > 3. fix use of ll/sc in libgomp, either increase mips ISA level or use
>> > syscall (which is broken in Linux 2.6.35.4).
>> > 4. fix libgcc to build a real muldi3 function for ABI n32 (not the
>> > multi3 function which is stored in muldi3.o file).
>> > 5. add support for configure parameters --float=single and
>> > --float=double in addition to --float=soft and --float=hard.
>> > 6. rework floating point to support single float with ABI n32 (either
>> > break the ABI or store floating point values in general purpose
>> > registers like soft float).
>> > 7. change libgcc or mips.md in way so that the non IEEE 754 compatible
>> > FPU of the r5900 gets compatible.
>>
>> Well, I'm afraid that's hard to say in advance.  It really depends
>> on what the changes look like.  (1) and (2) sound harmless enough,
>> although (1) should probably only be done in conjunction with (4).
>> I'm not sure what (3) involves.  (5) sounds like a good idea.
>> (6) is worth doing, but anything ABI-related gets extra-paranoid
>> treatment. :-)
>
> The attached patch fixes (1) and (4). This makes mips64r5900el usable
> with r5900. If (4) is a problem (i.e. patching libgcc/Makefile.in), it
> would be good if at least (1) is accepted.

I can't approve the Makefile.in bits.  I've cc'ed Ian, who's the libgcc
maintainer.  Ian: the problem is that "_muldi3.o" on 64-bit targets
is actually an implementation of __multi3.  Jürgen wants to have a
__muldi3 too, with the same implementation as on 32-bit targets.

I think (1) and (4) should go in together though.  (1) doesn't make much
sense without a libgcc function to back it up.
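
For context, a double-word multiply can be assembled from word-sized multiplies, which is roughly what a `__muldi3`-style routine has to do when `dmult` is not available. The following C++ sketch illustrates the idea under that assumption; it is not libgcc's implementation, and `mul64_from_words` is a made-up name.

```cpp
#include <cstdint>

// Multiply two 64-bit values using only 32-bit halves as inputs to each
// multiplication. The result is the product modulo 2^64.
std::uint64_t mul64_from_words(std::uint64_t a, std::uint64_t b) {
  const std::uint32_t a_lo = static_cast<std::uint32_t>(a);
  const std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
  const std::uint32_t b_lo = static_cast<std::uint32_t>(b);
  const std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);

  // low x low needs the full 64-bit result of a 32x32 multiply.
  const std::uint64_t low = static_cast<std::uint64_t>(a_lo) * b_lo;

  // The cross terms only contribute to the upper word, so 32 bits suffice.
  const std::uint32_t cross = static_cast<std::uint32_t>(
      static_cast<std::uint64_t>(a_hi) * b_lo +
      static_cast<std::uint64_t>(a_lo) * b_hi);

  // high x high would land entirely above bit 63 and is dropped.
  return low + (static_cast<std::uint64_t>(cross) << 32);
}
```

The result should agree with the native 64-bit `*` operator for all inputs, including those that wrap.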

> The patch for mips.md after line 1992 (adds TARGET_64BIT) is a more
> general fix. This is not needed for r5900 support, but I think this
> should be fixed.
> The same applies for patch after 2233 (adds ISA_HAS_DMULT). The fix here
> would be also adding TARGET_64BIT, but for r5900 we need ISA_HAS_DMULT
> here.

The current state is actually deliberate.  define_expand conditions are
only ever used in HAVE_* macros, so whatever we put there will not get
tested.  I think it's less confusing to have no test than an unused one,
just like we try not to have constraints in define_expands.

The other bits of the config/mips patch look good, thanks.  A couple of
formatting niggles:

> +/* ISA supports instructions dmult and dmultu. */
> +#define ISA_HAS_DMULT   (TARGET_64BIT \
> +  && !TARGET_MIPS5900)
> +
> +/* ISA supports instructions mult and multu.
> +   This always supported, but the macro is needed for ISA_HAS_MULT
> +   in mips.md.  */
> +#define ISA_HAS_MULT (1)
> +
> +/* ISA supports instructions ddiv and ddivu. */
> +#define ISA_HAS_DDIV    (TARGET_64BIT \
> +  && !TARGET_MIPS5900)

Please keep ISA_HAS_DMULT and ISA_HAS_DDIV on one line while they fit.
I prefer caps for insn names in the comments, but the code isn't yet
as consistent as it should be, sorry...

> +/* ISA supports instructions div and divu.
> +   This always supported, but the macro is needed for ISA_HAS_DIV
> +   in mips.md.  */
> +#define ISA_HAS_DIV  (1)
> +
> +

Excess blank line here.

Thanks,
Richard


Re: [C++ Path] PR 38958

2013-06-12 Thread Jason Merrill

OK.

Jason


[PATCH 0/4] Fix leading and trailing whitespaces.

2013-06-12 Thread Ondřej Bílka
Hi,

I am writing a tool to fix common style issues.

This is the first part, which deals with leading and trailing whitespace.
I can follow this up with other refactorings, for example rewriting
K&R definitions.

I wrote some simple programs that fix these issues.
Then it suffices for me or any volunteer to run them each month or so
and the issues will be gone for good.

These patches touch only the gcc directory. If you want, I could include
others, but it is not clear to me what I can touch.

I split the formatter into four simpler parts.
Even with this restriction these patches are big:

3487388 Jun 12 21:06 0001-Formatted-by-trailing_space.patch
 152911 Jun 12 21:06 0002-Formatted-by-form_feed.patch
 255018 Jun 12 21:06 0003-Formatted-by-space_before_tab.patch
6817050 Jun 12 21:06 0004-Formatted-by-leading_space.patch

You can verify that there are no bugs in my program and that 
git diff -w is empty.
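
As an illustration of what a trailing-whitespace pass does, here is a small C++ sketch. It is not the stylepp code from the tarball; `strip_trailing_space` is a hypothetical name, and the sketch only treats spaces and tabs as whitespace.

```cpp
#include <string>

// Remove spaces and tabs at the end of every line; line breaks are kept.
std::string strip_trailing_space(const std::string& text) {
  std::string out;
  std::string pending;  // whitespace seen but not yet known to be trailing
  for (char c : text) {
    if (c == ' ' || c == '\t') {
      pending += c;
    } else if (c == '\n') {
      pending.clear();  // it was trailing: drop it
      out += c;
    } else {
      out += pending;   // it was interior or leading: keep it
      pending.clear();
      out += c;
    }
  }
  return out;  // whitespace still pending at end of input is trailing too
}
```

Because only whitespace is removed, a whitespace-insensitive diff between input and output should stay empty.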

A generator is at:

http://kam.mff.cuni.cz/~ondra/stylepp.tar.bz2

These patches were generated by the sequence:

cd gcc
PATH_TO_STYLEPP/run_space

Thanks for considering.

Ondra


[PATCH, committed] Replaced abort and exit in 1 Cilk Plus Array Notation test

2013-06-12 Thread Iyer, Balaji V
Hello Everyone,
I replaced abort and exit in one of the Cilk Plus Array Notation tests
with return 1 and return 0, respectively. This patch (cut and pasted below) is
committed as obvious.

Thanks,

Balaji V. Iyer.

Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 200037)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2013-06-12  Balaji V. Iyer  
+
+   * c-c++-common/cilk-plus/AN/sec_implicit_ex.c (main): Replaced abort
+   and exit function calls with return 1 and return 0, respectively.
+
 2013-06-12  Richard Sandiford  

* gcc.target/mips/umips-branch-1.c, gcc.target/mips/umips-branch-2.c:
Index: gcc/testsuite/c-c++-common/cilk-plus/AN/sec_implicit_ex.c
===
--- gcc/testsuite/c-c++-common/cilk-plus/AN/sec_implicit_ex.c   (revision 200037)
+++ gcc/testsuite/c-c++-common/cilk-plus/AN/sec_implicit_ex.c   (working copy)
@@ -1,10 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-fcilkplus" } */

-void abort (void);
-void exit  (int);
-
-
 int main(void)
 {
   int jj, kk, array_3C[10][10][10];
@@ -24,10 +20,7 @@
 for (jj = 0; jj < 10; jj++)
   for (kk = 0; kk < 10; kk++)
if (array_3[ii][jj][kk] != array_3C[ii][jj][kk])
- abort ();
+ return 1;

-
-  exit (0);
-
   return 0;
 }


Thanks,

Balaji V. Iyer.


[PATCH 2/4][RFC] Remove form feeds.

2013-06-12 Thread Ondřej Bílka
The second part of this cleanup is optional.

Whether to preserve form feeds is your decision. If you want to
remove them, here is the patch.

http://kam.mff.cuni.cz/~ondra/0002-Formatted-by-form_feed.patch



[PATCH 3/4] Fix space followed by tab

2013-06-12 Thread Ondřej Bílka
Now we move to leading spaces.

If you only want to fix leading spaces followed by a tab, then please use
the following patch:

http://kam.mff.cuni.cz/~ondra/0003-Formatted-by-space_before_tab.patch




Re: [PATCH 4/4] Fix leading spaces.

2013-06-12 Thread Ondřej Bílka
A follow-up to the previous patch is a more general pass that changes leading
spaces to tabs followed by at most 8 spaces.

http://kam.mff.cuni.cz/~ondra/0004-Formatted-by-leading_space.patch
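
A rough C++ sketch of such a pass on a single line is shown here. It assumes tab stops every 8 columns and is not taken from the stylepp sources; `retab_leading` is a hypothetical name. Each full run of eight columns of indentation becomes one tab, so the remainder is always fewer than eight spaces.

```cpp
#include <string>

// Rewrite the leading whitespace of one line as tabs plus a short run of spaces.
std::string retab_leading(const std::string& line) {
  std::size_t i = 0;
  std::size_t column = 0;  // visual width of the indentation, tab stops every 8
  while (i < line.size() && (line[i] == ' ' || line[i] == '\t')) {
    // A tab advances to the next multiple of 8; a space advances by one.
    column = (line[i] == '\t') ? (column / 8 + 1) * 8 : column + 1;
    ++i;
  }
  return std::string(column / 8, '\t') + std::string(column % 8, ' ') +
         line.substr(i);
}
```

The visual indentation is preserved; only its encoding changes.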



Re: Unordered container insertion hints

2013-06-12 Thread François Dumont

Hi

Any news regarding this patch ?

Thanks

François


On 06/06/2013 10:33 PM, François Dumont wrote:

On 05/24/2013 01:00 AM, Paolo Carlini wrote:

On 05/23/2013 10:01 PM, François Dumont wrote:

Some feedback regarding this patch ?
Two quick ones: what if the hint is wrong? I suppose the insertion 
succeeds anyway, it's only a little waste of time, right?


Right.

Is it possible that for instance something throws in that case and 
would not now (when the hint is simply ignored)? In case, check and 
re-check we are still conforming.
I consider the hint only if it is equivalent to the inserted element 
so I invoke the equal_to functor for that. The invocation of the 
equal_to functor is already done if no hint is granted at the same 
location. So usage of the hint has no impact on exception safety.


In any case, I think it's quite easy to notice if an implementation 
is using the hint in this way or a similar one basing on some simple 
benchmarks, without looking of course at the actual implementation 
code. Do we have any idea what other implementations are doing? Like, 
eg, they invented something for unordered_set and map too? Or a 
better way to exploit the hint for the multi variants?


I only benchmarked the llvm/clang implementation and noticed no difference
with or without a hint; I guess it is simply ignored. I don't plan to
check or benchmark other implementations. The usage of the hint I am
introducing is quite natural considering the new unordered containers
data model. And if anyone has a better idea to deal with it then they are
welcome to contribute!


Eventually I suppose we want to add a performance testcase to our 
testsuite.
Good request, and the reason why it took me so long to answer. Writing
such a benchmark has shown me that users should be very careful with hints
because they can do more harm than good.


unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint           120r  120u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint      130r  130u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint      54r   54u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint   36r   36u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint            40r   40u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint       38r   38u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint       49r   50u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with perfect hint   34r   35u  0s  6416mem  0pf


The small number represents how many times the same element is
inserted and the big one the number of different elements. 100 X 2
means that we loop 100 times, inserting the 2 elements during each
loop. 2 X 100 means that the main loop is on the elements and we
insert each one 100 times. Being able to insert all the equivalent
elements at the same time or not has a major impact on the
performance needed to get the same result. This is because when a new
element is inserted it will be first in its bucket and the following
99 insertions will benefit from it even without any hint.
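
The second loop layout can be sketched in C++ as follows, with each insertion passing the iterator returned by the previous insertion of the same key as its hint. This is only an illustration of the standard `insert(hint, value)` interface; `fill_with_hints` is a hypothetical name, and whether the hint speeds anything up depends on the implementation.

```cpp
#include <unordered_set>

// Insert 'copies' equivalent elements for each of 'distinct' keys, passing the
// iterator returned by the previous insertion of the same key as the hint.
std::unordered_multiset<int> fill_with_hints(int distinct, int copies) {
  std::unordered_multiset<int> ms;
  for (int key = 0; key < distinct; ++key) {
    auto hint = ms.end();           // no useful hint for the first copy
    for (int n = 0; n < copies; ++n)
      hint = ms.insert(hint, key);  // from now on the hint is equivalent to 'key'
  }
  return ms;
}
```

The container contents are the same with or without the hint; only the insertion cost can differ.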


The benchmark also shows that a bad hint can be worse than no hint. A
bad hint is one that, once used, requires checking that the next bucket is
not impacted by the insertion. Doing so requires a hash code
computation (if it is not cached, as in my use case) and a check. I
have added a note about checking performance before using
hints. Here is the result using the default std::hash,
with the hash code being cached.


unordered_multiset_hint.cc  unordered_set 100 X 2 insertions w/o hint            76r   76u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with any hint       83r   83u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with good hint      29r   29u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 100 X 2 insertions with perfect hint   24r   23u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions w/o hint            27r   26u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with any hint       24r   24u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with bad hint       27r   27u  0s  6416mem  0pf
unordered_multiset_hint.cc  unordered_set 2 X 100 insertions with perfect hint   23r   23u  0s  6416mem  0pf


Almost no impact in this case when using a bad hint. I am considering adding
another condition to the use of the hint, which is to have the element
after the hint also equivale

Re: More forwprop for vectors

2013-06-12 Thread Marc Glisse

On Wed, 12 Jun 2013, Marc Glisse wrote:


I suppose it's explicitly not allowing complex integer constants?


Hmm... Thanks, I keep forgetting complex :-(


And complex is even more of a pain than vector to handle.

Testing for CONSTANT_CLASS_P seems sufficient here. Some transformations also 
seem valid for complex, and the others are already restricted by the fact 
that they involve operators like AND or XOR, or because we exit early for 
FLOAT_TYPE_P and FIXED_POINT_TYPE_P. I'll test that (no new macro for now 
then).


Here is a new version. I added a build_all_ones_cst helper which currently 
only handles integer-like (complex, vector) types.
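
The bitwise patterns being generalized rest on identities that hold lane by lane, which is why they carry over from scalar integers to integer vectors. The C++ sketch below checks two of them on a toy 4-lane type. `Vec4`, `all_ones` and the helper names are invented for this illustration and do not correspond to GCC's tree API.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;  // stand-in for a 4-lane integer vector

// Toy analogue of an all-ones constant for this vector type.
Vec4 all_ones() { Vec4 v; v.fill(~std::uint32_t{0}); return v; }

Vec4 vand(const Vec4& a, const Vec4& b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = a[i] & b[i];
  return r;
}

Vec4 vor(const Vec4& a, const Vec4& b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = a[i] | b[i];
  return r;
}

Vec4 vxor(const Vec4& a, const Vec4& b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = a[i] ^ b[i];
  return r;
}

Vec4 vnot(const Vec4& a) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = ~a[i];
  return r;
}

// X ^ ~0 == ~X
bool xor_all_ones_is_not(const Vec4& x) {
  return vxor(x, all_ones()) == vnot(x);
}

// (a | CST1) & CST2 == (a & CST2) | (CST1 & CST2)
bool and_distributes_over_or(const Vec4& a, const Vec4& c1, const Vec4& c2) {
  return vand(vor(a, c1), c2) == vor(vand(a, c2), vand(c1, c2));
}
```

Both helpers should return true for any inputs, since the identities hold for every bit independently.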


Bootstrap+testsuite on x86_64-unknown-linux-gnu as usual.

2013-06-13  Marc Glisse  

* tree-ssa-forwprop.c (simplify_bitwise_binary, associate_plusminus):
Generalize to complex and vector.
* tree.c (build_all_ones_cst): New function.
* tree.h (build_all_ones_cst): Declare it.

--
Marc Glisse

Index: tree-ssa-forwprop.c
===
--- tree-ssa-forwprop.c (revision 25)
+++ tree-ssa-forwprop.c (working copy)
@@ -1971,22 +1971,22 @@ simplify_bitwise_binary (gimple_stmt_ite
  gimple_assign_set_rhs2 (stmt, b);
  gimple_assign_set_rhs_code (stmt, def1_code);
  update_stmt (stmt);
  return true;
}
 }
 
   /* (a | CST1) & CST2  ->  (a & CST2) | (CST1 & CST2).  */
   if (code == BIT_AND_EXPR
   && def1_code == BIT_IOR_EXPR
-  && TREE_CODE (arg2) == INTEGER_CST
-  && TREE_CODE (def1_arg2) == INTEGER_CST)
+  && CONSTANT_CLASS_P (arg2)
+  && CONSTANT_CLASS_P (def1_arg2))
 {
   tree cst = fold_build2 (BIT_AND_EXPR, TREE_TYPE (arg2),
  arg2, def1_arg2);
   tree tem;
   gimple newop;
   if (integer_zerop (cst))
{
  gimple_assign_set_rhs1 (stmt, def1_arg1);
  update_stmt (stmt);
  return true;
@@ -2002,34 +2002,33 @@ simplify_bitwise_binary (gimple_stmt_ite
   gimple_assign_set_rhs_code (stmt, BIT_IOR_EXPR);
   update_stmt (stmt);
   return true;
 }
 
   /* Combine successive equal operations with constants.  */
   if ((code == BIT_AND_EXPR
|| code == BIT_IOR_EXPR
|| code == BIT_XOR_EXPR)
   && def1_code == code 
-  && TREE_CODE (arg2) == INTEGER_CST
-  && TREE_CODE (def1_arg2) == INTEGER_CST)
+  && CONSTANT_CLASS_P (arg2)
+  && CONSTANT_CLASS_P (def1_arg2))
 {
   tree cst = fold_build2 (code, TREE_TYPE (arg2),
  arg2, def1_arg2);
   gimple_assign_set_rhs1 (stmt, def1_arg1);
   gimple_assign_set_rhs2 (stmt, cst);
   update_stmt (stmt);
   return true;
 }
 
   /* Canonicalize X ^ ~0 to ~X.  */
   if (code == BIT_XOR_EXPR
-  && TREE_CODE (arg2) == INTEGER_CST
   && integer_all_onesp (arg2))
 {
   gimple_assign_set_rhs_with_ops (gsi, BIT_NOT_EXPR, arg1, NULL_TREE);
   gcc_assert (gsi_stmt (*gsi) == stmt);
   update_stmt (stmt);
   return true;
 }
 
   /* Try simple folding for X op !X, and X op X.  */
   res = simplify_bitwise_binary_1 (code, TREE_TYPE (arg1), arg1, arg2);
@@ -2472,73 +2471,74 @@ associate_plusminus (gimple_stmt_iterato
   && code != def_code)
{
  /* (A +- B) -+ B -> A.  */
  code = TREE_CODE (def_rhs1);
  rhs1 = def_rhs1;
  rhs2 = NULL_TREE;
  gimple_assign_set_rhs_with_ops (gsi, code, rhs1, NULL_TREE);
  gcc_assert (gsi_stmt (*gsi) == stmt);
  gimple_set_modified (stmt, true);
}
- else if (TREE_CODE (rhs2) == INTEGER_CST
-  && TREE_CODE (def_rhs1) == INTEGER_CST)
+ else if (CONSTANT_CLASS_P (rhs2)
+  && CONSTANT_CLASS_P (def_rhs1))
{
  /* (CST +- A) +- CST -> CST +- A.  */
  tree cst = fold_binary (code, TREE_TYPE (rhs1),
  def_rhs1, rhs2);
  if (cst && !TREE_OVERFLOW (cst))
{
  code = def_code;
  gimple_assign_set_rhs_code (stmt, code);
  rhs1 = cst;
  gimple_assign_set_rhs1 (stmt, rhs1);
  rhs2 = def_rhs2;
  gimple_assign_set_rhs2 (stmt, rhs2);
  gimple_set_modified (stmt, true);
}
}
- else if (TREE_CODE (rhs2) == INTEGER_CST
-  && TREE_CODE (def_rhs2) == INTEGER_CST
+ else if (CONSTANT_CLASS_P (rhs2)
+  && CONSTANT_CLASS_P (def_rhs2)
   && def_code == PLUS_EXPR)
{
  /* (A + CST) +- CST -> A + CST.  */
  tree cst = fold_binary (code, TREE_TYP

RE: [Bug libstdc++/56430] In __airy: return-statement with a value, in function returning 'void'.

2013-06-12 Thread 3dw4rd
Here is an overdue patch for the Airy function.
I repaired the void function and put two Airy functions in as C++ extensions.

Built and tested on x86_64-linux.

OK?

Ed



CL_Airy
Description: Binary data


patch_Airy4
Description: Binary data


Re: [PATCH 0/4] Fix leading and trailing whitespaces.

2013-06-12 Thread Marc Glisse

On Wed, 12 Jun 2013, Ondřej Bílka wrote:


I am writing a tool to fix common style issues.

This is first part which deals with leading and trailing whitespaces.
I can follow this up with other refactorings, for example rewriting
K&R definitions.


Bonus points for asking first ;-)


2013-06-12   Ondřej Bílka  

* gcc/alias.c: Formatted by trailing_space.
* gcc/asan.c: Likewise.


No gcc/ before the file names.


* gcc/cp/call.c: Likewise.


There is a separate ChangeLog for this directory (same for testsuite, etc).

On Wed, 12 Jun 2013, Ondřej Bílka wrote:


A followup to previous patch is more general pass that changes leading
spaces to tabs followed by at most 8 spaces.


s/at most/less than/

--
Marc Glisse


[Patch, fortran] PR 49074 ICE on defined assignment with class arrays.

2013-06-12 Thread Mikael Morin
Hello,

this is a fix for PR49074, where the temporary created by
gfc_conv_elemental_dependencies was leading to an ICE because it didn't
have the array reference expected by the scalarization code.

There was a bypass in gfc_conv_procedure_call avoiding exactly this
problem, but it is not reached when polymorphic entities are involved.
To avoid duplicating that, the patch proposed here adds support for null
references in gfc_conv_variable and removes the gfc_conv_procedure_call
bypass.  The patch also removes a useless reference walk in
gfc_conv_variable.

The test is the PR's; it's a runtime test as this area of the compiler
doesn't get much coverage from the test-suite.

Regression tested on x86_64-unknown-linux-gnu. OK for trunk?

Mikael


2013-06-12  Mikael Morin  

PR fortran/49074
* trans-expr.c (gfc_conv_variable): Don't walk the reference chain.
Handle NULL references.
(gfc_conv_procedure_call): Remove code handling NULL references.


diff --git a/trans-expr.c b/trans-expr.c
index 9d07345..bd8886c 100644
--- a/trans-expr.c
+++ b/trans-expr.c
@@ -1761,9 +1761,12 @@ gfc_conv_variable (gfc_se * se, gfc_expr * expr)
   /* A scalarized term.  We already know the descriptor.  */
   se->expr = ss_info->data.array.descriptor;
   se->string_length = ss_info->string_length;
-  for (ref = ss_info->data.array.ref; ref; ref = ref->next)
-   if (ref->type == REF_ARRAY && ref->u.ar.type != AR_ELEMENT)
- break;
+  ref = ss_info->data.array.ref;
+  if (ref)
+   gcc_assert (ref->type == REF_ARRAY
+   && ref->u.ar.type != AR_ELEMENT);
+  else
+   gfc_conv_tmp_array_ref (se);
 }
   else
 {
@@ -4041,23 +4044,11 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  gfc_init_se (&parmse, se);
  parm_kind = ELEMENTAL;
 
- if (ss->dimen > 0 && e->expr_type == EXPR_VARIABLE
- && ss->info->data.array.ref == NULL)
-   {
- gfc_conv_tmp_array_ref (&parmse);
- if (e->ts.type == BT_CHARACTER)
-   gfc_conv_string_parameter (&parmse);
- else
-   parmse.expr = gfc_build_addr_expr (NULL_TREE, parmse.expr);
-   }
- else
-   {
- gfc_conv_expr_reference (&parmse, e);
- if (e->ts.type == BT_CHARACTER && !e->rank
- && e->expr_type == EXPR_FUNCTION)
-   parmse.expr = build_fold_indirect_ref_loc (input_location,
-  parmse.expr);
-   }
+ gfc_conv_expr_reference (&parmse, e);
+ if (e->ts.type == BT_CHARACTER && !e->rank
+ && e->expr_type == EXPR_FUNCTION)
+   parmse.expr = build_fold_indirect_ref_loc (input_location,
+  parmse.expr);
 
  if (fsym && fsym->ts.type == BT_DERIVED
  && gfc_is_class_container_ref (e))
2013-06-12  Mikael Morin  

PR fortran/49074
* gfortran.dg/typebound_assignment_5.f03: New.
! { dg-do run }
!
! PR fortran/49074
! ICE on defined assignment with class arrays.

module foo
  type bar
    integer :: i

  contains

    generic :: assignment (=) => assgn_bar
    procedure, private :: assgn_bar
  end type bar

contains

  elemental subroutine assgn_bar (a, b)
    class (bar), intent (inout) :: a
    class (bar), intent (in) :: b

    select type (b)
    type is (bar)
      a%i = b%i
    end select

    return
  end subroutine assgn_bar
end module foo

program main
  use foo

  type (bar), allocatable :: foobar(:)

  allocate (foobar(2))
  foobar = [bar(1), bar(2)]
  if (any(foobar%i /= [1, 2])) call abort
end program



Re: [Patch, fortran] PR 49074 ICE on defined assignment with class arrays.

2013-06-12 Thread Tobias Burnus

Hello Mikael,

Mikael Morin wrote:

> Regression tested on x86_64-unknown-linux-gnu. OK for trunk?


OK - looks good to me. The test case is also nice and a bit tricky: I 
tried it with three compilers; two segfaulted at run time and only one 
passed the test.


Tobias


Re: More forwprop for vectors

2013-06-12 Thread Jeff Law

On 06/12/13 14:17, Marc Glisse wrote:

On Wed, 12 Jun 2013, Marc Glisse wrote:


I suppose it's explicitly not allowing complex integer constants?


Hmm... Thanks, I keep forgetting complex :-(


And complex is even more of a pain than vector to handle.


Testing for CONSTANT_CLASS_P seems sufficient here. Some
transformations also seem valid for complex, and the others are
already restricted by the fact that they involve operators like AND or
XOR, or because we exit early for FLOAT_TYPE_P and FIXED_POINT_TYPE_P.
I'll test that (no new macro for now then).


Here is a new version. I added a build_all_ones_cst helper which
currently only handles integer-like (complex, vector) types.

Bootstrap+testsuite on x86_64-unknown-linux-gnu as usual.

2013-06-13  Marc Glisse  

 * tree-ssa-forwprop.c (simplify_bitwise_binary, associate_plusminus):
 Generalize to complex and vector.
 * tree.c (build_all_ones_cst): New function.
 * tree.h (build_all_ones_cst): Declare it.

This is OK.

Extra credit if you create some testcases.


Thanks,
Jeff



Re: RFC [MIPS, RS6000] Mangling of IBM long double template literals

2013-06-12 Thread Richard Sandiford
"Joseph S. Myers"  writes:
> On Thu, 13 Jun 2013, Alan Modra wrote:
>> This is of course an ABI change for any existing little-endian users
>> of IBM long double literals in templates.  On powerpc, I think we can
>> safely say there are no such users.  However it does look like MIPS
>> also uses a variant of IBM long double, and I'm less certain there.
>
> That variant was used for IRIX, for which support was removed over a year 
> ago.  It would be good to remove the code that I noted in 
>  was thereby made 
> obsolete.

OK, I'll try to do that sometime.  And as far as this change goes,
IRIX was big-endian only anyway, so it wouldn't have been affected.
Thanks Alan for checking though.

Richard


Re: More forwprop for vectors

2013-06-12 Thread Marc Glisse

On Wed, 12 Jun 2013, Jeff Law wrote:


On 06/12/13 14:17, Marc Glisse wrote:

On Wed, 12 Jun 2013, Marc Glisse wrote:


I suppose it's explicitly not allowing complex integer constants?


Hmm... Thanks, I keep forgetting complex :-(


And complex is even more of a pain than vector to handle.


Testing for CONSTANT_CLASS_P seems sufficient here. Some
transformations also seem valid for complex, and the others are
already restricted by the fact that they involve operators like AND or
XOR, or because we exit early for FLOAT_TYPE_P and FIXED_POINT_TYPE_P.
I'll test that (no new macro for now then).


Here is a new version. I added a build_all_ones_cst helper which
currently only handles integer-like (complex, vector) types.

Bootstrap+testsuite on x86_64-unknown-linux-gnu as usual.

2013-06-13  Marc Glisse  

 * tree-ssa-forwprop.c (simplify_bitwise_binary, associate_plusminus):
 Generalize to complex and vector.
 * tree.c (build_all_ones_cst): New function.
 * tree.h (build_all_ones_cst): Declare it.

This is OK.


Thanks.


Extra credit if you create some testcases.


I'll try to add one, but the most interesting one would involve a 
BIT_NOT_EXPR of a complex of integers, and I don't have any idea how to 
create that (~ means CONJ_EXPR as a gcc extension), or if it is even 
supposed to be legal.
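
A minimal sketch of the GNU C behavior referred to above, assuming GCC's
complex-integer extension: unary ~ applied to a complex operand yields its
conjugate, so it cannot be used to write a bitwise NOT of a complex integer.
The helper names conj_via_tilde and check_conj are made up for illustration.

```c
#include <assert.h>

/* GNU C extension: _Complex int is a complex type with integer parts,
   and unary ~ on a complex operand means conjugation, not bitwise NOT.  */
static _Complex int
conj_via_tilde (_Complex int z)
{
  return ~z;
}

/* Hypothetical self-check: conjugation keeps the real part and negates
   the imaginary part.  */
static void
check_conj (void)
{
  _Complex int z = 0;
  __real__ z = 3;
  __imag__ z = 4;

  _Complex int c = conj_via_tilde (z);
  assert (__real__ c == 3);
  assert (__imag__ c == -4);
}
```

Calling check_conj () from a test driver exercises the conjugation; a true
bitwise NOT would instead have produced -4 and -5 for the two parts.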


--
Marc Glisse


Re: [Bug libstdc++/56430] In __airy: return-statement with a value, in function returning 'void'.

2013-06-12 Thread Paolo Carlini

Hi,

On 06/12/2013 10:28 PM, 3dw...@verizon.net wrote:

Here is an overdue patch for the Airy function.
I repaired the void function and exposed two Airy functions as C++ extensions.

Built and tested on x86_64-linux.

OK?

The functions are unused, please remove them and close the PR.

Thanks,
Paolo.


[committed] Fix mips/memcpy-1.c after DATA_ALIGNMENT change

2013-06-12 Thread Richard Sandiford
Make the array non-common so that it is treated as binding to the
current TU.

Tested on mipsisa32-sde-elf and applied.

Richard


gcc/testsuite/
* gcc.target/mips/mips.exp: Handle -f{no-,}common.
* gcc.target/mips/memcpy-1.c: Remove redundant dg-do.
Run with -fno-common.

Index: gcc/testsuite/gcc.target/mips/mips.exp
===================================================================
--- gcc/testsuite/gcc.target/mips/mips.exp  2013-03-20 21:01:23.362615041 +
+++ gcc/testsuite/gcc.target/mips/mips.exp  2013-06-12 22:26:44.728021026 +0100
@@ -286,6 +286,7 @@ foreach option {
 
 # Add -ffoo/-fno-foo options to mips_option_groups.
 foreach option {
+common
 delayed-branch
 expensive-optimizations
 fast-math
Index: gcc/testsuite/gcc.target/mips/memcpy-1.c
===================================================================
--- gcc/testsuite/gcc.target/mips/memcpy-1.c  2012-08-27 17:27:13.0 +0100
+++ gcc/testsuite/gcc.target/mips/memcpy-1.c  2013-06-12 22:26:09.612659202 +0100
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-options "-fno-common" } */
 /* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
 /* { dg-final { scan-assembler-not "\tlbu\t" } } */
 


Re: [Debug/Fortran] PR37132 - RFC/RFA - support DW_TAG_namelist

2013-06-12 Thread Mikael Morin
Hello,

Le 10/06/2013 22:40, Tobias Burnus a écrit :
> My problem: I do not see where one can best handle the namelist for
> modules. One possibility would be gen_namespace_die - but that would
> come before the dies of all VAR_DECLs used in the namelist have been
> created. And the code seems to assume that the decl is not emitted,
> hence, one cannot simply use force_decl_die in gen_namespace_die. - One
> possibility would be to add a lookup_decl_die() check (e.g. in
> dwarf2out_global_decl) and to use force_decl_die, but I don't know
> whether that's a good approach.
> 
> Suggestions?
> 
I'm not at all familiar with the code paths for debug info generation,
but wouldn't it work if, starting from the first patch, you moved the
call to
  gfc_traverse_ns (ns, generate_namelist_decl)
after this:
  gfc_traverse_ns (ns, gfc_emit_parameter_debug_info);
in the two places where the latter appears?

Is the new tree code the preferred way?  A namelist feels too
fortran-specific to me to deserve its own middle-end decl code.

Mikael


Re: [gomp4] Some progress on #pragma omp simd

2013-06-12 Thread Jakub Jelinek
On Wed, Jun 12, 2013 at 10:38:00AM -0700, Richard Henderson wrote:
> On 06/12/2013 10:30 AM, Jakub Jelinek wrote:
> > So the built-ins would take address of this decl, something else?
> 
> Perhaps address, perhaps just referenced uninitialized?

True, assuming no pass would actually want to change that SSA_NAME of the
magic decl just because it is undefined (coalesce with some other undefined
SSA_NAME or something similar).  I hope nothing does that, it would be
problematic for the uninitialized warning pass too I bet.

> But that simd_uid clause refers to the same decl as the builtins, so the
> builtins should keep the decl around, at least until they themselves are
> transformed.  At which point the decl is no longer needed, no?
> 
> Indeed, I am really hoping that the decl vanishes completely before rtl.

Sure, it certainly should go away at the end of vectorization (and, when we
know vectorization won't happen we just should assume safelen will be 1).

Jakub


Re: patch to fix PR57559 for s390

2013-06-12 Thread Richard Sandiford
Vladimir Makarov  writes:
> Index: lra.c
> ===================================================================
> --- lra.c (revision 199753)
> +++ lra.c (working copy)
> @@ -306,11 +306,11 @@ lra_emit_add (rtx x, rtx y, rtx z)
> || (disp != NULL_RTX && ! CONSTANT_P (disp))
> || (scale != NULL_RTX && ! CONSTANT_P (scale)))
>   {
> -   /* Its is not an address generation.  Probably we have no 3 op
> +   /* It is not an address generation.   Probably we have no 3 op
>add.  Last chance is to use 2-op add insn.  */
> lra_assert (x != y && x != z);
> -   emit_move_insn (x, z);
> -   insn = gen_add2_insn (x, y);
> +   emit_move_insn (x, y);
> +   insn = gen_add2_insn (x, z);
> emit_insn (insn);
>   }
>else

Could you add a comment to lra_emit_add saying why it has to be this
way round (move y, add z)?

Thanks,
Richard
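
For readers following the hunk above, here is a toy model (not GCC code) of
lowering x = y + z to a move followed by a two-operand add, in the order the
patch uses.  The register file and the name emit_add3_as_move_add2 are
hypothetical, and all three operands are modelled as registers, which the real
lra_emit_add does not require.  The sketch only shows why the destination must
not alias a source that is still needed; it does not attempt to answer the
question of why this operand order is required on s390.

```c
#include <assert.h>

/* Toy register file; indices stand in for pseudo registers.  */
enum { NREGS = 4 };

/* Emulate x = y + z with "move x, y" followed by "add x, z".  */
static void
emit_add3_as_move_add2 (long regs[NREGS], int x, int y, int z)
{
  /* Mirrors the lra_assert in the quoted hunk: if x were z, the move
     below would overwrite z before the add reads it.  */
  assert (x != y && x != z);

  regs[x] = regs[y];     /* corresponds to emit_move_insn (x, y) */
  regs[x] += regs[z];    /* corresponds to gen_add2_insn (x, z) */
}
```

With regs = {0, 5, 7, 0}, calling emit_add3_as_move_add2 (regs, 0, 1, 2)
leaves regs[0] holding the sum while regs[1] and regs[2] are untouched.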


[PATCH] PR57518, RA generated redundant code

2013-06-12 Thread Wei Mi
Hi,

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57518

PR57518 happens because IRA marks a reg as equivalent to a mem in
update_equiv_regs, lowers its mem_cost in scan_one_insn, and sets its
rclass to NO_REGS, without considering that the reg is used in a
paradoxical subreg, which prevents the reg from being replaced by the
mem in the LRA phase.

This patch checks, in update_equiv_regs, whether a reg is used in a
paradoxical subreg before the reg is marked as equivalent to a mem.
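
Since the patch itself is not shown in the archive, the following is only a
toy model of the idea, not the actual change.  It assumes the usual
definition of a paradoxical subreg (the outer mode is wider than the inner
mode); the types and the names paradoxical_use_p and may_mark_equiv_to_mem
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC code: one use of a pseudo register, possibly seen
   through a subreg.  Widths are in bits; equal widths mean no widening.  */
struct reg_use
{
  int regno;
  unsigned inner_bits;   /* width of the register's own mode */
  unsigned outer_bits;   /* width of the mode the use sees */
};

/* A subreg is paradoxical when the outer mode is wider than the inner.  */
static bool
paradoxical_use_p (const struct reg_use *u)
{
  return u->outer_bits > u->inner_bits;
}

/* Hypothetical filter: allow REGNO to be treated as equivalent to a
   memory location only if none of its uses is a paradoxical subreg.  */
static bool
may_mark_equiv_to_mem (int regno, const struct reg_use *uses, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (uses[i].regno == regno && paradoxical_use_p (&uses[i]))
      return false;
  return true;
}
```

In this model a register with even one widening use is kept out of the
equivalence set, so it never ends up with a memory-only cost it cannot
actually use.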

Bootstrapped and regression tested on x86_64-linux-gnu. Is it OK for
trunk and the gcc-4.8 branch?

Thanks,
Wei.


Attachments (binary data): changelog, patch


Re: [PATCH, rs6000] power8 patches, patch #7, quad/byte/half-word atomic instructions

2013-06-12 Thread David Edelsohn
On Tue, Jun 11, 2013 at 7:53 PM, Michael Meissner
 wrote:
> I needed to rework sync.md so that it would work correctly with no
> optimization (using SUBREGs at -O0 did not give us the even registers for
> holding PTImode values), so I created a PTImode temporary in load_lockedti and
> store_conditionalti, which is normally optimized out.
>
> [gcc]
> 2013-06-11  Michael Meissner  
> Pat Haugen 
> Peter Bergner 
>
> * config/rs6000/rs6000.c (emit_load_locked): Add support for
> power8 byte, half-word, and quad-word atomic instructions.
> (emit_store_conditional): Likewise.
> (rs6000_expand_atomic_compare_and_swap): Likewise.
> (rs6000_expand_atomic_op): Likewise.
>
> * config/rs6000/sync.md (larx): Add new modes for power8.
> (stcx): Likewise.
> (AINT): New mode iterator to include TImode as well as normal
> integer modes on power8.
> (fetchop_pred): Use int_reg_operand instead of gpc_reg_operand so
> that VSX registers are not considered.  Use AINT mode iterator
> instead of INT1 to allow inclusion of quad word atomic operations
> on power8.
> (load_locked): Likewise.
> (store_conditional): Likewise.
> (atomic_compare_and_swap): Likewise.
> (atomic_exchange): Likewise.
> (atomic_nand): Likewise.
> (atomic_fetch_): Likewise.
> (atomic_nand_fetch): Likewise.
> (mem_thread_fence): Use gen_loadsync_ instead of enumerating
> each type.
> (ATOMIC): On power8, add QImode, HImode modes.
> (load_locked_si): Variants of load_locked for QI/HI
> modes that promote to SImode.
> (load_lockedti): Convert TImode arguments to PTImode, so that we
> get a guaranteed even/odd register pair.
> (load_lockedpti): Likewise.
> (store_conditionalti): Likewise.
> (store_conditionalpti): Likewise.
>
> * config/rs6000/rs6000.md (QHI): New mode iterator for power8
> atomic load/store instructions.
> (HSI): Likewise.
>
> [gcc/testsuite]
> 2013-06-11  Michael Meissner  
> Pat Haugen 
> Peter Bergner 
>
> * gcc.target/powerpc/atomic-p7.c: New file, add tests for atomic
> load/store instructions on power7, power8.
> * gcc.target/powerpc/atomic-p8.c: Likewise.
>
> Given these changes went beyond the original request to fix a spelling error
> and improve the logic, I figured to send these patches out again.  David, do
> you have any problem with the new patches?

The new patches are okay.  Thanks for re-checking.

Thanks, David


[patch] reimplement -fstrict-volatile-bitfields

2013-06-12 Thread Sandra Loosemore
Background:  on ARM and some other targets, the ABI requires that 
volatile bit-fields be accessed atomically in a mode that corresponds to 
the declared type of the field.  This conflicts with GCC's normal 
behavior of doing accesses in a mode that might correspond to the size 
of a general-purpose register, the size of the bit-field, or the bit 
range corresponding to the C++ memory model.  The 
-fstrict-volatile-bitfields flag selects the ABI-mandated behavior, and 
it is the default on ARM and other targets where the ABI requires it.
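
As a concrete illustration of the kind of declaration this affects, here is a
small sketch.  The struct ctrl_reg and its accessors are hypothetical and not
taken from the patch or its test cases.  Both fields are declared with a
32-bit type, so under -fstrict-volatile-bitfields each access is expected to
be a single 32-bit load or store; that width cannot be observed from C itself
and has to be checked in the generated assembly (e.g. with -S).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device register with two volatile bit-fields whose
   declared type is 32 bits wide.  */
struct ctrl_reg
{
  volatile uint32_t enable : 1;
  volatile uint32_t mode : 3;
};

/* Write the 3-bit mode field.  */
static void
set_mode (struct ctrl_reg *r, unsigned mode)
{
  r->mode = mode & 7u;
}

/* Read the 3-bit mode field.  */
static unsigned
get_mode (const struct ctrl_reg *r)
{
  return r->mode;
}

/* Value-level check only: neighbouring fields must not disturb each
   other, whatever access width the compiler picks.  */
static void
check_ctrl_reg (void)
{
  struct ctrl_reg r = { 0, 0 };
  r.enable = 1;
  set_mode (&r, 5);
  assert (get_mode (&r) == 5);
  assert (r.enable == 1);
}
```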


Both the original patch that added -fstrict-volatile-bitfields support 
and a subsequent followup patch that tried to unbreak handling of packed 
structures (where fields might not be properly aligned to do the single 
access otherwise mandated by -fstrict-volatile-bitfields) only handled 
bit-field reads, and not writes.  Last year I submitted a patch we've 
had locally for some time to extend the existing implementation to 
writes, but it was rejected on the grounds that the current 
implementation is too broken to fix or extend.  I didn't have time then 
to start over from scratch, and meanwhile, the bug reports about 
-fstrict-volatile-bitfields have continued to pile up.  So let's try 
again to fix this, this time working from the ground up.


From last year's discussion, it seemed that there were two primary 
objections to the current implementation:


(1) It was seen as inappropriate that warnings about conflicts between 
unaligned fields and -fstrict-volatile-bitfields were being emitted 
during expand.  It was suggested that any diagnostics ought to be 
emitted by the various language front ends instead.


(2) The way packed fields are being detected is buggy and an abstraction 
violation, and passing around a packedp flag to all the bit-field expand 
functions is ugly.


And, my own complaints about the current implementation:

(3) Users expect packed structures to work even on targets where 
-fstrict-volatile-bitfields is the default, so the compiler shouldn't 
generate code for accesses to unaligned fields that either faults at run 
time due to the unaligned access or silently produces an incorrect 
result (e.g., by only accessing part of the bit-field), with or without 
a warning at compile time.


(4) There's pointless divergence between the bit-field store and extract 
code that has led to a number of bugs.


I've come up with a new patch that tries to address all these issues.

For problem (1), I've eliminated the warnings from expand.  I'm not 
opposed to adding them back to the front ends, as previously suggested, 
but given that they were previously implemented only for reads and not 
writes, and that the old code was getting the 
different-warning-for-packed-fields behavior wrong in some cases, 
getting rid of the warnings is at least as correct as adding them for 
bit-field writes, too.  ;-)


I've killed the packedp flag from item (2) completely too.

For problem (3), my reading of the ARM ABI document is that the 
requirements for atomic access to volatile bit-fields only apply to 
bit-fields that are aligned according to the ABI requirements.  If a 
user has used GCC extensions to create a non-ABI-compliant packed 
structure where an atomic bit-field access of the correct size is not 
possible or valid on the target, then GCC ought to define some 
reasonable access behavior for those bit-fields too as a further 
extension -- whether or not it complies with the ABI requirements for 
unpacked structures -- rather than just generating invalid code.  In 
particular, generating the access using whatever technique it would 
fall back to if -fstrict-volatile-bitfields didn't apply, in cases where 
it *cannot* be applied, seems perfectly reasonable to me.
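
A sketch of the kind of non-ABI-compliant layout meant here, using GCC's
packed attribute.  The struct packed_reg and its helpers are hypothetical and
not one of the test cases from the patch.  With the packed attribute the
20-bit field typically starts at byte offset 1, so a single aligned 32-bit
access cannot cover it; the expectation argued for above is that reads and
writes still produce correct values and leave the neighbouring members alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packed layout: the volatile bit-field follows a single
   byte, so it is not aligned for its declared 32-bit type.  */
struct __attribute__ ((packed)) packed_reg
{
  uint8_t tag;
  volatile uint32_t field : 20;
  uint8_t tail;
};

/* Store a 20-bit value into the misaligned field.  */
static void
write_field (struct packed_reg *p, uint32_t v)
{
  p->field = v & 0xFFFFFu;
}

/* Load the 20-bit value back.  */
static uint32_t
read_field (const struct packed_reg *p)
{
  return p->field;
}

/* The access must neither truncate the field nor clobber its neighbours.  */
static void
check_packed_reg (void)
{
  struct packed_reg p = { 0x11, 0, 0x22 };
  write_field (&p, 0xABCDE);
  assert (read_field (&p) == 0xABCDE);
  assert (p.tag == 0x11 && p.tail == 0x22);
}
```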


To address problem (4), I've tried to make the code for handling 
-fstrict-volatile-bitfields similar in the read and write cases.  I 
think there is probably more that can be done here in terms of 
refactoring some of the now-common code and changing the interfaces to 
be more consistent as well, but I think it would be more clear to 
separate changes that are just code cleanup from those that are intended 
to change behavior.  I'm willing to work on code refactoring as a 
followup patch if the maintainers recommend or require that.


I've regression tested the attached patch on arm-none-eabi as well as 
bootstrapping and regression testing on x86_64-linux-gnu.  I also did 
some spot testing on mipsisa32r2-sde-elf.  I verified that all the new 
test cases pass on these targets with this patch.  Without the patch, I 
saw these failures:


pr23623.c: ARM, x86_64, MIPS
pr48784-1.c: ARM, x86_64, MIPS
pr48784-2.c: none
pr56341-1.c: ARM, MIPS
pr56341-2.c: none
pr56997-1.c: ARM
pr56997-2.c: ARM, MIPS
pr56997-3.c: ARM, MIPS
volatile-bitfields-3.c: ARM, x86_64, MIPS

Here are some comments on specific parts of the patch.

The "meat" of the patch is rewriting the logic for 
-fstrict-volatile-bitfields in extract_fixed_bit_fiel
