Re: [PATCH] Fix up cmove expansion (PR target/58864)

2013-11-30 Thread Richard Biener
Jakub Jelinek  wrote:
>Hi!
>
>The following testcase ICEs because expand_cond_expr_using_cmove
>calls emit_conditional_move (which calls do_pending_stack_adjust
>under some circumstances), but when that fails, just removes all the
>insns generated by emit_conditional_move (and perhaps some earlier ones
>too), thus it removes also the stack adjustment.
>
>Apparently 2 similar places were fixing it by just calling
>do_pending_stack_adjust () first just in case; some other places
>had (most likely) the same bug as this function.
>
>Rather than adding do_pending_stack_adjust () in all the places,
>especially when it isn't clear whether emit_conditional_move will be
>called at all and whether it will actually do
>do_pending_stack_adjust (), I chose to add two new functions to
>save/restore the pending stack adjustment state, so that when an
>instruction sequence is thrown away (either by doing
>start_sequence/end_sequence around it and not emitting it, or by
>delete_insns_since) the state can be restored, and have changed all
>the places that IMHO need it for emit_conditional_move.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk/4.8?

The idea is good but I'd like to see a struct rather than an array for the 
storage.

Thanks,
Richard.
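
A minimal sketch of the struct-based save/restore being requested, written
against a toy model rather than the real expander. The globals, the string
"insns" and the struct and field names are assumptions made for illustration;
only the function names save_pending_stack_adjust and
restore_pending_stack_adjust come from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for the expander's global state (hypothetical model).
static int pending_stack_adjust = 0;    // bytes still waiting to be popped
static int stack_pointer_delta = 0;     // net stack adjustment so far
static std::vector<std::string> insns;  // emitted instruction stream

// Hypothetical struct form of the saved state, replacing "int save[2]".
struct saved_pending_stack_adjust
{
  int x_pending_stack_adjust;
  int x_stack_pointer_delta;
};

static void
save_pending_stack_adjust (saved_pending_stack_adjust *save)
{
  save->x_pending_stack_adjust = pending_stack_adjust;
  save->x_stack_pointer_delta = stack_pointer_delta;
}

static void
restore_pending_stack_adjust (const saved_pending_stack_adjust *save)
{
  pending_stack_adjust = save->x_pending_stack_adjust;
  stack_pointer_delta = save->x_stack_pointer_delta;
}

// Flush the pending adjustment into the instruction stream.
static void
do_pending_stack_adjust ()
{
  if (pending_stack_adjust != 0)
    {
      insns.push_back ("sp += " + std::to_string (pending_stack_adjust));
      stack_pointer_delta -= pending_stack_adjust;
      pending_stack_adjust = 0;
    }
}

// Model of an expansion attempt that fails: the generated insns are
// discarded, so the bookkeeping must be rolled back with them.
static bool
try_cmove_and_fail ()
{
  saved_pending_stack_adjust save;
  save_pending_stack_adjust (&save);
  const std::size_t start = insns.size ();

  do_pending_stack_adjust ();     // may emit the stack adjustment...
  insns.push_back ("cmove ...");  // ...followed by the attempted cmove

  insns.resize (start);           // models delete_insns_since (start)
  assert (insns.size () == start);
  restore_pending_stack_adjust (&save);
  return false;
}
```

Without the restore call, the toy model would end up with
pending_stack_adjust == 0 and no adjustment instruction in the stream, which
is the lost-adjustment situation the patch description refers to.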


>2013-11-29  Jakub Jelinek  
>
>   PR target/58864
>   * dojump.c (save_pending_stack_adjust, restore_pending_stack_adjust):
>   New functions.
>   * expr.h (save_pending_stack_adjust, restore_pending_stack_adjust):
>   New prototypes.
>   * expr.c (expand_cond_expr_using_cmove): Use it.
>   (expand_expr_real_2): Use it instead of unconditional
>   do_pending_stack_adjust.
>   * optabs.c (expand_doubleword_shift): Use it.
>   * expmed.c (expand_sdiv_pow2): Use it instead of unconditional
>   do_pending_stack_adjust.
>   (emit_store_flag): Use it.
>
>   * g++.dg/opt/pr58864.C: New test.
>
>--- gcc/expr.c.jj  2013-11-27 18:02:46.0 +0100
>+++ gcc/expr.c 2013-11-29 14:35:12.234808484 +0100
>@@ -7951,6 +7951,9 @@ expand_cond_expr_using_cmove (tree treeo
>   else
> temp = assign_temp (type, 0, 1);
> 
>+  int save[2];
>+  save_pending_stack_adjust (save);
>+
>   start_sequence ();
>   expand_operands (treeop1, treeop2,
>  temp, &op1, &op2, EXPAND_NORMAL);
>@@ -8009,6 +8012,7 @@ expand_cond_expr_using_cmove (tree treeo
>   /* Otherwise discard the sequence and fall back to code with
>  branches.  */
>   end_sequence ();
>+  restore_pending_stack_adjust (save);
> #endif
>   return NULL_RTX;
> }
>@@ -8789,12 +8793,9 @@ expand_expr_real_2 (sepops ops, rtx targ
>   if (can_conditionally_move_p (mode))
> {
>   rtx insn;
>+  int save[2];
> 
>-  /* ??? Same problem as in expmed.c: emit_conditional_move
>- forces a stack adjustment via compare_from_rtx, and we
>- lose the stack adjustment if the sequence we are about
>- to create is discarded.  */
>-  do_pending_stack_adjust ();
>+  save_pending_stack_adjust (save);
> 
>   start_sequence ();
> 
>@@ -8817,6 +8818,7 @@ expand_expr_real_2 (sepops ops, rtx targ
>   /* Otherwise discard the sequence and fall back to code with
>  branches.  */
>   end_sequence ();
>+  restore_pending_stack_adjust (save);
> }
> #endif
>   if (target != op0)
>--- gcc/optabs.c.jj  2013-11-19 21:56:22.0 +0100
>+++ gcc/optabs.c   2013-11-29 14:39:15.963513835 +0100
>@@ -1079,17 +1079,20 @@ expand_doubleword_shift (enum machine_mo
> 
> #ifdef HAVE_conditional_move
>   /* Try using conditional moves to generate straight-line code.  */
>-  {
>-rtx start = get_last_insn ();
>-if (expand_doubleword_shift_condmove (op1_mode, binoptab,
>-cmp_code, cmp1, cmp2,
>-outof_input, into_input,
>-op1, superword_op1,
>-outof_target, into_target,
>-unsignedp, methods, shift_mask))
>-  return true;
>-delete_insns_since (start);
>-  }
>+  int save[2];
>+
>+  save_pending_stack_adjust (save);
>+
>+  rtx start = get_last_insn ();
>+  if (expand_doubleword_shift_condmove (op1_mode, binoptab,
>+  cmp_code, cmp1, cmp2,
>+  outof_input, into_input,
>+  op1, superword_op1,
>+  outof_target, into_target,
>+  unsignedp, methods, shift_mask))
>+return true;
>+  delete_insns_since (start);
>+  restore_pending_stack_adjust (save);
> #endif
> 
>   /* As a last resort, use branches to select the correct alternative.  */
>--- gcc/dojump.c.jj  2013-11-19 21:56:27.0 +0100
>+++ gcc/dojump.c   2013-11-29 14:35:35.088685749 +0100
>@@ -96,6 +96,29

*ping* Re: wwwdocs: Broken links due to the preprocess script

2013-11-30 Thread Tobias Burnus

On October 25, 2013 22:32, Tobias Burnus wrote:

Tobias Burnus wrote:
Thanks for looking at the patch. However, the patch has a link 
problem. The documentation is at

http://gcc.gnu.org/onlinedocs/gcc/Loop_002dSpecific-Pragmas.html

That's also the link I use in the changes.html file. However, some 
script changes the link to:

   http://gcc.gnu.org/onlinedocs/gcc/Loop-Specific-Pragmas.html
which won't work. Try yourself at 
http://gcc.gnu.org/gcc-4.9/changes.html



Actually, a similar issue was reported at 
http://gcc.gnu.org/ml/gcc-help/2013-10/msg00132.html


The reason for the broken links is the following lines in the 
/www/bin/preprocess script: 
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/wwwdocs/bin/preprocess.diff?r1=1.38&r2=1.39&f=h


Gerald, do you still know why you added it 9 years ago? The commit 
comment is "Use sed to work around makeinfo 4.7 brokenness."


I think "makeinfo" is still broken, but those pages do not seem to go 
through the preprocess script, which means that only links to that 
page will change to a hyphen, breaking the links.


Do you think it would be sensible to remove those lines again - or, 
alternatively, to run a similar script (e.g. "perl -pi -e 's/_002d/-/g' 
`find onlinedocs -name \*.html`") on the onlinedocs/?


I think the impact of the former on links is smaller. (One still 
needs to re-run the script on those files to restore the links.)


Tobias
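
To make the failure mode concrete, here is a small sketch of the kind of
substitution described above, applied to a link. The helper names
replace_all and rewrite_link are made up for this illustration; the real
preprocess script uses sed and is not reproduced here.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of FROM in TEXT with TO.
static std::string
replace_all (std::string text, const std::string &from, const std::string &to)
{
  assert (!from.empty ());
  std::size_t pos = 0;
  while ((pos = text.find (from, pos)) != std::string::npos)
    {
      text.replace (pos, from.size (), to);
      pos += to.size ();  // continue after the inserted text
    }
  return text;
}

// Hypothetical model of the rewrite: "_002d" becomes a plain hyphen.
static std::string
rewrite_link (const std::string &href)
{
  return replace_all (href, "_002d", "-");
}
```

Applied to the Loop_002dSpecific-Pragmas.html URL, this yields the
Loop-Specific-Pragmas.html form, which only resolves if the target file was
renamed the same way.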





*ping* Re: gcc/invoke.texi: Add missing @opindex

2013-11-30 Thread Tobias Burnus

Tobias Burnus wrote:

Tobias Burnus wrote:
While looking at the index for -fsanitize=, I found out that it – and 
many other options – lack the @opindex. Attached is an attempt to 
add the missing ones.


Updated patch: I also observed some odd "*<-fsanitize=null>" output in 
the man page; Manuel suggested a fix which indeed works (using 
@gcctabopt), which I now also include.


OK for the trunk?

Tobias




Add TREE_INT_CST_OFFSET_NUNITS

2013-11-30 Thread Richard Sandiford
So maybe two INTEGER_CST lengths weren't enough.  Because bitsizetype
can be offset_int-sized, wi::to_offset had a TYPE_PRECISION condition
to pick the array length:

template <int N>
inline unsigned int
wi::extended_tree <N>::get_len () const
{
  if (N == MAX_BITSIZE_MODE_ANY_INT
      || N > TYPE_PRECISION (TREE_TYPE (m_t)))
    return TREE_INT_CST_EXT_NUNITS (m_t);
  else
    return TREE_INT_CST_NUNITS (m_t);
}

and this TYPE_PRECISION condition was relatively hot in
get_ref_base_and_extent when compiling insn-recog.ii.

Adding a third length for offset_int does seem to reduce the cost of
the offset_int + to_offset addition.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard
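
A rough sketch of the idea, using a toy constant type instead of real trees.
The struct toy_int_cst, make_toy_int_cst and the value chosen for
MAX_PRECISION are assumptions for illustration; the ADDR_MAX_PRECISION and
OFFSET_INT_ELTS formulas follow the patch, assuming a 64-bit HOST_WIDE_INT.

```cpp
#include <cassert>

// Assumed host parameters (64-bit HOST_WIDE_INT, 64-bit addresses).
constexpr int HOST_BITS_PER_WIDE_INT = 64;
constexpr int ADDR_MAX_BITSIZE = 64;
constexpr int ADDR_MAX_PRECISION
  = (ADDR_MAX_BITSIZE + 4 + HOST_BITS_PER_WIDE_INT - 1)
    & ~(HOST_BITS_PER_WIDE_INT - 1);
constexpr int OFFSET_INT_ELTS = ADDR_MAX_PRECISION / HOST_BITS_PER_WIDE_INT;
// Hypothetical stand-in for MAX_BITSIZE_MODE_ANY_INT.
constexpr int MAX_PRECISION = 256;

// (64 + 4 + 63) = 131, rounded down to a multiple of 64, gives 128 bits,
// i.e. two HWIs per offset_int.
static_assert (ADDR_MAX_PRECISION == 128, "offset_int is 128 bits here");
static_assert (OFFSET_INT_ELTS == 2, "offset_int needs two HWIs here");

// Toy constant: the type precision plus the three cached lengths.
struct toy_int_cst
{
  int type_precision;
  unsigned char unextended;  // length in the native precision
  unsigned char extended;    // length when extended to wider precisions
  unsigned char offset;      // length when read as an offset_int
};

static toy_int_cst
make_toy_int_cst (int type_precision, int len, int ext_len)
{
  assert (len >= 1 && len <= ext_len);
  toy_int_cst t;
  t.type_precision = type_precision;
  t.unextended = static_cast<unsigned char> (len);
  t.extended = static_cast<unsigned char> (ext_len);
  // Same rule as the patch: MIN (ext_len, OFFSET_INT_ELTS).
  t.offset = static_cast<unsigned char> (ext_len < OFFSET_INT_ELTS
                                         ? ext_len : OFFSET_INT_ELTS);
  return t;
}

// N is a compile-time constant, so the first test folds away and the
// offset_int case no longer needs to look at the type's precision.
template <int N>
inline unsigned int
get_len (const toy_int_cst &t)
{
  if (N == ADDR_MAX_PRECISION)
    return t.offset;
  else if (N == MAX_PRECISION || N > t.type_precision)
    return t.extended;
  else
    return t.unextended;
}
```

The design trade-off in the sketch is one extra byte per constant in exchange
for skipping the precision lookup on the offset_int path.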


Index: gcc/ChangeLog.wide-int
===
--- gcc/ChangeLog.wide-int  2013-11-30 09:31:16.359198395 +
+++ gcc/ChangeLog.wide-int  2013-11-30 09:41:50.987741444 +
@@ -616,6 +616,7 @@
(TREE_INT_CST_HIGH): Delete.
(TREE_INT_CST_NUNITS): New.
(TREE_INT_CST_EXT_NUNITS): Likewise.
+   (TREE_INT_CST_OFFSET_NUNITS): Likewise.
(TREE_INT_CST_ELT): Likewise.
(INT_CST_LT): Use wide-int interfaces.
(INT_CST_LE): New.
Index: gcc/tree-core.h
===
--- gcc/tree-core.h 2013-11-30 09:31:16.359198395 +
+++ gcc/tree-core.h 2013-11-30 09:41:12.011470169 +
@@ -764,11 +764,16 @@ struct GTY(()) tree_base {
 struct {
   /* The number of HOST_WIDE_INTs if the INTEGER_CST is accessed in
 its native precision.  */
-  unsigned short unextended;
+  unsigned char unextended;
 
   /* The number of HOST_WIDE_INTs if the INTEGER_CST is extended to
 wider precisions based on its TYPE_SIGN.  */
-  unsigned short extended;
+  unsigned char extended;
+
+  /* The number of HOST_WIDE_INTs if the INTEGER_CST is accessed in
+offset_int precision, with smaller integers being extended
+according to their TYPE_SIGN.  */
+  unsigned char offset;
 } int_length;
 
 /* VEC length.  This field is only used with TREE_VEC.  */
Index: gcc/tree.c
===
--- gcc/tree.c  2013-11-30 09:31:16.359198395 +
+++ gcc/tree.c  2013-11-30 09:41:42.965685621 +
@@ -1285,6 +1285,7 @@ wide_int_to_tree (tree type, const wide_
/* Make sure no one is clobbering the shared constant.  */
gcc_checking_assert (TREE_TYPE (t) == type
 && TREE_INT_CST_NUNITS (t) == 1
+&& TREE_INT_CST_OFFSET_NUNITS (t) == 1
 && TREE_INT_CST_EXT_NUNITS (t) == 1
 && TREE_INT_CST_ELT (t, 0) == hwi);
  else
@@ -1964,6 +1965,7 @@ make_int_cst_stat (int len, int ext_len
   TREE_SET_CODE (t, INTEGER_CST);
   TREE_INT_CST_NUNITS (t) = len;
   TREE_INT_CST_EXT_NUNITS (t) = ext_len;
+  TREE_INT_CST_OFFSET_NUNITS (t) = MIN (ext_len, OFFSET_INT_ELTS);
 
   TREE_CONSTANT (t) = 1;
 
Index: gcc/tree.h
===
--- gcc/tree.h  2013-11-30 09:31:16.359198395 +
+++ gcc/tree.h  2013-11-30 09:41:29.418591391 +
@@ -907,6 +907,8 @@ #define TREE_INT_CST_NUNITS(NODE) \
   (INTEGER_CST_CHECK (NODE)->base.u.int_length.unextended)
 #define TREE_INT_CST_EXT_NUNITS(NODE) \
   (INTEGER_CST_CHECK (NODE)->base.u.int_length.extended)
+#define TREE_INT_CST_OFFSET_NUNITS(NODE) \
+  (INTEGER_CST_CHECK (NODE)->base.u.int_length.offset)
 #define TREE_INT_CST_ELT(NODE, I) TREE_INT_CST_ELT_CHECK (NODE, I)
 #define TREE_INT_CST_LOW(NODE) \
   ((unsigned HOST_WIDE_INT) TREE_INT_CST_ELT (NODE, 0))
@@ -4623,8 +4625,10 @@ wi::extended_tree ::get_val () const
 inline unsigned int
 wi::extended_tree ::get_len () const
 {
-  if (N == MAX_BITSIZE_MODE_ANY_INT
-  || N > TYPE_PRECISION (TREE_TYPE (m_t)))
+  if (N == ADDR_MAX_PRECISION)
+return TREE_INT_CST_OFFSET_NUNITS (m_t);
+  else if (N == MAX_BITSIZE_MODE_ANY_INT
+  || N > TYPE_PRECISION (TREE_TYPE (m_t)))
 return TREE_INT_CST_EXT_NUNITS (m_t);
   else
 return TREE_INT_CST_NUNITS (m_t);
Index: gcc/wide-int.h
===
--- gcc/wide-int.h  2013-11-30 09:31:16.359198395 +
+++ gcc/wide-int.h  2013-11-30 09:40:32.710196218 +
@@ -256,6 +256,9 @@ #define ADDR_MAX_BITSIZE 64
 #define ADDR_MAX_PRECISION \
   ((ADDR_MAX_BITSIZE + 4 + HOST_BITS_PER_WIDE_INT - 1) & ~(HOST_BITS_PER_WIDE_INT - 1))
 
+/* The number of HWIs needed to store an offset_int.  */
+#define OFFSET_INT_ELTS (ADDR_MAX_PRECISION / HOST_BITS_PER_WIDE_INT)
+
 /* The type of result produced by a binary operation on types T1 and T2.
Defined purely for brevity.  */
 #define WI_BINARY_RESULT(T1, T2) \


[wide-int] Avoid some temporaries and use shifts more often

2013-11-30 Thread Richard Sandiford
This started out as another attempt to find places where we had
things like:

   offset_int x = wi::to_offset (...);
   x = ...x...;

and change them to:

   offset_int x = ...wi::to_offset (...)...;

with the get_ref_base_and_extent case being the main one.
But it turned out that some of them were also multiplying or
dividing by BITS_PER_UNIT, so it ended up also being a patch to
convert those to shifts.

I didn't want to cut-&-paste the 3 : log2 (BITS_PER_UNIT) conditional
yet more times, so I added a LOG2_BITS_PER_UNIT to defaults.h.  I can
retrofit it to the existing code if that's OK at this stage.

For insn-recog.ii this reduces the number of divmod_internal calls from
7884858 to 369746.

Thanks,
Richard
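
As a sketch of why the shift form is a drop-in replacement for unsigned
values: the preprocessor ladder below mirrors the one added to defaults.h,
while the three helper functions are hypothetical and operate on plain
64-bit integers rather than offset_int.

```cpp
#include <cassert>
#include <cstdint>

#ifndef BITS_PER_UNIT
#define BITS_PER_UNIT 8
#endif

#if BITS_PER_UNIT == 8
#define LOG2_BITS_PER_UNIT 3
#elif BITS_PER_UNIT == 16
#define LOG2_BITS_PER_UNIT 4
#else
#error Unknown BITS_PER_UNIT
#endif

static_assert ((1 << LOG2_BITS_PER_UNIT) == BITS_PER_UNIT,
               "LOG2_BITS_PER_UNIT must match BITS_PER_UNIT");

// Unsigned truncating division by the unit size...
static std::uint64_t
bits_to_bytes_div (std::uint64_t bits)
{
  return bits / BITS_PER_UNIT;
}

// ...gives the same result as a logical right shift.
static std::uint64_t
bits_to_bytes_shift (std::uint64_t bits)
{
  return bits >> LOG2_BITS_PER_UNIT;
}

// Multiplication by the unit size is a left shift.
static std::uint64_t
bytes_to_bits_shift (std::uint64_t bytes)
{
  // The toy version refuses inputs whose top bits would be shifted out.
  assert ((bytes >> (64 - LOG2_BITS_PER_UNIT)) == 0);
  return bytes << LOG2_BITS_PER_UNIT;
}
```

For a built-in integer type the compiler would make this transformation by
itself; the point of doing it by hand in the patch is presumably that a
wide-int division goes through a general out-of-line routine
(divmod_internal), whereas the shift does not.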


Index: gcc/ChangeLog.wide-int
===
--- gcc/ChangeLog.wide-int  2013-11-29 15:09:59.623293132 +
+++ gcc/ChangeLog.wide-int  2013-11-29 15:11:48.611155898 +
@@ -111,6 +111,7 @@
(stabstr_U): Use wide-int interfaces.
(dbxout_type): Update to use cst_fits_shwi_p.
* defaults.h
+   (LOG2_BITS_PER_UNIT): Define.
(TARGET_SUPPORTS_WIDE_INT): Add default.
* dfp.c: Include wide-int.h.
(decimal_real_to_integer2): Use wide-int interfaces and rename to
Index: gcc/alias.c
===
--- gcc/alias.c 2013-11-29 15:04:41.136142237 +
+++ gcc/alias.c 2013-11-29 15:11:48.606155857 +
@@ -2355,8 +2355,8 @@ adjust_offset_for_component_ref (tree x,
 
   offset_int woffset
= (wi::to_offset (xoffset)
-  + wi::udiv_trunc (wi::to_offset (DECL_FIELD_BIT_OFFSET (field)),
-BITS_PER_UNIT));
+  + wi::lrshift (wi::to_offset (DECL_FIELD_BIT_OFFSET (field)),
+ LOG2_BITS_PER_UNIT));
   if (!wi::fits_uhwi_p (woffset))
{
  *known_p = false;
Index: gcc/defaults.h
===
--- gcc/defaults.h  2013-11-29 15:04:41.136142237 +
+++ gcc/defaults.h  2013-11-29 15:11:48.606155857 +
@@ -475,6 +475,14 @@ #define DWARF_TYPE_SIGNATURE_SIZE 8
 #define BITS_PER_UNIT 8
 #endif
 
+#if BITS_PER_UNIT == 8
+#define LOG2_BITS_PER_UNIT 3
+#elif BITS_PER_UNIT == 16
+#define LOG2_BITS_PER_UNIT 4
+#else
+#error Unknown BITS_PER_UNIT
+#endif
+
 #ifndef BITS_PER_WORD
 #define BITS_PER_WORD (BITS_PER_UNIT * UNITS_PER_WORD)
 #endif
Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c 2013-11-29 15:04:41.136142237 +
+++ gcc/dwarf2out.c 2013-11-29 15:40:56.188806688 +
@@ -14930,7 +14930,7 @@ field_byte_offset (const_tree decl)
 object_offset_in_bits = bitpos_int;
 
   object_offset_in_bytes
-= wi::udiv_trunc (object_offset_in_bits, BITS_PER_UNIT);
+= wi::lrshift (object_offset_in_bits, LOG2_BITS_PER_UNIT);
   return object_offset_in_bytes.to_shwi ();
 }
 
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   2013-11-29 15:04:41.136142237 +
+++ gcc/gimple-fold.c   2013-11-29 15:41:17.425983303 +
@@ -2926,7 +2926,6 @@ fold_nonarray_ctor_reference (tree type,
   tree field_offset = DECL_FIELD_BIT_OFFSET (cfield);
   tree field_size = DECL_SIZE (cfield);
   offset_int bitoffset;
-  offset_int byte_offset_cst = wi::to_offset (byte_offset);
   offset_int bitoffset_end, access_end;
 
   /* Variable sized objects in static constructors makes no sense,
@@ -2939,7 +2938,8 @@ fold_nonarray_ctor_reference (tree type,
 
   /* Compute bit offset of the field.  */
   bitoffset = (wi::to_offset (field_offset)
-  + byte_offset_cst * BITS_PER_UNIT);
+  + wi::lshift (wi::to_offset (byte_offset),
+LOG2_BITS_PER_UNIT));
   /* Compute bit offset where the field ends.  */
   if (field_size != NULL_TREE)
bitoffset_end = bitoffset + wi::to_offset (field_size);
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c 2013-11-29 15:04:41.136142237 +
+++ gcc/gimple-ssa-strength-reduction.c 2013-11-29 15:40:56.188806688 +
@@ -897,7 +897,7 @@ restructure_reference (tree *pbase, tree
   c2 = 0;
 }
 
-  c4 = wi::udiv_floor (index, BITS_PER_UNIT);
+  c4 = wi::lrshift (index, LOG2_BITS_PER_UNIT);
   c5 = backtrace_base_for_ref (&t2);
 
   *pbase = t1;
Index: gcc/tree-dfa.c
===
--- gcc/tree-dfa.c  2013-11-29 15:04:41.136142237 +
+++ gcc/tree-dfa.c  2013-11-29 15:41:39.513166464 +
@@ -437,10 +437,8 @@ get_ref_base_and_extent (tree exp, HOST_
 
if (this_offset && TREE_CODE (this_offset) == INTEGER_CST)
  {
-   offset_int woffset = wi::to_offset (this_o

[wide-int] Use __builtin_expect for length checks

2013-11-30 Thread Richard Sandiford
Without profiling information, GCC tends to assume "x == 1" and
"x + y == 2" are likely false, so this patch adds some __builtin_expects.
(system.h has a dummy definition for compilers that don't support
__builtin_expect.)

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard
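
A small sketch of the pattern, outside of wide-int.h. The LIKELY macro and
the toy_wide_int type are assumptions made for this example; the real patch
calls __builtin_expect directly and relies on system.h for the fallback.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portability wrapper: a no-op where the builtin is missing.
#if defined (__GNUC__)
#define LIKELY(cond) __builtin_expect (!!(cond), 1)
#else
#define LIKELY(cond) (cond)
#endif

// Toy multi-word integer: LEN significant 64-bit blocks, at most two.
// Assumed to be in canonical form, so equal values have equal lengths.
struct toy_wide_int
{
  unsigned int len;
  std::uint64_t val[2];
};

static bool
toy_eq_p (const toy_wide_int &x, const toy_wide_int &y)
{
  assert (x.len >= 1 && x.len <= 2 && y.len >= 1 && y.len <= 2);

  // Fast path: both operands fit in a single block.  The hint tells the
  // compiler to lay this branch out as the fall-through case.
  if (LIKELY (x.len + y.len == 2))
    return x.val[0] == y.val[0];

  // Slow path: compare lengths, then every block.
  if (x.len != y.len)
    return false;
  for (unsigned int i = 0; i < x.len; ++i)
    if (x.val[i] != y.val[i])
      return false;
  return true;
}
```

The hint only affects code layout and branch prediction heuristics; the
result of the comparison is the same with or without it.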


Index: gcc/wide-int.h
===
--- gcc/wide-int.h  2013-11-30 09:40:32.710196218 +
+++ gcc/wide-int.h  2013-11-30 10:07:06.567433289 +
@@ -1675,7 +1675,7 @@ wi::eq_p (const T1 &x, const T2 &y)
   while (++i != xi.len);
   return true;
 }
-  if (yi.len == 1)
+  if (__builtin_expect (yi.len == 1, true))
 {
   /* XI is only equal to YI if it too has a single HWI.  */
   if (xi.len != 1)
@@ -1751,7 +1751,7 @@ wi::ltu_p (const T1 &x, const T2 &y)
   /* Optimize the case of two HWIs.  The HWIs are implicitly sign-extended
  for precisions greater than HOST_BITS_WIDE_INT, but sign-extending both
  values does not change the result.  */
-  if (xi.len + yi.len == 2)
+  if (__builtin_expect (xi.len + yi.len == 2, true))
 {
   unsigned HOST_WIDE_INT xl = xi.to_uhwi ();
   unsigned HOST_WIDE_INT yl = yi.to_uhwi ();
@@ -1922,7 +1922,7 @@ wi::cmpu (const T1 &x, const T2 &y)
   /* Optimize the case of two HWIs.  The HWIs are implicitly sign-extended
  for precisions greater than HOST_BITS_WIDE_INT, but sign-extending both
  values does not change the result.  */
-  if (xi.len + yi.len == 2)
+  if (__builtin_expect (xi.len + yi.len == 2, true))
 {
   unsigned HOST_WIDE_INT xl = xi.to_uhwi ();
   unsigned HOST_WIDE_INT yl = yi.to_uhwi ();
@@ -2128,7 +2128,7 @@ wi::bit_and (const T1 &x, const T2 &y)
   WIDE_INT_REF_FOR (T1) xi (x, precision);
   WIDE_INT_REF_FOR (T2) yi (y, precision);
   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
-  if (xi.len + yi.len == 2)
+  if (__builtin_expect (xi.len + yi.len == 2, true))
 {
   val[0] = xi.ulow () & yi.ulow ();
   result.set_len (1, is_sign_extended);
@@ -2149,7 +2149,7 @@ wi::bit_and_not (const T1 &x, const T2 &
   WIDE_INT_REF_FOR (T1) xi (x, precision);
   WIDE_INT_REF_FOR (T2) yi (y, precision);
   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
-  if (xi.len + yi.len == 2)
+  if (__builtin_expect (xi.len + yi.len == 2, true))
 {
   val[0] = xi.ulow () & ~yi.ulow ();
   result.set_len (1, is_sign_extended);
@@ -2170,7 +2170,7 @@ wi::bit_or (const T1 &x, const T2 &y)
   WIDE_INT_REF_FOR (T1) xi (x, precision);
   WIDE_INT_REF_FOR (T2) yi (y, precision);
   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
-  if (xi.len + yi.len == 2)
+  if (__builtin_expect (xi.len + yi.len == 2, true))
 {
   val[0] = xi.ulow () | yi.ulow ();
   result.set_len (1, is_sign_extended);
@@ -2191,7 +2191,7 @@ wi::bit_or_not (const T1 &x, const T2 &y
   WIDE_INT_REF_FOR (T1) xi (x, precision);
   WIDE_INT_REF_FOR (T2) yi (y, precision);
   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
-  if (xi.len + yi.len == 2)
+  if (__builtin_expect (xi.len + yi.len == 2, true))
 {
   val[0] = xi.ulow () | ~yi.ulow ();
   result.set_len (1, is_sign_extended);
@@ -2212,7 +2212,7 @@ wi::bit_xor (const T1 &x, const T2 &y)
   WIDE_INT_REF_FOR (T1) xi (x, precision);
   WIDE_INT_REF_FOR (T2) yi (y, precision);
   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
-  if (xi.len + yi.len == 2)
+  if (__builtin_expect (xi.len + yi.len == 2, true))
 {
   val[0] = xi.ulow () ^ yi.ulow ();
   result.set_len (1, is_sign_extended);
@@ -2248,7 +2248,7 @@ wi::add (const T1 &x, const T2 &y)
  HOST_BITS_PER_WIDE_INT are relatively rare and there's not much
  point handling them inline.  */
   else if (STATIC_CONSTANT_P (precision > HOST_BITS_PER_WIDE_INT)
-  && xi.len + yi.len == 2)
+  && __builtin_expect (xi.len + yi.len == 2, true))
 {
   unsigned HOST_WIDE_INT xl = xi.ulow ();
   unsigned HOST_WIDE_INT yl = yi.ulow ();
@@ -2323,7 +2323,7 @@ wi::sub (const T1 &x, const T2 &y)
  HOST_BITS_PER_WIDE_INT are relatively rare and there's not much
  point handling them inline.  */
   else if (STATIC_CONSTANT_P (precision > HOST_BITS_PER_WIDE_INT)
-  && xi.len + yi.len == 2)
+  && __builtin_expect (xi.len + yi.len == 2, true))
 {
   unsigned HOST_WIDE_INT xl = xi.ulow ();
   unsigned HOST_WIDE_INT yl = yi.ulow ();


[C,C++] integer constants in attribute arguments

2013-11-30 Thread Marc Glisse

Hello,

we currently reject:

constexpr int s = 32;
typedef double VEC __attribute__ ((__vector_size__ (s)));

and similarly for other attributes, while we accept s+0 or (int)s, etc. 
The code is basically copied from the constructor attribute. The C 
front-end is much less forgiving than the C++ one, so we need to protect 
the call to default_conversion (as in PR c/59280). For some reason, 
one of the attributes can see a FUNCTION_DECL where the others see an 
IDENTIFIER_NODE; I didn't try to understand why and just added that check 
to the code.
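
For reference, a compact C++ example of the two spellings discussed above,
assuming a compiler that supports the GNU vector_size attribute and already
accepts a constexpr variable as its argument. The typedef name
VEC_WORKAROUND and the helper lane are made up for the illustration.

```cpp
#include <cassert>

constexpr int s = 32;

// Spelling described above as already accepted: an expression, not a bare name.
typedef double VEC_WORKAROUND __attribute__ ((__vector_size__ (s + 0)));

// Spelling the patch is meant to accept: the constant used directly.
typedef double VEC __attribute__ ((__vector_size__ (s)));

static_assert (sizeof (VEC) == 32, "vector_size is given in bytes");
static_assert (sizeof (VEC) / sizeof (double) == 4, "four double lanes");

// Read one lane of the vector (GNU vector subscripting).
static double
lane (const VEC &v, int i)
{
  assert (i >= 0 && i < 4);
  return v[i];
}
```

Both typedefs should name vectors of the same size once the argument is
converted to an integer constant in either spelling.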


Bootstrapped and ran the testsuite on x86_64-linux-gnu.

2013-11-30  Marc Glisse  

PR c++/53017
PR c++/59211
gcc/c-family/
* c-common.c (handle_aligned_attribute, handle_alloc_size_attribute,
handle_vector_size_attribute, handle_nonnull_attribute): Call
default_conversion on the attribute argument.
gcc/cp/
* tree.c (handle_init_priority_attribute): Likewise.
gcc/
* doc/extend.texi (Function Attributes): Typo.
gcc/testsuite/
* c-c++-common/attributes-1.c: New testcase.
* g++.dg/cpp0x/constexpr-attribute2.C: Likewise.

--
Marc Glisse

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 205548)
+++ gcc/c-family/c-common.c (working copy)
@@ -7504,24 +7504,32 @@ check_cxx_fundamental_alignment_constrai
 /* Handle a "aligned" attribute; arguments as in
struct attribute_spec.handler.  */
 
 static tree
 handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
  int flags, bool *no_add_attrs)
 {
   tree decl = NULL_TREE;
   tree *type = NULL;
   int is_type = 0;
-  tree align_expr = (args ? TREE_VALUE (args)
-: size_int (ATTRIBUTE_ALIGNED_VALUE / BITS_PER_UNIT));
+  tree align_expr;
   int i;
 
+  if (args)
+{
+  align_expr = TREE_VALUE (args);
+  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE)
+   align_expr = default_conversion (align_expr);
+}
+  else
+align_expr = size_int (ATTRIBUTE_ALIGNED_VALUE / BITS_PER_UNIT);
+
   if (DECL_P (*node))
 {
   decl = *node;
   type = &TREE_TYPE (decl);
   is_type = TREE_CODE (*node) == TYPE_DECL;
 }
   else if (TYPE_P (*node))
 type = node, is_type = 1;
 
   if ((i = check_user_alignment (align_expr, false)) == -1
@@ -8007,20 +8015,23 @@ handle_malloc_attribute (tree *node, tre
struct attribute_spec.handler.  */
 
 static tree
 handle_alloc_size_attribute (tree *node, tree ARG_UNUSED (name), tree args,
 int ARG_UNUSED (flags), bool *no_add_attrs)
 {
   unsigned arg_count = type_num_arguments (*node);
   for (; args; args = TREE_CHAIN (args))
 {
   tree position = TREE_VALUE (args);
+  if (position && TREE_CODE (position) != IDENTIFIER_NODE
+ && TREE_CODE (position) != FUNCTION_DECL)
+   position = default_conversion (position);
 
   if (TREE_CODE (position) != INTEGER_CST
  || TREE_INT_CST_HIGH (position)
  || TREE_INT_CST_LOW (position) < 1
  || TREE_INT_CST_LOW (position) > arg_count )
{
  warning (OPT_Wattributes,
   "alloc_size parameter outside range");
  *no_add_attrs = true;
  return NULL_TREE;
@@ -8451,20 +8462,22 @@ handle_vector_size_attribute (tree *node
  int ARG_UNUSED (flags),
  bool *no_add_attrs)
 {
   unsigned HOST_WIDE_INT vecsize, nunits;
   enum machine_mode orig_mode;
   tree type = *node, new_type, size;
 
   *no_add_attrs = true;
 
   size = TREE_VALUE (args);
+  if (size && size != error_mark_node && TREE_CODE (size) != IDENTIFIER_NODE)
+size = default_conversion (size);
 
   if (!tree_fits_uhwi_p (size))
 {
   warning (OPT_Wattributes, "%qE attribute ignored", name);
   return NULL_TREE;
 }
 
   /* Get the vector size (in bytes).  */
   vecsize = tree_to_uhwi (size);
 
@@ -8548,21 +8561,25 @@ handle_nonnull_attribute (tree *node, tr
}
   return NULL_TREE;
 }
 
   /* Argument list specified.  Verify that each argument number references
  a pointer argument.  */
   for (attr_arg_num = 1; args; args = TREE_CHAIN (args))
 {
   unsigned HOST_WIDE_INT arg_num = 0, ck_num;
 
-  if (!get_nonnull_operand (TREE_VALUE (args), &arg_num))
+  tree arg = TREE_VALUE (args);
+  if (arg && TREE_CODE (arg) != IDENTIFIER_NODE)
+   arg = default_conversion (arg);
+
+  if (!get_nonnull_operand (arg, &arg_num))
{
  error ("nonnull argument has invalid operand number (argument %lu)",
 (unsigned long) attr_arg_num);
  *no_add_attrs = true;
  return NULL_TREE;
}
 
   if (prototype_p (type))
{
  function_args_iterator iter;
Index: gcc/cp/tree.c
===
--- gcc/cp/tree.c 

Re: [PATCH] fix combine.c:reg_nonzero_bits_for_combine where last_set_mode is narrower than mode

2013-11-30 Thread Eric Botcazou
> 2013-11-29  Paulo Matos 
>Eric Botcazou 
> 
>   * combine.c (reg_nonzero_bits_for_combine): Apply mask transformation
>   as applied to nonzero_sign_valid fixing bug when last_set_mode has
>   less precision than mode.

Applied, thanks.

-- 
Eric Botcazou


Re: [PATCH] Fix up cmove expansion (PR target/58864)

2013-11-30 Thread Eric Botcazou
> Rather than adding do_pending_stack_adjust () in all the places, especially
> when it isn't clear whether emit_conditional_move will be called at all and
> whether it will actually do do_pending_stack_adjust (), I chose to add
> two new functions to save/restore the pending stack adjustment state,
> so that when an instruction sequence is thrown away (either by doing
> start_sequence/end_sequence around it and not emitting it, or
> delete_insns_since) the state can be restored, and have changed all the
> places that IMHO need it for emit_conditional_move.

Why not do it in emit_conditional_move directly then?  The code thinks it's 
clever to do:

  do_pending_stack_adjust ();
  last = get_last_insn ();
  prepare_cmp_insn (XEXP (comparison, 0), XEXP (comparison, 1),
GET_CODE (comparison), NULL_RTX, unsignedp, OPTAB_WIDEN,
&comparison, &cmode);
[...]
  delete_insns_since (last);
  return NULL_RTX;

but apparently not, so why not delete the stack adjustment as well and restore 
the state afterwards?

-- 
Eric Botcazou
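
A sketch of that alternative on a toy model: the emitter records the insn
position and the pending adjustment before flushing, and undoes both when it
gives up, so callers need no save/restore of their own. All names here
(toy_emit_conditional_move, the string "insns", the single global) are
hypothetical and only model the shape of the idea.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for the expander's state (hypothetical model).
static int pending_stack_adjust = 0;
static std::vector<std::string> insns;

static void
do_pending_stack_adjust ()
{
  if (pending_stack_adjust != 0)
    {
      insns.push_back ("sp += " + std::to_string (pending_stack_adjust));
      pending_stack_adjust = 0;
    }
}

// Emitter that cleans up after itself when the target has no suitable
// conditional move.
static bool
toy_emit_conditional_move (bool target_supports_it)
{
  const int saved_adjust = pending_stack_adjust;
  const std::size_t last = insns.size ();  // taken BEFORE the flush

  do_pending_stack_adjust ();
  insns.push_back ("cmp ...");
  if (target_supports_it)
    {
      insns.push_back ("cmove ...");
      return true;
    }

  // Failure: delete everything emitted here, including the stack
  // adjustment, and put the pending amount back.
  insns.resize (last);
  assert (insns.size () == last);
  pending_stack_adjust = saved_adjust;
  return false;
}
```

The difference from the caller-side approach is only where the bookkeeping
lives; callers that discard a larger sequence around the call would still
need their own handling.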


Re: [wide-int] Use __builtin_expect for length checks

2013-11-30 Thread Richard Biener
Richard Sandiford  wrote:
>Without profiling information, GCC tends to assume "x == 1" and
>"x + y == 2" are likely false, so this patch adds some
>__builtin_expects.
>(system.h has a dummy definition for compilers that don't support
>__builtin_expect.)
>
>Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

>Thanks,
>Richard
>
>
>Index: gcc/wide-int.h
>===
>--- gcc/wide-int.h 2013-11-30 09:40:32.710196218 +
>+++ gcc/wide-int.h 2013-11-30 10:07:06.567433289 +
>@@ -1675,7 +1675,7 @@ wi::eq_p (const T1 &x, const T2 &y)
>   while (++i != xi.len);
>   return true;
> }
>-  if (yi.len == 1)
>+  if (__builtin_expect (yi.len == 1, true))
> {
>   /* XI is only equal to YI if it too has a single HWI.  */
>   if (xi.len != 1)
>@@ -1751,7 +1751,7 @@ wi::ltu_p (const T1 &x, const T2 &y)
>/* Optimize the case of two HWIs.  The HWIs are implicitly
>sign-extended
>for precisions greater than HOST_BITS_WIDE_INT, but sign-extending both
>  values does not change the result.  */
>-  if (xi.len + yi.len == 2)
>+  if (__builtin_expect (xi.len + yi.len == 2, true))
> {
>   unsigned HOST_WIDE_INT xl = xi.to_uhwi ();
>   unsigned HOST_WIDE_INT yl = yi.to_uhwi ();
>@@ -1922,7 +1922,7 @@ wi::cmpu (const T1 &x, const T2 &y)
>/* Optimize the case of two HWIs.  The HWIs are implicitly
>sign-extended
>for precisions greater than HOST_BITS_WIDE_INT, but sign-extending both
>  values does not change the result.  */
>-  if (xi.len + yi.len == 2)
>+  if (__builtin_expect (xi.len + yi.len == 2, true))
> {
>   unsigned HOST_WIDE_INT xl = xi.to_uhwi ();
>   unsigned HOST_WIDE_INT yl = yi.to_uhwi ();
>@@ -2128,7 +2128,7 @@ wi::bit_and (const T1 &x, const T2 &y)
>   WIDE_INT_REF_FOR (T1) xi (x, precision);
>   WIDE_INT_REF_FOR (T2) yi (y, precision);
>   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
>-  if (xi.len + yi.len == 2)
>+  if (__builtin_expect (xi.len + yi.len == 2, true))
> {
>   val[0] = xi.ulow () & yi.ulow ();
>   result.set_len (1, is_sign_extended);
>@@ -2149,7 +2149,7 @@ wi::bit_and_not (const T1 &x, const T2 &
>   WIDE_INT_REF_FOR (T1) xi (x, precision);
>   WIDE_INT_REF_FOR (T2) yi (y, precision);
>   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
>-  if (xi.len + yi.len == 2)
>+  if (__builtin_expect (xi.len + yi.len == 2, true))
> {
>   val[0] = xi.ulow () & ~yi.ulow ();
>   result.set_len (1, is_sign_extended);
>@@ -2170,7 +2170,7 @@ wi::bit_or (const T1 &x, const T2 &y)
>   WIDE_INT_REF_FOR (T1) xi (x, precision);
>   WIDE_INT_REF_FOR (T2) yi (y, precision);
>   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
>-  if (xi.len + yi.len == 2)
>+  if (__builtin_expect (xi.len + yi.len == 2, true))
> {
>   val[0] = xi.ulow () | yi.ulow ();
>   result.set_len (1, is_sign_extended);
>@@ -2191,7 +2191,7 @@ wi::bit_or_not (const T1 &x, const T2 &y
>   WIDE_INT_REF_FOR (T1) xi (x, precision);
>   WIDE_INT_REF_FOR (T2) yi (y, precision);
>   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
>-  if (xi.len + yi.len == 2)
>+  if (__builtin_expect (xi.len + yi.len == 2, true))
> {
>   val[0] = xi.ulow () | ~yi.ulow ();
>   result.set_len (1, is_sign_extended);
>@@ -2212,7 +2212,7 @@ wi::bit_xor (const T1 &x, const T2 &y)
>   WIDE_INT_REF_FOR (T1) xi (x, precision);
>   WIDE_INT_REF_FOR (T2) yi (y, precision);
>   bool is_sign_extended = xi.is_sign_extended && yi.is_sign_extended;
>-  if (xi.len + yi.len == 2)
>+  if (__builtin_expect (xi.len + yi.len == 2, true))
> {
>   val[0] = xi.ulow () ^ yi.ulow ();
>   result.set_len (1, is_sign_extended);
>@@ -2248,7 +2248,7 @@ wi::add (const T1 &x, const T2 &y)
>  HOST_BITS_PER_WIDE_INT are relatively rare and there's not much
>  point handling them inline.  */
>   else if (STATIC_CONSTANT_P (precision > HOST_BITS_PER_WIDE_INT)
>- && xi.len + yi.len == 2)
>+ && __builtin_expect (xi.len + yi.len == 2, true))
> {
>   unsigned HOST_WIDE_INT xl = xi.ulow ();
>   unsigned HOST_WIDE_INT yl = yi.ulow ();
>@@ -2323,7 +2323,7 @@ wi::sub (const T1 &x, const T2 &y)
>  HOST_BITS_PER_WIDE_INT are relatively rare and there's not much
>  point handling them inline.  */
>   else if (STATIC_CONSTANT_P (precision > HOST_BITS_PER_WIDE_INT)
>- && xi.len + yi.len == 2)
>+ && __builtin_expect (xi.len + yi.len == 2, true))
> {
>   unsigned HOST_WIDE_INT xl = xi.ulow ();
>   unsigned HOST_WIDE_INT yl = yi.ulow ();




Re: [wide-int] Avoid some temporaries and use shifts more often

2013-11-30 Thread Richard Biener
Richard Sandiford  wrote:
>This started out as another attempt to find places where we had
>things like:
>
>   offset_int x = wi::to_offset (...);
>   x = ...x...;
>
>and change them to:
>
>   offset_int x = ...wi::to_offset (...)...;
>
>with the get_ref_base_and_extent case being the main one.
>But it turned out that some of them were also multiplying or
>dividing by BITS_PER_UNIT, so it ended up also being a patch to
>convert those to shifts.

Ok and yes please.

Thanks,
Richard.

>I didn't want to cut-&-paste the 3 : log2 (BITS_PER_UNIT) conditional
>yet more times, so I added a LOG2_BITS_PER_UNIT to defaults.h.  I can
>retrofit it to the existing code if that's OK at this stage.
>
>For insn-recog.ii this reduces the number of divmod_internal calls from
>7884858 to 369746.
>
>Thanks,
>Richard
>
>
>Index: gcc/ChangeLog.wide-int
>===
>--- gcc/ChangeLog.wide-int 2013-11-29 15:09:59.623293132 +
>+++ gcc/ChangeLog.wide-int 2013-11-29 15:11:48.611155898 +
>@@ -111,6 +111,7 @@
>   (stabstr_U): Use wide-int interfaces.
>   (dbxout_type): Update to use cst_fits_shwi_p.
>   * defaults.h
>+  (LOG2_BITS_PER_UNIT): Define.
>   (TARGET_SUPPORTS_WIDE_INT): Add default.
>   * dfp.c: Include wide-int.h.
>   (decimal_real_to_integer2): Use wide-int interfaces and rename to
>Index: gcc/alias.c
>===
>--- gcc/alias.c  2013-11-29 15:04:41.136142237 +
>+++ gcc/alias.c  2013-11-29 15:11:48.606155857 +
>@@ -2355,8 +2355,8 @@ adjust_offset_for_component_ref (tree x,
> 
>   offset_int woffset
>   = (wi::to_offset (xoffset)
>- + wi::udiv_trunc (wi::to_offset (DECL_FIELD_BIT_OFFSET (field)),
>-   BITS_PER_UNIT));
>+ + wi::lrshift (wi::to_offset (DECL_FIELD_BIT_OFFSET (field)),
>+LOG2_BITS_PER_UNIT));
>   if (!wi::fits_uhwi_p (woffset))
>   {
> *known_p = false;
>Index: gcc/defaults.h
>===
>--- gcc/defaults.h 2013-11-29 15:04:41.136142237 +
>+++ gcc/defaults.h 2013-11-29 15:11:48.606155857 +
>@@ -475,6 +475,14 @@ #define DWARF_TYPE_SIGNATURE_SIZE 8
> #define BITS_PER_UNIT 8
> #endif
> 
>+#if BITS_PER_UNIT == 8
>+#define LOG2_BITS_PER_UNIT 3
>+#elif BITS_PER_UNIT == 16
>+#define LOG2_BITS_PER_UNIT 4
>+#else
>+#error Unknown BITS_PER_UNIT
>+#endif
>+
> #ifndef BITS_PER_WORD
> #define BITS_PER_WORD (BITS_PER_UNIT * UNITS_PER_WORD)
> #endif
>Index: gcc/dwarf2out.c
>===
>--- gcc/dwarf2out.c 2013-11-29 15:04:41.136142237 +
>+++ gcc/dwarf2out.c 2013-11-29 15:40:56.188806688 +
>@@ -14930,7 +14930,7 @@ field_byte_offset (const_tree decl)
> object_offset_in_bits = bitpos_int;
> 
>   object_offset_in_bytes
>-= wi::udiv_trunc (object_offset_in_bits, BITS_PER_UNIT);
>+= wi::lrshift (object_offset_in_bits, LOG2_BITS_PER_UNIT);
>   return object_offset_in_bytes.to_shwi ();
> }
>
>Index: gcc/gimple-fold.c
>===
>--- gcc/gimple-fold.c  2013-11-29 15:04:41.136142237 +
>+++ gcc/gimple-fold.c  2013-11-29 15:41:17.425983303 +
>@@ -2926,7 +2926,6 @@ fold_nonarray_ctor_reference (tree type,
>   tree field_offset = DECL_FIELD_BIT_OFFSET (cfield);
>   tree field_size = DECL_SIZE (cfield);
>   offset_int bitoffset;
>-  offset_int byte_offset_cst = wi::to_offset (byte_offset);
>   offset_int bitoffset_end, access_end;
> 
>   /* Variable sized objects in static constructors makes no sense,
>@@ -2939,7 +2938,8 @@ fold_nonarray_ctor_reference (tree type,
> 
>   /* Compute bit offset of the field.  */
>   bitoffset = (wi::to_offset (field_offset)
>- + byte_offset_cst * BITS_PER_UNIT);
>+ + wi::lshift (wi::to_offset (byte_offset),
>+   LOG2_BITS_PER_UNIT));
>   /* Compute bit offset where the field ends.  */
>   if (field_size != NULL_TREE)
>   bitoffset_end = bitoffset + wi::to_offset (field_size);
>Index: gcc/gimple-ssa-strength-reduction.c
>===
>--- gcc/gimple-ssa-strength-reduction.c 2013-11-29 15:04:41.136142237 +
>+++ gcc/gimple-ssa-strength-reduction.c 2013-11-29 15:40:56.188806688 +
>@@ -897,7 +897,7 @@ restructure_reference (tree *pbase, tree
>   c2 = 0;
> }
> 
>-  c4 = wi::udiv_floor (index, BITS_PER_UNIT);
>+  c4 = wi::lrshift (index, LOG2_BITS_PER_UNIT);
>   c5 = backtrace_base_for_ref (&t2);
> 
>   *pbase = t1;
>Index: gcc/tree-dfa.c
>===
>--- gcc/tree-dfa.c 2013-11-29 15:04:41.136142237 +
>+++ gcc/tree-dfa.c 2013-11-29 15:41:39.513166464 +
>@@ -437,10 +437,8 

Re: [wide-int] Add a fast path for multiplication by 0

2013-11-30 Thread Richard Biener
Richard Sandiford  wrote:
>Richard Biener  writes:
>> On Fri, Nov 29, 2013 at 12:14 PM, Richard Sandiford
>>  wrote:
>>> In the fold-const.ii testcase, well over half of the mul_internal calls
>>> were for multiplication by 0 (106038 out of 169355).  This patch adds
>>> an early-out for that.
>>>
>>> Tested on x86_64-linux-gnu.  OK to install?
>>
>> Ok.  Did you check how many of the remaining are multiplies by 1?
>
>Turns out to be 9685, which is probably enough to justify a special
>case.
>
>Tested on x86_64-linux-gnu.  OK to install?

Ok.  I assume we already have a special case for division by 1?

Thanks,
Richard.

>Thanks,
>Richard
>
>
>Index: gcc/wide-int.cc
>===
>--- gcc/wide-int.cc 2013-11-29 15:04:41.177142418 +
>+++ gcc/wide-int.cc 2013-11-29 15:05:36.482424592 +
>@@ -1296,6 +1296,20 @@ wi::mul_internal (HOST_WIDE_INT *val, co
>   return 1;
> }
> 
>+  /* Handle multiplications by 1.  */
>+  if (op1len == 1 && op1[0] == 1)
>+{
>+  for (i = 0; i < op2len; i++)
>+  val[i] = op2[i];
>+  return op2len;
>+}
>+  if (op2len == 1 && op2[0] == 1)
>+{
>+  for (i = 0; i < op1len; i++)
>+  val[i] = op1[i];
>+  return op1len;
>+}
>+
>   /* If we need to check for overflow, we can only do half wide
>  multiplies quickly because we need to look at the top bits to
>  check for the overflow.  */




Re: [PING^2] [PATCH] PR59063

2013-11-30 Thread Andreas Schwab
Yury Gribov  writes:

> diff --git a/gcc/testsuite/lib/asan-dg.exp b/gcc/testsuite/lib/asan-dg.exp
> index e0bf2da..06122e2 100644
> --- a/gcc/testsuite/lib/asan-dg.exp
> +++ b/gcc/testsuite/lib/asan-dg.exp
> @@ -39,9 +39,9 @@ proc asan_link_flags { paths } {
>  set shlib_ext [get_shlib_extension]
>  
>  if { $gccpath != "" } {
> +   append flags " -B${gccpath}/libsanitizer/asan/ "
>if { [file exists "${gccpath}/libsanitizer/asan/.libs/libasan.a"]
>  || [file exists 
> "${gccpath}/libsanitizer/asan/.libs/libasan.${shlib_ext}"] } {
> -   append flags " -B${gccpath}/libsanitizer/asan/ "
> append flags " -L${gccpath}/libsanitizer/asan/.libs "
> append ld_library_path ":${gccpath}/libsanitizer/asan/.libs"
>}
> diff --git a/gcc/testsuite/lib/ubsan-dg.exp b/gcc/testsuite/lib/ubsan-dg.exp
> index 4ec5fdf..b7f2b17 100644
> --- a/gcc/testsuite/lib/ubsan-dg.exp
> +++ b/gcc/testsuite/lib/ubsan-dg.exp
> @@ -30,9 +30,9 @@ proc ubsan_link_flags { paths } {
>  set shlib_ext [get_shlib_extension]
>  
>  if { $gccpath != "" } {
> +   append flags " -B${gccpath}/libsanitizer/ubsan/ "
>if { [file exists "${gccpath}/libsanitizer/ubsan/.libs/libubsan.a"]
>  || [file exists 
> "${gccpath}/libsanitizer/ubsan/.libs/libubsan.${shlib_ext}"] } {
> -   append flags " -B${gccpath}/libsanitizer/ubsan/ "
> append flags " -L${gccpath}/libsanitizer/ubsan/.libs"
> append ld_library_path ":${gccpath}/libsanitizer/ubsan/.libs"
>}

This causes all the tests to be run on all targets, even where
libsanitizer is not supported, and most of them fail with link errors.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [WWWDOCS] Document IPA/LTO/FDO/i386 changes in GCC-4.9

2013-11-30 Thread Gerald Pfeifer
On Thu, 28 Nov 2013, Jan Hubicka wrote:
> We previously renamed every static function foo into foo.1234 (just as a 
> precaution because another compilation unit may also have a function foo). 
> This confuses many things, so now we do the renaming only when we see a 
> conflict.

Ah, I see.  Thanks.

> 
> + Because -fno-fat-lto-objects is now by default,

I assume you mean "now on by default" or "now enabled by default"?

> + gcc-ar and gcc-nm wrappers needs

The...wrappers

needs -> need

Fine with those tweaks.

Gerald

PS: I applied the following fix on top of your last commit.

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v
retrieving revision 1.42
diff -u -3 -p -r1.42 changes.html
--- changes.html 29 Nov 2013 00:46:55 -  1.42
+++ changes.html 30 Nov 2013 16:36:44 -
@@ -46,7 +46,7 @@
 Link-time optimization (LTO) improvements:
 
   Type merging was rewritten. The new implementation is significantly 
faster
- and uses less memory. 
+  and uses less memory.
   Better partitioning algorithm resulting in less streaming during
  link time.
   Early removal of virtual methods reduces the size of object files and
@@ -70,7 +70,7 @@
   Local aliases are introduced for symbols that are known to be
  semantically equivalent across shared libraries improving dynamic
  linking times.
-
+
 Feedback directed optimization improvements:
 
   Profiling of programs using C++ inline functions is now more 
reliable.


Re: [ping] [patch] contrib/config-list.mk: Allow to build all targets individually

2013-11-30 Thread Michael Eager

On 11/26/13 17:43, Jan-Benedict Glaw wrote:

On Sun, 2013-11-24 20:02:43 +0100, Jan-Benedict Glaw  wrote:

2013-11-24  Jan-Benedict Glaw  

* config-list.mk (host_options): Allow to override it.
(LIST): Change "=" to "EQUAL".
(list): New target listing all configurations.
($(LIST)): Substitute "EQUAL" back to "=".


Ping: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03121.html

   In addition to that, I'd suggest also adding microblazeel-elf and
microblaze-rtems (cf. http://gcc.gnu.org/ml/gcc/2013-11/msg00547.html
and http://gcc.gnu.org/ml/gcc/2013-11/msg00545.html), though Joern
isn't fond of the idea (cf.
http://gcc.gnu.org/ml/gcc/2013-11/msg00528.html). So I'd quite like to
see a discussion about this.


I have no objections to adding the two targets.
I think that microblaze-rtems will duplicate microblaze-elf.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


libgo patch committed: Fix 386 MakeFunc when returning struct

2013-11-30 Thread Ian Lance Taylor
On 386 when a function returns a struct the pointer to the return value
is passed as a hidden first parameter, and the function is supposed to
"ret 4" to pop the hidden parameter when returning to the caller.  The
implementation of reflect.MakeFunc in libgo was not doing that.  This
patch fixes the problem.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu, in both 32-bit and 64-bit mode.  Committed to
mainline and 4.8 branch.

Ian

diff -r fa6c22b293e8 libgo/go/reflect/makefunc_386.S
--- a/libgo/go/reflect/makefunc_386.S	Tue Nov 26 16:49:31 2013 -0800
+++ b/libgo/go/reflect/makefunc_386.S	Sat Nov 30 09:05:42 2013 -0800
@@ -26,8 +26,11 @@
 	 esp uint32		// 0x0
 	 eax uint32		// 0x4
 	 st0 uint64		// 0x8
+	 rs  int32		// 0x10
 	   }
-	*/
+	   The rs field is set by the function to a non-zero value if
+	   the function takes a struct hidden pointer that must be
+	   popped off the stack.  */
 
 	pushl	%ebp
 .LCFI0:
@@ -73,12 +76,19 @@
 	movsd	-16(%ebp), %xmm0
 #endif
 
+	movl	-8(%ebp), %edx
+
 	addl	$36, %esp
 	popl	%ebx
 .LCFI3:
 	popl	%ebp
 .LCFI4:
+
+	testl	%edx,%edx
+	jne	1f
 	ret
+1:
+	ret	$4
 .LFE1:
 #ifdef __ELF__
 	.size	reflect.makeFuncStub, . - reflect.makeFuncStub
diff -r fa6c22b293e8 libgo/go/reflect/makefuncgo_386.go
--- a/libgo/go/reflect/makefuncgo_386.go	Tue Nov 26 16:49:31 2013 -0800
+++ b/libgo/go/reflect/makefuncgo_386.go	Sat Nov 30 09:05:42 2013 -0800
@@ -16,6 +16,7 @@
 	esp uint32
 	eax uint32 // Value to return in %eax.
 	st0 uint64 // Value to return in %st(0).
+	sr  int32  // Set to non-zero if hidden struct pointer.
 }
 
 // MakeFuncStubGo implements the 386 calling convention for MakeFunc.
@@ -56,10 +57,12 @@
 	in := make([]Value, 0, len(ftyp.in))
 	ap := uintptr(regs.esp)
 
+	regs.sr = 0
 	var retPtr unsafe.Pointer
 	if retStruct {
 		retPtr = *(*unsafe.Pointer)(unsafe.Pointer(ap))
 		ap += ptrSize
+		regs.sr = 1
 	}
 
 	for _, rt := range ftyp.in {


[patch] Fix failure of ACATS c52102c

2013-11-30 Thread Eric Botcazou
Hi,

this test started to fail very recently on 32-bit platforms with 64-bit HWI.
Not sure exactly why, but the issue is straightforward and was latent.

For the following reference, a call to ao_ref_init_from_ptr_and_size yields:

(gdb) p debug_generic_expr((tree_node *) 0x76e01200)
&a[0 ...]{lb: 4294967292 sz: 4}
(gdb) p debug_generic_expr(size)
20
(gdb) p dref
$36 = {ref = 0x0, base = 0x76dfd260, offset = -137438953344, size = 160, 
  max_size = 160, ref_alias_set = 0, base_alias_set = 0, volatile_p = false}

The offset is bogus.  'a' is an array with lower bound -4 so {lb: 4294967292 
sz: 4} is actually {lb: -4 sz: 4}.  The computation of the offset goes wrong 
in get_addr_base_and_unit_offset_1 because it is not done in sizetype.

Fixed by copying the relevant bits from get_ref_base_and_extent, where the 
computation is correctly done in sizetype.

Tested on x86_64-suse-linux, OK for the mainline?


2013-11-30  Eric Botcazou  

* tree-dfa.h (get_addr_base_and_unit_offset_1) : Do the
offset computation using the precision of the index type.


2013-11-30  Eric Botcazou  

* gnat.dg/opt30.adb: New test.


-- 
Eric Botcazou
Index: tree-dfa.h
===
--- tree-dfa.h	(revision 205547)
+++ tree-dfa.h	(working copy)
@@ -102,11 +102,11 @@ get_addr_base_and_unit_offset_1 (tree ex
 		&& (unit_size = array_ref_element_size (exp),
 		TREE_CODE (unit_size) == INTEGER_CST))
 	  {
-		HOST_WIDE_INT hindex = TREE_INT_CST_LOW (index);
-
-		hindex -= TREE_INT_CST_LOW (low_bound);
-		hindex *= TREE_INT_CST_LOW (unit_size);
-		byte_offset += hindex;
+		double_int doffset
+		  = (TREE_INT_CST (index) - TREE_INT_CST (low_bound))
+		.sext (TYPE_PRECISION (TREE_TYPE (index)));
+		doffset *= tree_to_double_int (unit_size);
+		byte_offset += doffset.to_shwi ();
 	  }
 	else
 	  return NULL_TREE;
-- { dg-do run }
-- { dg-options "-O" }

procedure Opt30 is

   function Id_I (I : Integer) return Integer is
   begin
  return I;
   end;

   A : array (Integer range -4..4) of Integer;

begin
   A := (-ID_I(4), -ID_I(3), -ID_I(2), -ID_I(1), ID_I(100),
  ID_I(1), ID_I(2), ID_I(3), ID_I(4));
   A(-4..0) := A(0..4);
   if A /= (100, 1, 2, 3, 4, 1, 2, 3, 4) then
  raise Program_Error;
   end if;
end;


Re: libgo patch committed: Fix 386 MakeFunc when returning struct

2013-11-30 Thread Andreas Schwab
Ian Lance Taylor  writes:

> diff -r fa6c22b293e8 libgo/go/reflect/makefunc_386.S
> --- a/libgo/go/reflect/makefunc_386.S Tue Nov 26 16:49:31 2013 -0800
> +++ b/libgo/go/reflect/makefunc_386.S Sat Nov 30 09:05:42 2013 -0800
> @@ -26,8 +26,11 @@
>esp uint32 // 0x0
>eax uint32 // 0x4
>st0 uint64 // 0x8
> +  rs  int32  // 0x10

rs ...

> diff -r fa6c22b293e8 libgo/go/reflect/makefuncgo_386.go
> --- a/libgo/go/reflect/makefuncgo_386.go  Tue Nov 26 16:49:31 2013 -0800
> +++ b/libgo/go/reflect/makefuncgo_386.go  Sat Nov 30 09:05:42 2013 -0800
> @@ -16,6 +16,7 @@
>   esp uint32
>   eax uint32 // Value to return in %eax.
>   st0 uint64 // Value to return in %st(0).
> + sr  int32  // Set to non-zero if hidden struct pointer.

... vs. sr.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Backport libbacktrace fix to GCC 4.8 branch

2013-11-30 Thread Ian Lance Taylor
I backported the libbacktrace fix in
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01445.html to the GCC 4.8
branch.  Bootstrapped and ran libbacktrace testsuite on
x86_64-unknown-linux-gnu.  Committed to 4.8 branch.

Ian


2013-11-30  Ian Lance Taylor  

Backport from mainline:
2013-10-17  Ian Lance Taylor  

* elf.c (elf_add): Don't get the wrong offsets if a debug section
is missing.


Index: elf.c
===
--- elf.c	(revision 205552)
+++ elf.c	(working copy)
@@ -725,6 +725,8 @@ elf_add (struct backtrace_state *state,
 {
   off_t end;
 
+  if (sections[i].size == 0)
+	continue;
   if (min_offset == 0 || sections[i].offset < min_offset)
 	min_offset = sections[i].offset;
   end = sections[i].offset + sections[i].size;
@@ -751,8 +753,13 @@ elf_add (struct backtrace_state *state,
   descriptor = -1;
 
   for (i = 0; i < (int) DEBUG_MAX; ++i)
-sections[i].data = ((const unsigned char *) debug_view.data
-			+ (sections[i].offset - min_offset));
+{
+  if (sections[i].size == 0)
+	sections[i].data = NULL;
+  else
+	sections[i].data = ((const unsigned char *) debug_view.data
+			+ (sections[i].offset - min_offset));
+}
 
   if (!backtrace_dwarf_add (state, base_address,
 			sections[DEBUG_INFO].data,


Re: libgo patch committed: Fix 386 MakeFunc when returning struct

2013-11-30 Thread Ian Lance Taylor
On Sat, Nov 30, 2013 at 9:54 AM, Andreas Schwab  wrote:
> Ian Lance Taylor  writes:
>
>> diff -r fa6c22b293e8 libgo/go/reflect/makefunc_386.S
>> --- a/libgo/go/reflect/makefunc_386.S Tue Nov 26 16:49:31 2013 -0800
>> +++ b/libgo/go/reflect/makefunc_386.S Sat Nov 30 09:05:42 2013 -0800
>> @@ -26,8 +26,11 @@
>>esp uint32 // 0x0
>>eax uint32 // 0x4
>>st0 uint64 // 0x8
>> +  rs  int32  // 0x10
>
> rs ...
>
>> diff -r fa6c22b293e8 libgo/go/reflect/makefuncgo_386.go
>> --- a/libgo/go/reflect/makefuncgo_386.go  Tue Nov 26 16:49:31 2013 -0800
>> +++ b/libgo/go/reflect/makefuncgo_386.go  Sat Nov 30 09:05:42 2013 -0800
>> @@ -16,6 +16,7 @@
>>   esp uint32
>>   eax uint32 // Value to return in %eax.
>>   st0 uint64 // Value to return in %st(0).
>> + sr  int32  // Set to non-zero if hidden struct pointer.
>
> ... vs. sr.

Thanks.  Fixed.

Ian
diff -r b9fc602e9b17 libgo/go/reflect/makefunc_386.S
--- a/libgo/go/reflect/makefunc_386.S	Sat Nov 30 09:13:14 2013 -0800
+++ b/libgo/go/reflect/makefunc_386.S	Sat Nov 30 10:07:25 2013 -0800
@@ -26,9 +26,9 @@
 	 esp uint32		// 0x0
 	 eax uint32		// 0x4
 	 st0 uint64		// 0x8
-	 rs  int32		// 0x10
+	 sr  int32		// 0x10
 	   }
-	   The rs field is set by the function to a non-zero value if
+	   The sr field is set by the function to a non-zero value if
 	   the function takes a struct hidden pointer that must be
 	   popped off the stack.  */
 


[GOMP4] SIMD enabled function for C/C++

2013-11-30 Thread Iyer, Balaji V
Hello Jakub,
I was looking at my elemental function for C patch that I fixed up and 
sent as requested by Aldy, and I saw two changes there that apply to both C and 
C++ and are pretty obvious. Here are the changes. Can I just commit them?

Thanks,

Balaji V. Iyer.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 205457)
+++ gcc/config/i386/i386.c  (working copy)
@@ -43701,7 +43701,7 @@
  || (clonei->simdlen & (clonei->simdlen - 1)) != 0))
 {
   warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
- "unsupported simdlen %d\n", clonei->simdlen);
+ "unsupported simdlen %d", clonei->simdlen);
   return 0;
 }

Index: gcc/omp-low.c
===
--- gcc/omp-low.c   (revision 205457)
+++ gcc/omp-low.c   (working copy)
@@ -12248,6 +12248,9 @@

   tree attr = lookup_attribute ("omp declare simd",
DECL_ATTRIBUTES (node->decl));
+  if (!attr)
+attr = lookup_attribute ("cilk plus elemental",
+DECL_ATTRIBUTES (node->decl));
   if (!attr || targetm.simd_clone.compute_vecsize_and_simdlen == NULL)
 return;
   /* Ignore


Here are the ChangeLogs:

2013-11-30  Balaji V. Iyer  

* config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
Removed a newline from the warning string.
* omp-low.c (simd_clone_clauses_extract): Added a check for cilk plus
SIMD-enabled function attributes.




[Patch, fortran] PR58410 - [4.8/4.9 Regression] Bogus uninitialized variable warning for allocatable derived type array function result

2013-11-30 Thread Paul Richard Thomas
Dear All,

This turned out to be a valid uninitialized variable warning.
However, it was unlikely ever to cause problems at run-time.
Nonetheless, here is the fix. I am disinclined to load the testsuite
with a test for a fix that is so specific and localized that it simply
will not break.  However, if reviewers think otherwise, I can easily add
the original testcase.

Bootstrapped and regtested on FC17/x86_64 - OK for trunk and 4.8?

Cheers

Paul

2013-11-30  Paul Thomas  

PR fortran/58410
* trans-array.c (gfc_alloc_allocatable_for_assignment): Do not
use the array bounds of an unallocated array but set its size
to zero instead.
Index: gcc/fortran/trans-array.c
===
*** gcc/fortran/trans-array.c   (revision 205031)
--- gcc/fortran/trans-array.c   (working copy)
*** gfc_alloc_allocatable_for_assignment (gf
*** 8068,8073 
--- 8076,8082 
tree size1;
tree size2;
tree array1;
+   tree cond_null;
tree cond;
tree tmp;
tree tmp2;
*** gfc_alloc_allocatable_for_assignment (gf
*** 8143,8151 
jump_label2 = gfc_build_label_decl (NULL_TREE);

/* Allocate if data is NULL.  */
!   cond = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
 array1, build_int_cst (TREE_TYPE (array1), 0));
!   tmp = build3_v (COND_EXPR, cond,
  build1_v (GOTO_EXPR, jump_label1),
  build_empty_stmt (input_location));
gfc_add_expr_to_block (&fblock, tmp);
--- 8152,8160 
jump_label2 = gfc_build_label_decl (NULL_TREE);

/* Allocate if data is NULL.  */
!   cond_null = fold_build2_loc (input_location, EQ_EXPR, boolean_type_node,
   array1, build_int_cst (TREE_TYPE (array1), 0));
!   tmp = build3_v (COND_EXPR, cond_null,
  build1_v (GOTO_EXPR, jump_label1),
  build_empty_stmt (input_location));
gfc_add_expr_to_block (&fblock, tmp);
*** gfc_alloc_allocatable_for_assignment (gf
*** 8197,8209 
tmp = build1_v (LABEL_EXPR, jump_label1);
gfc_add_expr_to_block (&fblock, tmp);

!   size1 = gfc_conv_descriptor_size (desc, expr1->rank);

!   /* Get the rhs size.  Fix both sizes.  */
if (expr2)
  desc2 = rss->info->data.array.descriptor;
else
  desc2 = NULL_TREE;
size2 = gfc_index_one_node;
for (n = 0; n < expr2->rank; n++)
  {
--- 8206,8230 
tmp = build1_v (LABEL_EXPR, jump_label1);
gfc_add_expr_to_block (&fblock, tmp);

!   /* If the lhs has not been allocated, its bounds will not have been
!  initialized and so its size is set to zero.  */
!   size1 = gfc_create_var (gfc_array_index_type, NULL);
!   gfc_init_block (&alloc_block);
!   gfc_add_modify (&alloc_block, size1, gfc_index_zero_node);
!   gfc_init_block (&realloc_block);
!   gfc_add_modify (&realloc_block, size1,
! gfc_conv_descriptor_size (desc, expr1->rank));
!   tmp = build3_v (COND_EXPR, cond_null,
! gfc_finish_block (&alloc_block),
! gfc_finish_block (&realloc_block));
!   gfc_add_expr_to_block (&fblock, tmp);

!   /* Get the rhs size and fix it.  */
if (expr2)
  desc2 = rss->info->data.array.descriptor;
else
  desc2 = NULL_TREE;
+
size2 = gfc_index_one_node;
for (n = 0; n < expr2->rank; n++)
  {
*** gfc_alloc_allocatable_for_assignment (gf
*** 8217,8224 
   gfc_array_index_type,
   tmp, size2);
  }
-
-   size1 = gfc_evaluate_now (size1, &fblock);
size2 = gfc_evaluate_now (size2, &fblock);

cond = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
--- 8238,8243 


[Patch, fortran] PR34547 - [4.8/4.9 regression] NULL(): Fortran 2003 changes, accepts invalid, ICE on invalid

2013-11-30 Thread Paul Richard Thomas
Dear All,

This one is trivial.  NULL(...) is simply out of context in a transfer
statement.

Bootstrapped and regtested on FC17/x86_64.  OK for trunk and 4.8?

Cheers

Paul

2013-11-30  Paul Thomas  

PR fortran/34547
* resolve.c (resolve_transfer): EXPR_NULL is always in an
invalid context in a transfer statement.

2013-11-30  Paul Thomas  

PR fortran/34547
* gfortran.dg/null_5.f90 : Include new error.
* gfortran.dg/null_6.f90 : Include new error.
Index: gcc/fortran/resolve.c
===
*** gcc/fortran/resolve.c   (revision 205031)
--- gcc/fortran/resolve.c   (working copy)
*** resolve_transfer (gfc_code *code)
*** 8247,8256 
 && exp->value.op.op == INTRINSIC_PARENTHESES)
  exp = exp->value.op.op1;
  
!   if (exp && exp->expr_type == EXPR_NULL && exp->ts.type == BT_UNKNOWN)
  {
!   gfc_error ("NULL intrinsic at %L in data transfer statement requires "
!"MOLD=", &exp->where);
return;
  }
  
--- 8247,8257 
 && exp->value.op.op == INTRINSIC_PARENTHESES)
  exp = exp->value.op.op1;
  
!   if (exp && exp->expr_type == EXPR_NULL
!   && code->ext.dt)
  {
!   gfc_error ("Invalid context for NULL () intrinsic at %L",
!&exp->where);
return;
  }
  
Index: gcc/testsuite/gfortran.dg/null_5.f90
===
*** gcc/testsuite/gfortran.dg/null_5.f90(revision 205031)
--- gcc/testsuite/gfortran.dg/null_5.f90(working copy)
*** subroutine test_PR34547_1 ()
*** 34,40 
  end subroutine test_PR34547_1
  
  subroutine test_PR34547_2 ()
!   print *, null () ! { dg-error "in data transfer statement requires MOLD" }
  end subroutine test_PR34547_2
  
  subroutine test_PR34547_3 ()
--- 34,40 
  end subroutine test_PR34547_1
  
  subroutine test_PR34547_2 ()
!   print *, null () ! { dg-error "Invalid context" }
  end subroutine test_PR34547_2
  
  subroutine test_PR34547_3 ()
Index: gcc/testsuite/gfortran.dg/null_6.f90
===
*** gcc/testsuite/gfortran.dg/null_6.f90(revision 205031)
--- gcc/testsuite/gfortran.dg/null_6.f90(working copy)
*** end subroutine test_PR50375_2
*** 30,34 
  
  subroutine test_PR34547_3 ()
integer, allocatable :: i(:)
!   print *, NULL(i)
  end subroutine test_PR34547_3
--- 30,34 
  
  subroutine test_PR34547_3 ()
integer, allocatable :: i(:)
!   print *, NULL(i)! { dg-error "Invalid context for NULL" }
  end subroutine test_PR34547_3


[Patch, fortran] PR57354 - Wrong run-time assignment of allocatable array of derived type with allocatable component

2013-11-30 Thread Paul Richard Thomas
Dear All,

This is a partial fix for this problem in that it generates a
temporary to provide a correct assignment but then goes on to do an
unnecessary reallocation of the lhs.  That is to say, the temporary
could be taken over by the array descriptor.  At the moment, I could
not see a good way to do this.  I propose to change the PR to reflect
this.  I will keep the PR open and will have another go at suppressing
the reallocation in a few weeks' time.

Bootstrapped and regtested on FC17/x86_64 - OK for trunk?

Cheers

Paul


2013-11-30  Paul Thomas  

PR fortran/57354
* trans-array.c (gfc_conv_resolve_dependencies): For other than
SS_SECTION, do a dependency check if the lhs is liable to be
reallocated.

2013-11-30  Paul Thomas  

PR fortran/57354
* gfortran.dg/realloc_on_assign_23.f90 : New test
Index: gcc/fortran/trans-array.c
===
*** gcc/fortran/trans-array.c   (revision 205031)
--- gcc/fortran/trans-array.c   (working copy)
*** gfc_conv_resolve_dependencies (gfc_loopi
*** 4335,4344 
  
for (ss = rss; ss != gfc_ss_terminator; ss = ss->next)
  {
if (ss->info->type != GFC_SS_SECTION)
!   continue;
  
!   ss_expr = ss->info->expr;
  
if (dest_expr->symtree->n.sym != ss_expr->symtree->n.sym)
{
--- 4335,4352 
  
for (ss = rss; ss != gfc_ss_terminator; ss = ss->next)
  {
+   ss_expr = ss->info->expr;
+ 
if (ss->info->type != GFC_SS_SECTION)
!   {
! if (gfc_option.flag_realloc_lhs
! && dest_expr != ss_expr
! && gfc_is_reallocatable_lhs (dest_expr)
! && ss_expr->rank)
!   nDepend = gfc_check_dependency (dest_expr, ss_expr, true);
  
! continue;
!   }
  
if (dest_expr->symtree->n.sym != ss_expr->symtree->n.sym)
{
Index: gcc/testsuite/gfortran.dg/realloc_on_assign_23.f90
===
*** gcc/testsuite/gfortran.dg/realloc_on_assign_23.f90  (revision 0)
--- gcc/testsuite/gfortran.dg/realloc_on_assign_23.f90  (working copy)
***
*** 0 
--- 1,30 
+ ! { dg-do run }
+ !
+ ! PR fortran/57354
+ !
+ ! Contributed by Vladimir Fuka  
+ !
+   type t
+ integer,allocatable :: i
+   end type
+ 
+   type(t) :: e
+   type(t), allocatable :: a(:)
+   integer :: chksum = 0
+ 
+   do i=1,3   ! Was 100 in original
+ e%i = i
+ chksum = chksum + i
+ if (.not.allocated(a)) then
+   a = [e]
+ else
+   call foo
+ end if
+   end do
+ 
+   if (sum ([(a(i)%i, i=1,size(a))]) .ne. chksum) call abort
+ contains
+   subroutine foo
+ a = [a, e]
+   end subroutine
+ end


RE: [GOMP4] SIMD enabled function for C/C++

2013-11-30 Thread Iyer, Balaji V
Hi Jakub,
Well, it turns out that I need to make a couple more changes than that 
one change in omp-low.c.  So, please ignore that.  I will check in the change to 
i386.c as obvious, since all it involves is removing a '\n' from the warning string.

Thanks,

Balaji V. Iyer.

> -Original Message-
> From: Iyer, Balaji V
> Sent: Saturday, November 30, 2013 1:16 PM
> To: Jakub Jelinek
> Cc: Aldy Hernandez (al...@redhat.com); 'gcc-patches@gcc.gnu.org'
> Subject: [GOMP4] SIMD enabled function for C/C++
> 
> Hello Jakub,
>   I was looking at my elemental function for C patch that I fixed up and
> sent as requested by Aldy, and I saw two changes there that were used for C
> and C++ and they were pretty obvious. Here are the changes. Can I just
> commit them?
> 
> Thanks,
> 
> Balaji V. Iyer.
> 
> Index: gcc/config/i386/i386.c
> ===
> --- gcc/config/i386/i386.c  (revision 205457)
> +++ gcc/config/i386/i386.c  (working copy)
> @@ -43701,7 +43701,7 @@
>   || (clonei->simdlen & (clonei->simdlen - 1)) != 0))
>  {
>warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
> - "unsupported simdlen %d\n", clonei->simdlen);
> + "unsupported simdlen %d", clonei->simdlen);
>return 0;
>  }
> 
> Index: gcc/omp-low.c
> ===
> --- gcc/omp-low.c   (revision 205457)
> +++ gcc/omp-low.c   (working copy)
> @@ -12248,6 +12248,9 @@
> 
>tree attr = lookup_attribute ("omp declare simd",
> DECL_ATTRIBUTES (node->decl));
> +  if (!attr)
> +attr = lookup_attribute ("cilk plus elemental",
> +DECL_ATTRIBUTES (node->decl));
>if (!attr || targetm.simd_clone.compute_vecsize_and_simdlen == NULL)
>  return;
>/* Ignore
> 
> 
> Here are the ChangeLogs:
> 
> 2013-11-30  Balaji V. Iyer  
> 
> * config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
> Removed a newline from the warning string.
> * omp-low.c (simd_clone_clauses_extract): Added a check for cilk plus
> SIMD-enabled function attributes.
> 



RE: [PING]: [GOMP4] [PATCH] SIMD-Enabled Functions (formerly Elemental functions) for C

2013-11-30 Thread Iyer, Balaji V
Hello Aldy,
Some of the middle-end changes I made in the previous patch were not 
working for C++. Here is a fixed patch where the middle-end changes will 
work for both C and C++.
With this email, I am attaching the patch for C along with the middle 
end changes. Is this Ok for the branch?

Here are the ChangeLog entries:
gcc/ChangeLog
2013-11-30  Balaji V. Iyer  

* omp-low.c (expand_simd_clones): Added a new parameter called "type."
(ipa_omp_simd_clone): Added a call to expand_simd_clones when Cilk Plus
is enabled.

gcc/c-family/ChangeLog
2013-11-30  Balaji V. Iyer  

* c-common.c (c_common_attribute_table): Added "cilk plus elemental"
attribute.

gcc/c/ChangeLog
2013-11-30  Balaji V. Iyer  

* c-parser.c (struct c_parser::elem_fn_tokens): Added new field.
(c_parser_declaration_or_fndef): Added a check if elem_fn_tokens
field in parser is not empty.  If not-empty, call the function
c_parser_finish_omp_declare_simd.
(c_parser_elem_fn_vectorlength): New function.
(c_parser_elem_fn_expr_list): Likewise.
(c_finish_elem_fn_tokens): Likewise.
(c_parser_attributes): Added a elem_fn_tokens parameter.  Added a
check for vector attribute and if so call c_parser_elem_fn_expr_list.
Also, called c_finish_elem_fn_tokens when Cilk Plus is enabled.
(c_finish_omp_declare_simd): Added a check if elem_fn_tokens in
parser field is non-empty.  If so, parse them as you would parse
the omp declare simd pragma.

gcc/testsuite/ChangeLog
2013-11-30  Balaji V. Iyer  

* c-c++-common/cilk-plus/EF/ef_test.c: New test.
* c-c++-common/cilk-plus/EF/ef_test2.c: Likewise.
* c-c++-common/cilk-plus/EF/vlength_errors.c: Likewise.
* c-c++-common/cilk-plus/EF/ef_error.c: Likewise.
* c-c++-common/cilk-plus/EF/ef_error2.c: Likewise.
* gcc.dg/cilk-plus/cilk-plus.exp: Added calls for the above tests.

Thanks,

Balaji V. Iyer.


> -Original Message-
> From: Iyer, Balaji V
> Sent: Wednesday, November 27, 2013 1:15 PM
> To: al...@redhat.com
> Cc: Jakub Jelinek; gcc-patches@gcc.gnu.org
> Subject: RE: [PING]: [GOMP4] [PATCH] SIMD-Enabled Functions (formerly
> Elemental functions) for C
> 
> Hi Aldy and Jakub,
>   Attached, please find a fixed patch. I have made all the changes you
> mentioned below. Is this OK to install?
> 
> Here are the ChangeLog entries:
> gcc/ChangeLog
> 2013-11-27  Balaji V. Iyer  
> 
> * config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
> Removed a newline from the warning string.
> * omp-low.c (simd_clone_clauses_extract): Added a check for cilk plus
> SIMD-enabled function attributes.
> 
> gcc/c/ChangeLog
> 2013-11-27  Balaji V. Iyer  
> 
> * c-parser.c (struct c_parser::elem_fn_tokens): Added new field.
> (c_parser_declaration_or_fndef): Added a check if elem_fn_tokens
> field in parser is not empty.  If not-empty, call the function
> c_parser_finish_omp_declare_simd.
> (c_parser_elem_fn_vectorlength): New function.
> (c_parser_elem_fn_expr_list): Likewise.
> (c_finish_elem_fn_tokens): Likewise.
> (c_parser_attributes): Added a elem_fn_tokens parameter.  Added a
> check for vector attribute and if so call c_parser_elem_fn_expr_list.
> Also, called c_finish_elem_fn_tokens when Cilk Plus is enabled.
> (c_finish_omp_declare_simd): Added a check if elem_fn_tokens in
> parser field is non-empty.  If so, parse them as you would parse
> the omp declare simd pragma.
> 
> gcc/testsuite/ChangeLog
> 2013-11-27  Balaji V. Iyer  
> 
> * c-c++-common/cilk-plus/EF/ef_test.c: New test.
> * c-c++-common/cilk-plus/EF/ef_test2.c: Likewise.
> * c-c++-common/cilk-plus/EF/vlength_errors.c: Likewise.
> * c-c++-common/cilk-plus/EF/ef_error.c: Likewise.
> * c-c++-common/cilk-plus/EF/ef_error2.c: Likewise.
> * gcc.dg/cilk-plus/cilk-plus.exp: Added calls for the above tests.
> 
> 
> Thanks,
> 
> Balaji V. Iyer.
> 
> > -Original Message-
> > From: Aldy Hernandez [mailto:al...@redhat.com]
> > Sent: Wednesday, November 27, 2013 10:52 AM
> > To: Iyer, Balaji V
> > Cc: Jakub Jelinek; gcc-patches@gcc.gnu.org
> > Subject: Re: [PING]: [GOMP4] [PATCH] SIMD-Enabled Functions (formerly
> > Elemental functions) for C
> >
> > "Iyer, Balaji V"  writes:
> >
> > >  c_finish_omp_declare_simd (c_parser *parser, tree fndecl, tree parms,
> > >  vec clauses)
> > >  {
> > > +
> > > +  if (flag_enable_cilkplus
> > > +  && clauses.exists () && !vec_safe_is_empty (parser-
> > >elem_fn_tokens))
> > > +{
> > > > +  error ("%<#pragma omp declare simd%> cannot be used in the same "
> > > > +  "function marked as a SIMD-enabled function");
> > > +  vec_free (parser->elem_fn_tokens);
> > > +  return;
> > 

RE: [GOMP4][PATCH] SIMD-enabled functions (formerly Elemental functions) for C++

2013-11-30 Thread Iyer, Balaji V
Hello Everyone,
The changes mentioned in 
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03506.html are also applicable to 
my C++ patch. With this email, I am attaching a fixed patch.

Here are the ChangeLog entries:

gcc/cp/ChangeLog
2013-11-30  Balaji V. Iyer  

* decl2.c (is_late_template_attribute): Added a check for SIMD-enabled
functions attribute.  If found, return true.
* parser.c (cp_parser_direct_declarator): When Cilk Plus is enabled
see if there is an attribute after function decl.  If so, then
parse them now.
(cp_parser_late_return_type_opt): Handle late parsing of Cilk Plus
SIMD-enabled functions.
(cp_parser_gnu_attribute_list): Parse all the tokens for the vector
attribute for a SIMD-enabled function.
(cp_parser_omp_all_clauses): Skip parsing to the end of pragma when
the function is used by SIMD-enabled function (indicated by NULL
pragma token).
(cp_parser_elem_fn_vectorlength): New function.
(cp_parser_elem_fn_expr_list): Likewise.
(cp_parser_late_parsing_elem_fn_info): Likewise.
* parser.h (cp_parser::elem_fn_info): New field.
* decl.c (grokfndecl): Added a check if Cilk Plus is enabled and
if so, adjust the Cilk Plus SIMD-enabled function attributes.


gcc/testsuite/ChangeLog
2013-11-30  Balaji V. Iyer  

* g++.dg/cilk-plus/cilk-plus.exp: Called the C/C++ common tests for
SIMD enabled function.
* g++.dg/cilk-plus/ef_test.C: New test.

Is this OK for branch?

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Iyer, Balaji V
> Sent: Wednesday, November 20, 2013 6:19 PM
> To: Jakub Jelinek
> Cc: Aldy Hernandez (al...@redhat.com); Jeff Law; gcc-patches@gcc.gnu.org
> Subject: [GOMP4][PATCH] SIMD-enabled functions (formerly Elemental
> functions) for C++
> 
> Hello Everyone,
>   Attached, please find a patch that will implement SIMD-enabled
> functions for C++ targeting the gomp-4_0-branch. Here are the Changelog
> entries. Is this OK to install?
> 
> gcc/cp/ChangeLog
> 2013-11-20  Balaji V. Iyer  
> 
> * parser.c (cp_parser_direct_declarator): When Cilk Plus is enabled
> see if there is an attribute after function decl.  If so, then
> parse them now.
> (cp_parser_late_return_type_opt): Handle late parsing of Cilk Plus
> SIMD-enabled functions.
> (cp_parser_gnu_attribute_list): Parse all the tokens for the vector
> attribute for a SIMD-enabled function.
> (cp_parser_omp_all_clauses): Skip parsing to the end of pragma when
> the function is used by SIMD-enabled function (indicated by NULL
> pragma token).
> (cp_parser_elem_fn_vectorlength): New function.
> (cp_parser_elem_fn_expr_list): Likewise.
> (cp_parser_late_parsing_elem_fn_info): Likewise.
> * parser.h (cp_parser::elem_fn_info): New field.
> 
> gcc/testsuite/ChangeLog
> 2013-11-20  Balaji V. Iyer  
> 
> * g++.dg/cilk-plus/cilk-plus.exp: Called the C/C++ common tests for
> SIMD enabled function.
> * g++.dg/cilk-plus/ef_test.C: New test.
> 
> 
> Thanking You,
> 
> Yours Sincerely,
> 
> Balaji V. Iyer.
Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c   (revision 205562)
+++ gcc/cp/decl.c   (working copy)
@@ -7669,6 +7669,34 @@
}
 }
 
+  if (flag_enable_cilkplus)
+{
+  /* Adjust "cilk plus elemental attribute" attributes.  */
+  tree ods = lookup_attribute ("cilk plus elemental", *attrlist);
+  if (ods)
+   {
+ tree attr;
+ for (attr = ods; attr; 
+  attr = lookup_attribute ("cilk plus elemental",
+   TREE_CHAIN (attr)))
+   {
+ if (TREE_CODE (type) == METHOD_TYPE)
+   walk_tree (&TREE_VALUE (attr), declare_simd_adjust_this,
+  DECL_ARGUMENTS (decl), NULL);
+ if (TREE_VALUE (attr) != NULL_TREE)
+   {
+ tree cl = TREE_VALUE (TREE_VALUE (attr));
+ cl = c_omp_declare_simd_clauses_to_numbers
+   (DECL_ARGUMENTS (decl), cl);
+ if (cl)
+   TREE_VALUE (TREE_VALUE (attr)) = cl;
+ else
+   TREE_VALUE (attr) = NULL_TREE;
+   }
+   }
+   }
+}
+  
   /* Caller will do the rest of this.  */
   if (check < 0)
 return decl;
Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c (revision 205562)
+++ gcc/cp/pt.c (working copy)
@@ -8603,9 +8603,12 @@
{
  *p = TREE_CHAIN (t);
  TREE_CHAIN (t) = NULL_TREE;
- if (flag_openmp
- && is_attribute_p ("omp declare simd",
-get_attrib

Re: LRA vs reload on powerpc: 2 extra FAILs that are actually improvements?

2013-11-30 Thread Alan Modra
> On Sat, Nov 2, 2013 at 6:48 PM, Steven Bosscher  wrote:
> > The failure of pr53199.c is because of different instruction selection
> > for bswap. Test case is reduced to just one function:
[snip]
> > Is this an improvement or a regression? If it's an improvement then
> > these two test cases should be adjusted :-)

As David said, going through memory is bad, we get a load-hit-store
flush.  Definitely a regression on power7.  Does anyone know why the
bswapdi2_64bit r,r alternative is disparaged?  Seems like it has been
that way since the original mainline commit.

int main (void)
{
  int i;
  long ret = 0;
  long tmp1, tmp2, tmp3;

  for (i = 0; i < 10; i++)
#if MEM == 1
/* From pr53199.c reg_reverse, -mlra -mcpu=power6 -mtune=power7.  */
__asm__ __volatile__ ("\
addi %1,1,-16\n\
srdi %3,%0,32\n\
li %2,4\n\
stwbrx %0,0,%1\n\
stwbrx %3,%2,%1\n\
ld %0,-16(1)" : "+r" (ret), "=&b" (tmp1), "=&r" (tmp2), "=&r" (tmp3));
#elif MEM == 2
/* From pr53199.c reg_reverse, -mlra -mcpu=power6.  */
__asm__ __volatile__ ("\
addi %1,1,-16\n\
srdi %3,%0,32\n\
addi %2,%1,4\n\
stwbrx %0,0,%1\n\
stwbrx %3,0,%2\n\
ld %0,-16(1)" : "+r" (ret), "=&b" (tmp1), "=&b" (tmp2), "=&r" (tmp3));
#elif MEM == 3
/* From pr53199.c reg_reverse, -mlra -mcpu=power7.  */
__asm__ __volatile__ ("\
std %0,-16(1)\n\
addi %1,1,-16\n\
ldbrx %0,0,%1\n" : "+r" (ret), "=&b" (tmp1));
#else
__asm__ __volatile__ ("\
srdi %1,%0,32\n\
rlwinm %2,%0,8,0x\n\
rlwinm %3,%1,8,0x\n\
rlwimi %2,%0,24,0,7\n\
rlwimi %2,%0,24,16,23\n\
rlwimi %3,%1,24,0,7\n\
rlwimi %3,%1,24,16,23\n\
sldi %2,%2,32\n\
or %2,%2,%3\n\
mr %0,%2" : "+r" (ret), "=&r" (tmp1), "=&r" (tmp2), "=&r" (tmp3));
#endif
  return ret;
}
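
For readers who do not have a powerpc64 box handy, the register-only
variant (the #else branch above) can be sketched in portable C.  This is
only an illustration: the names bswap32_sketch and bswap64_sketch are
made up here, and the code loosely mirrors the rotate-and-insert steps
of the asm rather than reproducing what GCC emits.

```c
#include <stdint.h>

/* Hypothetical helper: byte-reverse one 32-bit word with two rotates
   and two masks, roughly what the rlwinm/rlwimi triple does per half.  */
static uint32_t
bswap32_sketch (uint32_t x)
{
  uint32_t rot8 = (x << 8) | (x >> 24);   /* rotate left by 8 */
  uint32_t rot24 = (x << 24) | (x >> 8);  /* rotate left by 24 */

  /* Counting bytes from the most significant end, bytes 0 and 2 are
     taken from rot24 and bytes 1 and 3 from rot8.  */
  return (rot24 & 0xff00ff00u) | (rot8 & 0x00ff00ffu);
}

/* Hypothetical helper: byte-reverse a 64-bit value entirely in
   registers by reversing each half and swapping the halves.  */
static uint64_t
bswap64_sketch (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);  /* corresponds to the srdi */

  /* The reversed low half becomes the new high half (sldi + or).  */
  return ((uint64_t) bswap32_sketch (lo) << 32) | bswap32_sketch (hi);
}
```

Nothing here touches memory, which is the point of the comparison: the
MEM variants store the value and reload it immediately, and that
store-then-load pair is where the load-hit-store penalty comes from.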

/*
amodra@bns:~> gcc -O2 bswap_mem.c
amodra@bns:~> time ./a.out

real 0m3.096s
user 0m3.089s
sys 0m0.001s
amodra@bns:~> time ./a.out

real 0m3.096s
user 0m3.094s
sys 0m0.002s
amodra@bns:~> gcc -O2 -DMEM=1 bswap_mem.c
amodra@bns:~> time ./a.out

real 0m12.661s
user 0m12.657s
sys 0m0.003s
amodra@bns:~> time ./a.out

real 0m12.660s
user 0m12.657s
sys 0m0.003s
amodra@bns:~> gcc -O2 -DMEM=2 bswap_mem.c
amodra@bns:~> time ./a.out

real 0m12.660s
user 0m12.657s
sys 0m0.003s
amodra@bns:~> time ./a.out

real 0m12.660s
user 0m12.657s
sys 0m0.004s
amodra@bns:~> gcc -O2 -DMEM=3 bswap_mem.c
amodra@bns:~> time ./a.out

real 0m10.279s
user 0m10.276s
sys 0m0.003s
amodra@bns:~> time ./a.out

real 0m10.279s
user 0m10.276s
sys 0m0.003s

I also looked at the register version and the -DMEM=1 case with power7
simulators, finding that the register version had a delay of 12 cycles
from completion of the first instruction to completion of the last.
The -DMEM=1 case had a corresponding delay of 49 cycles, which matches
the loop timing above quite well.
*/
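
The "matches quite well" remark can be checked with a little
arithmetic: the simulator delays give 49 / 12, about 4.08, and the
wall-clock times give 12.660 / 3.096, about 4.09.  A minimal sketch of
that check follows; the function name slowdown is made up for the
illustration, and the numbers passed to it are the ones quoted above.

```c
/* Hypothetical helper: how many times slower the slow variant is.
   Works equally for cycle counts and for wall-clock seconds.  */
static double
slowdown (double slow, double fast)
{
  return slow / fast;
}
```

Calling slowdown (49.0, 12.0) and slowdown (12.660, 3.096) gives two
ratios that differ by well under one percent.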

-- 
Alan Modra
Australia Development Lab, IBM