Re: [PATCH] A new reload-rewrite pattern recognizer for GCC vectorizer.

2014-04-26 Thread Jakub Jelinek
On Thu, Apr 24, 2014 at 05:32:54PM -0700, Cong Hou wrote:
> In this patch a new reload-rewrite pattern detector is composed to
> handle the following pattern in the loop being vectorized:
> 
>x = *p;
>...
>y = *p;
> 
>or
> 
>*p = x;
>...
>y = *p;
> 
> In both cases, *p is reloaded because there may exist other defs to
> another memref that may alias with p.  However, aliasing is eliminated
> with alias checks.  Then we can safely replace the last statement in
> above cases by y = x.

Not safely, at least not for #pragma omp simd/#pragma simd/#pragma ivdep
loops if alias analysis hasn't proven there is no aliasing.

So, IMNSHO you need to guard this with LOOP_VINFO_NO_DATA_DEPENDENCIES,
assuming it has been computed at that point already (otherwise you need to
do it elsewhere).

Consider:

void
foo (int *p, int *q)
{
  int i;
  #pragma omp simd safelen(16)
  for (i = 0; i < 128; i++)
{
  int x = *p;
  *q += 8;
  *p = *p + x;
  p++;
  q++;
}
}

It is valid to call the above with completely unrelated p and q, but
also e.g. p == q, or q == p + 16 or p == q + 16.
Your patch would certainly break it e.g. for p == q.

Jakub


Re: [wide-int 5/5] Add dump () method for gdb debugging

2014-04-26 Thread Richard Sandiford
Kenneth Zadeck  writes:
> don't you think that it would be easier to understand the number if you 
> printed it largest index first, as in the routines in wide-int-print.cc?

Yeah, that's what the patch does.  E.g. (for 32-bit HWI):

  [...,0x3,0x8000]

is 7 << 31.

Thanks,
Richard

>
> kenny
> On 04/25/2014 09:58 AM, Richard Sandiford wrote:
>> This patch adds a dump () method so that it's easier to read the
>> contents of the various wide-int types in gdb.  I've deliberately not
>> done any extension for "small_prec" cases because I think here
>> we want to see the raw values as much as possible.
>>
>> Tested on x86_64-linux-gnu.  OK to install?
>>
>> Thanks,
>> Richard
>>
>>
>> Index: gcc/wide-int.cc
>> ===
>> --- gcc/wide-int.cc  2014-04-25 14:48:03.263213403 +0100
>> +++ gcc/wide-int.cc  2014-04-25 14:48:25.186333842 +0100
>> @@ -2130,3 +2130,9 @@ wi::only_sign_bit_p (const wide_int_ref
>>   void gt_ggc_mx (widest_int *) { }
>>   void gt_pch_nx (widest_int *, void (*) (void *, void *), void *) { }
>>   void gt_pch_nx (widest_int *) { }
>> +
>> +template void wide_int::dump () const;
>> +template void generic_wide_int  >::dump () 
>> const;
>> +template void generic_wide_int  >::dump () 
>> const;
>> +template void offset_int::dump () const;
>> +template void widest_int::dump () const;
>> Index: gcc/wide-int.h
>> ===
>> --- gcc/wide-int.h   2014-04-25 14:34:54.823652619 +0100
>> +++ gcc/wide-int.h   2014-04-25 14:48:06.218229309 +0100
>> @@ -709,6 +709,9 @@ #define INCDEC_OPERATOR(OP, DELTA) \
>>   #undef ASSIGNMENT_OPERATOR
>>   #undef INCDEC_OPERATOR
>>   
>> +  /* Debugging functions.  */
>> +  void dump () const;
>> +
>> static const bool is_sign_extended
>>   = wi::int_traits  >::is_sign_extended;
>>   };
>> @@ -859,6 +862,23 @@ generic_wide_int ::operator = (
>> return *this;
>>   }
>>   
>> +/* Dump the contents of the integer to stderr, for debugging.  */
>> +template 
>> +void
>> +generic_wide_int ::dump () const
>> +{
>> +  unsigned int len = this->get_len ();
>> +  const HOST_WIDE_INT *val = this->get_val ();
>> +  unsigned int precision = this->get_precision ();
>> +  fprintf (stderr, "[");
>> +  if (len * HOST_BITS_PER_WIDE_INT < precision)
>> +fprintf (stderr, "...,");
>> +  for (unsigned int i = 0; i < len - 1; ++i)
>> +fprintf (stderr, HOST_WIDE_INT_PRINT_HEX ",", val[len - 1 - i]);
>> +  fprintf (stderr, HOST_WIDE_INT_PRINT_HEX "], precision = %d\n",
>> +   val[0], precision);
>> +}
>> +
>>   namespace wi
>>   {
>> template <>


Re: [patch, fortran] Fix PR 59604

2014-04-26 Thread Tobias Burnus

Am 12.04.2014 21:33, schrieb Thomas Koenig:

please find attached a patch for PR 59604.

The patch makes sure that, if -fno-range-check is specified,
using int on an overflowing boz constant yields the same
result for compile-time simplification and run-time
execution.

OK for trunk?


Looks good to me. At a glance, it looks as if it also fixes the original 
bug 58003 - or is something still missing?


Tobias


2014-03-12  Thomas Koenig  

 PR fortran/59604
 * gfortran.h (gfc_convert_mpz_to_signed):  Add prototype.
 * arith.c (gfc_int2int):  Convert number to signed if
 arithmetic overflow is not checked.
 * simplify.c (convert_mpz_to_unsigned): Only trigger assert for
 size if range checking is in force.
 (convert_mpz_to_signed):  Make non-static, rename to
 (gfc_convert_mpz_to_signed).
 (simplify_dshift): Use gfc_convert_mpz_to_signed.
 (gfc_simplify_ibclr):  Likewise.
 (gfc_simplify_ibits):  Likewise.
 (gfc_simplify_ibset):  Likewise.
 (simplify_shift):  Likewise.
 (gfc_simplify_ishiftc):  Likewise.
 (gfc_simplify_maskr):  Likewise.
 (gfc_simplify_maskl):  Likewise.

2014-03-12  Thomas Koenig  

 PR fortran/59604
 * gfortran.dg/no_range_check_3.f90:  New test.





[C PATCH] Another locus improvements (PR c/60257)

2014-04-26 Thread Marek Polacek
Another column info improvements, this time for initializer
warnings.  warning_init and add_pending_init had to gain new location
parameter.

What is worth mentioning is that the "(near initialization for X)"
message seems next to useless to me now with caret diagnostics (?).

Regtested/bootstrapped on powerpc64-unknown-linux-gnu and
x86_64-unknown-linux-gnu, ok for trunk?

2014-04-25  Marek Polacek  

PR c/60257
* c-typeck.c (warning_init): Add location_t parameter.  Call
warning_at instead of warning.
(push_init_level): Pass input_location to warning_init.
(add_pending_init): Add location_t parameter.  Pass loc to
warning_init.
(set_nonincremental_init): Pass input_location to add_pending_init.
(set_nonincremental_init_from_string): Likewise.
(output_init_element): Pass loc to warning_init and to
add_pending_init.

* gcc.dg/pr60257.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index a82801f..5a232ec 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -100,14 +100,15 @@ static void push_string (const char *);
 static void push_member_name (tree);
 static int spelling_length (void);
 static char *print_spelling (char *);
-static void warning_init (int, const char *);
+static void warning_init (location_t, int, const char *);
 static tree digest_init (location_t, tree, tree, tree, bool, bool, int);
 static void output_init_element (location_t, tree, tree, bool, tree, tree, int,
 bool, struct obstack *);
 static void output_pending_init_elements (int, struct obstack *);
 static int set_designator (int, struct obstack *);
 static void push_range_stack (tree, struct obstack *);
-static void add_pending_init (tree, tree, tree, bool, struct obstack *);
+static void add_pending_init (location_t, tree, tree, tree, bool,
+ struct obstack *);
 static void set_nonincremental_init (struct obstack *);
 static void set_nonincremental_init_from_string (tree, struct obstack *);
 static tree find_init_member (tree, struct obstack *);
@@ -6445,15 +6446,15 @@ pedwarn_init (location_t location, int opt, const char 
*gmsgid)
component name is taken from the spelling stack.  */
 
 static void
-warning_init (int opt, const char *gmsgid)
+warning_init (location_t loc, int opt, const char *gmsgid)
 {
   char *ofwhat;
 
   /* The gmsgid may be a format string with %< and %>. */
-  warning (opt, gmsgid);
+  warning_at (loc, opt, gmsgid);
   ofwhat = print_spelling ((char *) alloca (spelling_length () + 1));
   if (*ofwhat)
-warning (opt, "(near initialization for %qs)", ofwhat);
+warning_at (loc, opt, "(near initialization for %qs)", ofwhat);
 }
 
 /* If TYPE is an array type and EXPR is a parenthesized string
@@ -7299,7 +7300,8 @@ push_init_level (int implicit, struct obstack * 
braced_init_obstack)
   if (implicit == 1 && warn_missing_braces && !missing_braces_mentioned)
 {
   missing_braces_mentioned = 1;
-  warning_init (OPT_Wmissing_braces, "missing braces around initializer");
+  warning_init (input_location, OPT_Wmissing_braces,
+   "missing braces around initializer");
 }
 
   if (TREE_CODE (constructor_type) == RECORD_TYPE
@@ -7360,7 +7362,7 @@ push_init_level (int implicit, struct obstack * 
braced_init_obstack)
   else
 {
   if (constructor_type != error_mark_node)
-   warning_init (0, "braces around scalar initializer");
+   warning_init (input_location, 0, "braces around scalar initializer");
   constructor_fields = constructor_type;
   constructor_unfilled_fields = constructor_type;
 }
@@ -7775,8 +,8 @@ set_init_label (tree fieldname, struct obstack * 
braced_init_obstack)
existing initializer.  */
 
 static void
-add_pending_init (tree purpose, tree value, tree origtype, bool implicit,
- struct obstack * braced_init_obstack)
+add_pending_init (location_t loc, tree purpose, tree value, tree origtype,
+ bool implicit, struct obstack *braced_init_obstack)
 {
   struct init_node *p, **q, *r;
 
@@ -7797,9 +7799,12 @@ add_pending_init (tree purpose, tree value, tree 
origtype, bool implicit,
  if (!implicit)
{
  if (TREE_SIDE_EFFECTS (p->value))
-   warning_init (0, "initialized field with side-effects 
overwritten");
+   warning_init (loc, 0,
+ "initialized field with side-effects "
+ "overwritten");
  else if (warn_override_init)
-   warning_init (OPT_Woverride_init, "initialized field 
overwritten");
+   warning_init (loc, OPT_Woverride_init,
+ "initialized field overwritten");
}
  p->value = value;
  p->origtype = origtype;
@@ -7824,9 +7829,12 @@ add_pending_init (tree purpose, tree value, tr

Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations

2014-04-26 Thread Tom de Vries

On 13-03-14 21:49, Richard Henderson wrote:

  (define_expand "ldexpxf3"
-  [(set (match_dup 3)
-   (float:XF (match_operand:SI 2 "register_operand")))
-   (parallel [(set (match_operand:XF 0 " register_operand")
-  (unspec:XF [(match_operand:XF 1 "register_operand")
-  (match_dup 3)]
- UNSPEC_FSCALE_FRACT))
- (set (match_dup 4)
-  (unspec:XF [(match_dup 1) (match_dup 3)]
- UNSPEC_FSCALE_EXP))])]
+  [(match_operand:XF 0 "register_operand")
+   (match_operand:XF 1 "register_operand")
+   (match_operand:SI 2 "register_operand")]
"TARGET_USE_FANCY_MATH_387
 && flag_unsafe_math_optimizations"
  {
@@ -14808,6 +14633,11 @@

operands[3] = gen_reg_rtx (XFmode);
operands[4] = gen_reg_rtx (XFmode);
+
+  emit_insn (gen_floatsixf2 (operands[3], operands[2]));
+  emit_insn (gen_fscalexf4_i387 (operands[0], operands[4],
+ operands[1], operands[3]));
+  DONE;
  })


Richard,

For a non-bootstrap x86_64 build, gcc.dg/builtins-34.c fails for me with a 
sigsegv.

I've traced it back to this code in insn-emit.c:
...
rtx
gen_ldexpxf3 (rtx operand0,
rtx operand1,
rtx operand2)
{
  rtx _val = 0;
  start_sequence ();
  {
rtx operands[3];
operands[0] = operand0;
operands[1] = operand1;
operands[2] = operand2;

{
  if (optimize_insn_for_size_p ())
FAIL;

  operands[3] = gen_reg_rtx (XFmode);
  operands[4] = gen_reg_rtx (XFmode);
...

operands is declared with size 3, and operands[3,4] accesses are out of bounds.

I've done a minimal build with attached patch, and reran the test-case, which 
passes now.


OK if bootstrap succeeds?

Thanks,
- Tom
2014-04-26  Tom de Vries  

	* config/i386/i386.md (define_expand "ldexpxf3"): Fix out-of-bounds
	array accesses.
---
 gcc/config/i386/i386.md | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 25e2e93..9f103cf 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -14427,15 +14427,16 @@
   "TARGET_USE_FANCY_MATH_387
&& flag_unsafe_math_optimizations"
 {
+  rtx tmp1, tmp2;
   if (optimize_insn_for_size_p ())
 FAIL;
 
-  operands[3] = gen_reg_rtx (XFmode);
-  operands[4] = gen_reg_rtx (XFmode);
+  tmp1 = gen_reg_rtx (XFmode);
+  tmp2 = gen_reg_rtx (XFmode);
 
-  emit_insn (gen_floatsixf2 (operands[3], operands[2]));
-  emit_insn (gen_fscalexf4_i387 (operands[0], operands[4],
- operands[1], operands[3]));
+  emit_insn (gen_floatsixf2 (tmp1, operands[2]));
+  emit_insn (gen_fscalexf4_i387 (operands[0], tmp2,
+ operands[1], tmp1));
   DONE;
 })
 
-- 
1.8.3.2



Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations

2014-04-26 Thread Uros Bizjak
On Sat, Apr 26, 2014 at 11:27 AM, Tom de Vries  wrote:
> On 13-03-14 21:49, Richard Henderson wrote:
>>
>>   (define_expand "ldexpxf3"
>> -  [(set (match_dup 3)
>> -   (float:XF (match_operand:SI 2 "register_operand")))
>> -   (parallel [(set (match_operand:XF 0 " register_operand")
>> -  (unspec:XF [(match_operand:XF 1 "register_operand")
>> -  (match_dup 3)]
>> - UNSPEC_FSCALE_FRACT))
>> - (set (match_dup 4)
>> -  (unspec:XF [(match_dup 1) (match_dup 3)]
>> - UNSPEC_FSCALE_EXP))])]
>> +  [(match_operand:XF 0 "register_operand")
>> +   (match_operand:XF 1 "register_operand")
>> +   (match_operand:SI 2 "register_operand")]
>> "TARGET_USE_FANCY_MATH_387
>>  && flag_unsafe_math_optimizations"
>>   {
>> @@ -14808,6 +14633,11 @@
>>
>> operands[3] = gen_reg_rtx (XFmode);
>> operands[4] = gen_reg_rtx (XFmode);
>> +
>> +  emit_insn (gen_floatsixf2 (operands[3], operands[2]));
>> +  emit_insn (gen_fscalexf4_i387 (operands[0], operands[4],
>> + operands[1], operands[3]));
>> +  DONE;
>>   })
>
>
> Richard,
>
> For a non-bootstrap x86_64 build, gcc.dg/builtins-34.c fails for me with a
> sigsegv.
>
> I've traced it back to this code in insn-emit.c:
> ...
> rtx
> gen_ldexpxf3 (rtx operand0,
> rtx operand1,
> rtx operand2)
> {
>   rtx _val = 0;
>   start_sequence ();
>   {
> rtx operands[3];
> operands[0] = operand0;
> operands[1] = operand1;
> operands[2] = operand2;
>
> {
>   if (optimize_insn_for_size_p ())
> FAIL;
>
>   operands[3] = gen_reg_rtx (XFmode);
>   operands[4] = gen_reg_rtx (XFmode);
> ...
>
> operands is declared with size 3, and operands[3,4] accesses are out of
> bounds.
>
> I've done a minimal build with attached patch, and reran the test-case,
> which passes now.
>
> OK if bootstrap succeeds?
>
> 2014-04-26  Tom de Vries  
>
> * config/i386/i386.md (define_expand "ldexpxf3"): Fix out-of-bounds
> array accesses.

OK for mainline and 4.9 branch.

Thanks,
Uros.


[RFC][ARM] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook

2014-04-26 Thread Kugan
Hi,

Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for ARM. With
this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS.

This implementation is based on SPARC and i386 implementations.

Regression tested on qemu-arm for arm-none-linux-gnueabi with no new
regression. Is this OK for trunk?

Thanks,
Kugan

gcc/
+2014-04-27  Kugan Vivekanandarajah  
+
+   * config/arm/arm.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New define.
+   (arm_builtins) : Add ARM_BUILTIN_LDFPSCR and ARM_BUILTIN_STFPSCR.
+   (bdesc_2arg) : Add description for builtins __builtins_arm_stfpscr
+   and __builtins_arm_ldfpscr.
+   (arm_init_builtins) : Initialize builtins __builtins_arm_stfpscr and
+   __builtins_arm_ldfpscr.
+   (arm_expand_builtin) : Expand builtins __builtins_arm_stfpscr and
+   __builtins_arm_ldfpscr.
+   (arm_atomic_assign_expand_fenv): New function.
+   * config/arm/vfp.md (stfpscr): New pattern.
+   (ldfpscr) : Likewise.
+   * config/arm/unspecs.md (unspecv): Add VUNSPEC_LDFPSCR and
+   VUNSPEC_STFPSCR.
+


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0240cc7..4f0ed58 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -59,6 +59,7 @@
 #include "params.h"
 #include "opts.h"
 #include "dumpfile.h"
+#include "gimple-expr.h"
 
 /* Forward definitions of types.  */
 typedef struct minipool_nodeMnode;
@@ -93,6 +94,7 @@ static int thumb_far_jump_used_p (void);
 static bool thumb_force_lr_save (void);
 static unsigned arm_size_return_regs (void);
 static bool arm_assemble_integer (rtx, unsigned int, int);
+static void arm_atomic_assign_expand_fenv (tree *hold, tree *clear, tree 
*update);
 static void arm_print_operand (FILE *, rtx, int);
 static void arm_print_operand_address (FILE *, rtx);
 static bool arm_print_operand_punct_valid_p (unsigned char code);
@@ -584,6 +586,9 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_MANGLE_TYPE
 #define TARGET_MANGLE_TYPE arm_mangle_type
 
+#undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
+#define TARGET_ATOMIC_ASSIGN_EXPAND_FENV arm_atomic_assign_expand_fenv
+
 #undef TARGET_BUILD_BUILTIN_VA_LIST
 #define TARGET_BUILD_BUILTIN_VA_LIST arm_build_builtin_va_list
 #undef TARGET_EXPAND_BUILTIN_VA_START
@@ -23212,6 +23217,9 @@ enum arm_builtins
   ARM_BUILTIN_CRC32CH,
   ARM_BUILTIN_CRC32CW,
 
+  ARM_BUILTIN_LDFPSCR,
+  ARM_BUILTIN_STFPSCR,
+
 #undef CRYPTO1
 #undef CRYPTO2
 #undef CRYPTO3
@@ -24010,6 +24018,15 @@ static const struct builtin_description bdesc_2arg[] =
   IWMMXT_BUILTIN2 (iwmmxt_wmacuz, WMACUZ)
   IWMMXT_BUILTIN2 (iwmmxt_wmacsz, WMACSZ)
 
+
+#define FP_BUILTIN(L, U) \
+  {0, CODE_FOR_##L, "__builtin_arm_"#L, ARM_BUILTIN_##U, \
+   UNKNOWN, 0},
+
+  FP_BUILTIN (stfpscr, LDFPSCR)
+  FP_BUILTIN (ldfpscr, STFPSCR)
+#undef FP_BUILTIN
+
 #define CRC32_BUILTIN(L, U) \
   {0, CODE_FOR_##L, "__builtin_arm_"#L, ARM_BUILTIN_##U, \
UNKNOWN, 0},
@@ -24524,6 +24541,21 @@ arm_init_builtins (void)
 
   if (TARGET_CRC32)
 arm_init_crc32_builtins ();
+
+  if (TARGET_VFP)
+{
+  tree ftype_stfpscr
+   = build_function_type_list (void_type_node, unsigned_type_node, NULL);
+  tree ftype_ldfpscr
+   = build_function_type_list (unsigned_type_node, NULL);
+
+  arm_builtin_decls[ARM_BUILTIN_LDFPSCR]
+   = add_builtin_function ("__builtin_arm_ldfscr", ftype_ldfpscr,
+   ARM_BUILTIN_LDFPSCR, BUILT_IN_MD, NULL, 
NULL_TREE);
+  arm_builtin_decls[ARM_BUILTIN_STFPSCR]
+   = add_builtin_function ("__builtin_arm_stfscr", ftype_stfpscr,
+   ARM_BUILTIN_STFPSCR, BUILT_IN_MD, NULL, 
NULL_TREE);
+}
 }
 
 /* Return the ARM builtin for CODE.  */
@@ -25251,6 +25283,25 @@ arm_expand_builtin (tree exp,
 
   switch (fcode)
 {
+case ARM_BUILTIN_LDFPSCR:
+case ARM_BUILTIN_STFPSCR:
+  if (fcode == ARM_BUILTIN_LDFPSCR)
+   {
+ icode = CODE_FOR_ldfpscr;
+ target = gen_reg_rtx (SImode);
+ pat = GEN_FCN (icode) (target);
+   }
+  else
+   {
+ target = NULL_RTX;
+ icode = CODE_FOR_stfpscr;
+ arg0 = CALL_EXPR_ARG (exp, 0);
+ op0 = expand_normal (arg0);
+ pat = GEN_FCN (icode) (op0);
+   }
+  emit_insn (pat);
+  return target;
+
 case ARM_BUILTIN_TEXTRMSB:
 case ARM_BUILTIN_TEXTRMUB:
 case ARM_BUILTIN_TEXTRMSH:
@@ -31116,4 +31167,70 @@ arm_asan_shadow_offset (void)
   return (unsigned HOST_WIDE_INT) 1 << 29;
 }
 
+static void
+arm_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  if (!TARGET_VFP)
+return;
+
+  const unsigned FE_INVALID = 1;
+  const unsigned FE_DIVBYZERO = 2;
+  const unsigned FE_OVERFLOW = 4;
+  const unsigned FE_UNDERFLOW = 8;
+  const unsigned FE_INEXACT = 16;
+  const unsigned HOST_WIDE_INT FE_ALL_EXCEPT = (FE_INVALID | FE_DIVBYZERO
+   | FE_OVERFLOW | FE_UNDERFLOW
+   

[RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook

2014-04-26 Thread Kugan
Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64.
With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS.

This implementation is based on SPARC and i386 implementations.

Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new
regression. Is this OK for trunk?

Thanks,
Kugan

gcc/
+2014-04-27  Kugan Vivekanandarajah  
+
+   * config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New
+   define.
+   * config/aarch64/aarch64-builtins.c (arm_builtins) : Add
+   AARCH64_BUILTIN_LDFPSCR and AARCH64_BUILTIN_STFPSCR.
+   (aarch64_init_builtins) : Initialize builtins
+   __builtins_aarch64_stfpscr and __builtins_aarch64_ldfpscr.
+   (aarch64_expand_builtin) : Expand builtins __builtins_aarch64_stfpscr
+   and __builtins_aarch64_ldfpscr.
+   (aarch64_atomic_assign_expand_fenv): New function.
+   * config/aarch64/aarch64.md (stfpscr): New pattern.
+   (ldfpscr) : Likewise.
+   (unspecv): Add UNSPECV_LDFPSCR and UNSPECV_STFPSCR.
+





diff --git a/gcc/config/aarch64/aarch64-builtins.c 
b/gcc/config/aarch64/aarch64-builtins.c
index 55cfe0a..70d3efa 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -371,6 +371,10 @@ static aarch64_simd_builtin_datum 
aarch64_simd_builtin_data[] = {
 enum aarch64_builtins
 {
   AARCH64_BUILTIN_MIN,
+
+  AARCH64_BUILTIN_LDFPSCR,
+  AARCH64_BUILTIN_STFPSCR,
+
   AARCH64_SIMD_BUILTIN_BASE,
 #include "aarch64-simd-builtins.def"
   AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE
@@ -752,6 +756,18 @@ aarch64_init_simd_builtins (void)
 void
 aarch64_init_builtins (void)
 {
+  tree ftype_stfpscr
+= build_function_type_list (void_type_node, unsigned_type_node, NULL);
+  tree ftype_ldfpscr
+= build_function_type_list (unsigned_type_node, NULL);
+
+  aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR]
+= add_builtin_function ("__builtin_aarch64_ldfscr", ftype_ldfpscr,
+   AARCH64_BUILTIN_LDFPSCR, BUILT_IN_MD, NULL, 
NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_STFPSCR]
+= add_builtin_function ("__builtin_aarch64_stfscr", ftype_stfpscr,
+   AARCH64_BUILTIN_STFPSCR, BUILT_IN_MD, NULL, 
NULL_TREE);
+
   if (TARGET_SIMD)
 aarch64_init_simd_builtins ();
 }
@@ -964,6 +980,31 @@ aarch64_expand_builtin (tree exp,
 {
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   int fcode = DECL_FUNCTION_CODE (fndecl);
+  int icode;
+  rtx pat, op0;
+  tree arg0;
+
+  switch (fcode)
+{
+case AARCH64_BUILTIN_LDFPSCR:
+case AARCH64_BUILTIN_STFPSCR:
+  if (fcode == AARCH64_BUILTIN_LDFPSCR)
+   {
+ icode = CODE_FOR_ldfpscr;
+ target = gen_reg_rtx (SImode);
+ pat = GEN_FCN (icode) (target);
+   }
+  else
+   {
+ target = NULL_RTX;
+ icode = CODE_FOR_stfpscr;
+ arg0 = CALL_EXPR_ARG (exp, 0);
+ op0 = expand_normal (arg0);
+ pat = GEN_FCN (icode) (op0);
+   }
+  emit_insn (pat);
+  return target;
+}
 
   if (fcode >= AARCH64_SIMD_BUILTIN_BASE)
 return aarch64_simd_expand_builtin (fcode, exp, target);
@@ -1196,6 +1237,70 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return changed;
 }
 
+void
+aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  const unsigned FE_INVALID = 1;
+  const unsigned FE_DIVBYZERO = 2;
+  const unsigned FE_OVERFLOW = 4;
+  const unsigned FE_UNDERFLOW = 8;
+  const unsigned FE_INEXACT = 16;
+  const unsigned HOST_WIDE_INT FE_ALL_EXCEPT = (FE_INVALID | FE_DIVBYZERO
+   | FE_OVERFLOW | FE_UNDERFLOW
+   | FE_INEXACT);
+  const unsigned HOST_WIDE_INT FE_EXCEPT_SHIFT = 8;
+
+  /* Genareate the equivalence of :
+   unsigned int fenv_var;
+   fenv_var = __builtin_aarch64_ldfpscr ();
+
+   unsigned int masked_fenv;
+   tmp1_var = fenv_var & ~ mask;
+
+   __builtin_aarch64_fpscr (&tmp1_var);  */
+
+  tree fenv_var = create_tmp_var (unsigned_type_node, NULL);
+  tree ldfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_LDFPSCR];
+  tree stfpscr = aarch64_builtin_decls[AARCH64_BUILTIN_STFPSCR];
+  tree mask = build_int_cst (unsigned_type_node,
+~((FE_ALL_EXCEPT << FE_EXCEPT_SHIFT)
+  | FE_ALL_EXCEPT));
+  tree ld_fenv_stmt = build2 (MODIFY_EXPR, unsigned_type_node,
+ fenv_var, build_call_expr (ldfpscr, 0));
+  tree masked_fenv = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var, mask);
+  tree hold_fnclex = build_call_expr (stfpscr, 1, masked_fenv);
+  *hold = build2 (COMPOUND_EXPR, void_type_node,
+ build2 (COMPOUND_EXPR, void_type_node, masked_fenv,
+ ld_fenv_stmt), hold_fnclex);
+
+  /* Store the value of masked_fenv to clear the exceptions:
+ __builtin_aarch64_stfpscr (masked_fenv);  */
+
+  *cl

Re: GCC's -fsplit-stack disturbing Mach's vm_allocate

2014-04-26 Thread Svante Signell
±On Sat, 2014-04-26 at 08:53 +0200, Justus Winter wrote:
> Quoting Svante Signell (2014-04-24 10:39:10)
> > On Fri, 2014-04-18 at 10:03 +0200, Samuel Thibault wrote:
> > > Samuel Thibault, le Thu 17 Apr 2014 00:03:45 +0200, a écrit :
> > > > Thomas Schwinge, le Wed 09 Apr 2014 09:36:42 +0200, a écrit :
> > > > > Well, the first step is to verify that 
> > > > > TARGET_THREAD_SPLIT_STACK_OFFSET
> > > > > and similar configury is correct for the Hurd,
> > > > 
> > > > I have added the corresponding field, so we can just use the same offset
> > > > as on Linux.
> > > 
> > > I have uploaded packages on http://people.debian.org/~sthibault/tmp/ so
> > > Svante can try setting TARGET_THREAD_SPLIT_STACK_OFFSET to 0x30 with
> > > them.
> > 
> > Status report:
> > - Without split stack enabled around 70 libgo tests pass and 50 fails,
> > most of them with a segfault.
> > - Enabling split stack and using the libc Samuel built all 122 libgo
> > tests fail with a segfault.
> > - In both cases simple go programs work, like hello+sqrt.go below.
> > - The segfault seems to be located at the same code piece according to
> > gdb (maybe due to exception handling)
> > 
> > cat hello+sqrt.go
> > package main
> > import (
> > "fmt"
> > )
> > func main() {
> > fmt.Printf("Hello, world.  Sqrt(2) = %v\n", Sqrt(2))
> > }
> 
> How is that even a valid go program? Sqrt is not defined.

Here is the sqrt.go:
// Package main is a trivial example package.
package main

// Sqrt returns an approximation to the square root of x.
func Sqrt(x float64) float64 {
z := 1.0
for i := 0; i < 1000; i++ {
z -= (z*z - x) / (2 * z)
}
return z
}
gccgo-4.9 -g hello+sqrt.go sqrt.go

> > I have not been able to use a local go library function, e.g. package
> > newmath, and the go frontend is not yet available for GNU/Hurd.
> 
> What do you mean exactly by "local go library function"?

Like having sqrt.go in a separate directory:

cat newmath/sqrt.go
// Package newmath is a trivial example package.
package newmath

// Sqrt returns an approximation to the square root of x.
func Sqrt(x float64) float64 {
z := 1.0
for i := 0; i < 1000; i++ {
z -= (z*z - x) / (2 * z)
}
return z
}

> > However, it seems that something triggers the segfaults when running
> > make -C build/i486-gnu/libgo check (both with and w/o split-stack)
> > while setting the keep parameter in ./src/libgo/testsuite/gotest
> > and running them manually some of them work?? As a first glance, about
> > the same number of tests succeeds with and w/o split stack :) Some of
> > the failing tests still seems random, sometimes they pass, sometimes
> > they fail.
> 
> For reference, here are my notes about one of these crashes (Svante,
> is this still current?):

Yes it is, thanks for your help so far. Is the rpctrace bug you
mentioned that the wrong ports are reported?

> ~~~ snip ~~~
> First, there is a rpctrace bug (or, i'm misinterpreting the output):
> 
>   93<--142(pid1182)->dir_lookup ("etc/hostname" 1 0) = 0 1 ""
> 158<--160(pid1182)
> 
> Here, we do a dir_lookup and get port 158.
> 
>   158<--160(pid1182)->io_read_request (-1 255) = 0 "hurd-2013\n"
>   158<--160(pid1182)->io_readable_request () = 0 0
> 
> Here, we use it to do stuff with that file.
> 
> task130(pid1182)->mach_port_deallocate (pn{ 23}) = 0 
> 
> Here, we deallocate the port.  Note how the port name (pn?) says 23,
> even though it's clearly port 158 that is getting deallocated, b/c we
> get port 158 back from the next rpc:
> 
>   93<--142(pid1182)->dir_lookup ("lib/i386-gnu/libnss_files.so.2" 4194305 0) 
> = 0 1 ""158<--157(pid1182)
> 
> Now, the get to the real issue.  From the backtrace 
> (http://paste.debian.net/95410/)
> we know that it segfaults in mmap:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x019977b7 in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:132
> 132  intr-msg.c: No such file or directory.
> [...]
> Thread 4 (Thread 1205.4):
> #0  0x019977b7 in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:132
> err = 
> err = 
> user_option = 3
> user_timeout = 48
> m = 0x532370
> msgh_bits = 0
> remote_port = 268509186
> msgid = 21118
> save_data = 
> __PRETTY_FUNCTION__ = "_hurd_intr_rpc_mach_msg"
> #1  0x0005 in ?? ()
> No symbol table info available.
> #2  0x01a7a8dd in __mmap (addr=0x0, len=49880, prot=5, flags=33, fd=8, 
> offset=0)
> at ../sysdeps/mach/hurd/mmap.c:92
> __ulink = {resource = {next = 0x0, prevp = 0x2cfcc}, thread = {next = 
> 0x0, 
> prevp = 0x1b81c5c}, cleanup = 0x19a2c70 <_hurd_port_cleanup>, 
> cleanup_data = 0x99}
> __ctty_ulink = {resource = {next = 0x0, prevp = 0x19fc6bc 
> <_int_malloc+12>}, thread = {
> next = 0x17, prevp = 0x5}, cleanup = 0x0, cleanup_data = 0x700f2}
> __result = 
> descriptor = 0x1b5e467 <__io_map+7>

Re: -fuse-caller-save - Enable for MIPS

2014-04-26 Thread Tom de Vries

On 25-04-14 15:22, Richard Sandiford wrote:

Tom de Vries  writes:

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 45256e9..b61cd44 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -7027,11 +7027,17 @@ mips_expand_call (enum mips_call_type type, rtx result, 
rtx addr,
  {
rtx orig_addr, pattern, insn;
int fp_code;
+  rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);

fp_code = aux == 0 ? 0 : (int) GET_MODE (aux);
insn = mips16_build_call_stub (result, &addr, args_size, fp_code);
if (insn)
  {
+  if (TARGET_EXPLICIT_RELOCS
+ && TARGET_CALL_CLOBBERED_GP
+ && !find_reg_note (insn, REG_NORETURN, 0))
+   clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg);
+


I think this condition should go in mips_emit_call_insn instead,
so that we don't have 4 instances of it.  untyped_call could then
use mips_expand_call as well.



Richard,

Done.


Until now there was no real downside to using $6 for non-MIPS16 code.
Now that there is, it would probably be worth making it:

+#define POST_CALL_TMP_REG \
   (TARGET_MIPS16 ? GP_ARG_FIRST + 2 : PIC_OFFSET_TABLE_REGNUM)

and only adding the clobber for MIPS16.



Done.


diff --git a/gcc/testsuite/gcc.target/mips/fuse-caller-save.c 
b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c
new file mode 100644
index 000..1fd6c7d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/fuse-caller-save.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fuse-caller-save" } */
+/* { dg-skip-if "" { *-*-* }  { "*" } { "-Os" } } */


I might have asked this before, sorry, but why this skip?  Please add a brief
comment (in the string, if short enough).


I've reduced the amount of skips a bit, and added a comment why they are 
skipped.




+/* Testing -fuse-caller-save optimization option.  */
+
+static int __attribute__((noinline)) NOCOMPRESSION
+bar (int x)
+{
+  return x + 3;
+}
+
+int __attribute__((noinline)) NOCOMPRESSION
+foo (int y)
+{
+  return y + bar (y);
+}
+
+int NOCOMPRESSION
+main (void)
+{
+  return !(foo (5) == 13);
+}
+
+/* Check that there are only 2 stack-saves: r31 in main and foo.  */
+
+/* Check that there only 2 sw/sd.  */
+/* { dg-final { scan-assembler-times "(?n)s\[wd\]\t\\\$.*,.*\\(\\\$sp\\)" 2 } 
} */
+
+/* Check that the first caller-save register is unused.  */
+/* { dg-final { scan-assembler-not "\\\$16" } } */


It'd be good to avoid NOCOMPRESSION because the only case that really
needs the temporary register is MIPS16.  Please try putting the test
in a header file and reusing it for three tests, one each of MIPS16,
microMIPS and uncompressed.



Done, I think, I'm not 100% sure I understood what you wanted me to do here.

build and reg-tested on MIPS.

OK for trunk?

Thanks,
- Tom

2014-01-12  Radovan Obradovic  
Tom de Vries  

	* config/mips/mips-protos.h (mips_emit_call_insn): Declare.
	* config/mips/mips.h (POST_CALL_TMP_REG): Define.
	* config/mips/mips.c (mips_emit_call_insn): Remove static.  Use
	last_call_insn.  Add POST_CALL_TMP_REG clobber
	 (mips_split_call): Use POST_CALL_TMP_REG.
	(TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS): Redefine to true.
	* config/mips/mips.md (define_expand "untyped_call"): Use
	mips_emit_call_insn.

	* gcc.target/mips/mips.exp: Add use-caller-save to -ffoo/-fno-foo
	options.
	* gcc.target/mips/fuse-caller-save.h: New include file.
	* gcc.target/mips/fuse-caller-save.c: New test.
	* gcc.target/mips/fuse-caller-save-mips16.c: Same.
	* gcc.target/mips/fuse-caller-save-micromips.c: Same.
---
 gcc/config/mips/mips-protos.h  |  1 +
 gcc/config/mips/mips.c | 24 --
 gcc/config/mips/mips.h |  7 +++
 gcc/config/mips/mips.md|  4 +++-
 .../gcc.target/mips/fuse-caller-save-micromips.c   | 17 +++
 .../gcc.target/mips/fuse-caller-save-mip16.c   | 17 +++
 gcc/testsuite/gcc.target/mips/fuse-caller-save.c   | 17 +++
 gcc/testsuite/gcc.target/mips/fuse-caller-save.h   | 17 +++
 gcc/testsuite/gcc.target/mips/mips.exp |  1 +
 9 files changed, 98 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save-micromips.c
 create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save-mip16.c
 create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save.c
 create mode 100644 gcc/testsuite/gcc.target/mips/fuse-caller-save.h

diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index 3d59b7b..19ea919 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -232,6 +232,7 @@ extern bool mips_use_pic_fn_addr_reg_p (const_rtx);
 extern rtx mips_expand_call (enum mips_call_type, rtx, rtx, rtx, rtx, bool);
 extern void mips_split_call (rtx, rtx);
 extern bool mips_get_pic_call_symbol (rtx *, int);
+extern rtx mips_emit_call_insn (

[PATCH, testsuite]: Clean dump files

2014-04-26 Thread Uros Bizjak
Hello!

2014-04-26  Uros Bizjak  

* gcc.dg/tree-ssa/alias-30.c (dg-options): Dump only fre1 details.
* gcc.dg/vect/pr60505.c: Cleanup vect tree dump.
* g++.dg/ipa/devirt-27.C (dg-options): Remove -fdump-ipa-devirt.

Committed to mainline and 4.9 branch.

Uros.
Index: gcc.dg/tree-ssa/alias-30.c
===
--- gcc.dg/tree-ssa/alias-30.c  (revision 209806)
+++ gcc.dg/tree-ssa/alias-30.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre-details" } */
+/* { dg-options "-O -fdump-tree-fre1-details" } */
 
 extern int posix_memalign(void **memptr,
  __SIZE_TYPE__ alignment, __SIZE_TYPE__ size);
Index: gcc.dg/vect/pr60505.c
===
--- gcc.dg/vect/pr60505.c   (revision 209806)
+++ gcc.dg/vect/pr60505.c   (working copy)
@@ -10,3 +10,5 @@
 out[i] = (ovec[i] = in[i]);
   out[num] = ovec[num/2];
 }
+
+/* { dg-final { cleanup-tree-dump "vect" } } */
Index: g++.dg/ipa/devirt-27.C
===
--- g++.dg/ipa/devirt-27.C  (revision 209806)
+++ g++.dg/ipa/devirt-27.C  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-ipa-devirt -fdump-tree-optimized"  } */
+/* { dg-options "-O3 -fdump-tree-optimized"  } */
 struct A
  {
int a;


Re: [wide-int 5/5] Add dump () method for gdb debugging

2014-04-26 Thread Kenneth Zadeck
i am sorry, i missed the fact that the loop counts up but you were 
reversing the order in the indexes.


kenny
On 04/26/2014 04:26 AM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

don't you think that it would be easier to understand the number if you
printed it largest index first, as in the routines in wide-int-print.cc?

Yeah, that's what the patch does.  E.g. (for 32-bit HWI):

   [...,0x3,0x8000]

is 7 << 31.

Thanks,
Richard


kenny
On 04/25/2014 09:58 AM, Richard Sandiford wrote:

This patch adds a dump () method so that it's easier to read the
contents of the various wide-int types in gdb.  I've deliberately not
done any extension for "small_prec" cases because I think here
we want to see the raw values as much as possible.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc 2014-04-25 14:48:03.263213403 +0100
+++ gcc/wide-int.cc 2014-04-25 14:48:25.186333842 +0100
@@ -2130,3 +2130,9 @@ wi::only_sign_bit_p (const wide_int_ref
   void gt_ggc_mx (widest_int *) { }
   void gt_pch_nx (widest_int *, void (*) (void *, void *), void *) { }
   void gt_pch_nx (widest_int *) { }
+
+template void wide_int::dump () const;
+template void generic_wide_int  >::dump () const;
+template void generic_wide_int  >::dump () const;
+template void offset_int::dump () const;
+template void widest_int::dump () const;
Index: gcc/wide-int.h
===
--- gcc/wide-int.h  2014-04-25 14:34:54.823652619 +0100
+++ gcc/wide-int.h  2014-04-25 14:48:06.218229309 +0100
@@ -709,6 +709,9 @@ #define INCDEC_OPERATOR(OP, DELTA) \
   #undef ASSIGNMENT_OPERATOR
   #undef INCDEC_OPERATOR
   
+  /* Debugging functions.  */

+  void dump () const;
+
 static const bool is_sign_extended
   = wi::int_traits  >::is_sign_extended;
   };
@@ -859,6 +862,23 @@ generic_wide_int ::operator = (
 return *this;
   }
   
+/* Dump the contents of the integer to stderr, for debugging.  */

+template 
+void
+generic_wide_int ::dump () const
+{
+  unsigned int len = this->get_len ();
+  const HOST_WIDE_INT *val = this->get_val ();
+  unsigned int precision = this->get_precision ();
+  fprintf (stderr, "[");
+  if (len * HOST_BITS_PER_WIDE_INT < precision)
+fprintf (stderr, "...,");
+  for (unsigned int i = 0; i < len - 1; ++i)
+fprintf (stderr, HOST_WIDE_INT_PRINT_HEX ",", val[len - 1 - i]);
+  fprintf (stderr, HOST_WIDE_INT_PRINT_HEX "], precision = %d\n",
+  val[0], precision);
+}
+
   namespace wi
   {
 template <>




Re: [wide-int 4/5] Fix canonize handling of "small_prec" case

2014-04-26 Thread Kenneth Zadeck

this is fine.

kenny
On 04/25/2014 09:44 AM, Richard Sandiford wrote:

We should write back the sign-extended value.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc 2014-04-25 09:15:14.297359380 +0100
+++ gcc/wide-int.cc 2014-04-25 09:15:50.755637670 +0100
@@ -81,7 +81,7 @@ canonize (HOST_WIDE_INT *val, unsigned i
  
top = val[len - 1];

if (len * HOST_BITS_PER_WIDE_INT > precision)
-top = sext_hwi (top, precision % HOST_BITS_PER_WIDE_INT);
+val[len - 1] = top = sext_hwi (top, precision % HOST_BITS_PER_WIDE_INT);
if (top != 0 && top != (HOST_WIDE_INT)-1)
  return len;
  




Re: -fuse-caller-save - Collect register usage information

2014-04-26 Thread Tom de Vries

Eric,
Honza,

This patch adds analysis in pass_final to track which hard registers are set or
clobbered by the function body, and stores that information in a
struct cgraph_node, to be used in the fuse-caller-save optmization.

This is the updated version of the previously approved patch
submitted here (http://gcc.gnu.org/ml/gcc-patches/2013-03/msg01320.html ).
The changes are:
- using a new hook call_fusage_contains_non_callee_clobbers,
- incorporating minor review comments from Richard Sandiford
  ( http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01436.html ).

As part of the fuse-caller-save patch series, bootstrapped and reg-tested on 
x86_64, and build and reg-tested on MIPS.


Eric, non-cgraph part OK for trunk?

Honza, cgraph part OK for trunk?

Thanks,
- Tom
2013-04-29  Radovan Obradovic  
Tom de Vries  

	* cgraph.h (struct cgraph_node): Add function_used_regs,
	function_used_regs_initialized and function_used_regs_valid fields.
	* final.c: Move include of hard-reg-set.h to before rtl.h to declare
	find_all_hard_reg_sets.
	(collect_fn_hard_reg_usage, get_call_fndecl, get_call_cgraph_node)
	(get_call_reg_set_usage): New function.
	(rest_of_handle_final): Use collect_fn_hard_reg_usage.
---
 gcc/cgraph.h |   7 
 gcc/final.c  | 117 ++-
 gcc/regs.h   |   4 ++
 3 files changed, 127 insertions(+), 1 deletion(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 84fc1d9..d1f 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -408,6 +408,13 @@ public:
   /* Time profiler: first run of function.  */
   int tp_first_run;
 
+  /* Call unsaved hard registers really used by the corresponding
+ function (including ones used by functions called by the
+ function).  */
+  HARD_REG_SET function_used_regs;
+  /* Set if function_used_regs is valid.  */
+  unsigned function_used_regs_valid: 1;
+
   /* Set when decl is an abstract function pointed to by the
  ABSTRACT_DECL_ORIGIN of a reachable function.  */
   unsigned used_as_abstract_origin : 1;
diff --git a/gcc/final.c b/gcc/final.c
index 8c6f6ee..7b79059 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "tree.h"
 #include "varasm.h"
+#include "hard-reg-set.h"
 #include "rtl.h"
 #include "tm_p.h"
 #include "regs.h"
@@ -57,7 +58,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "conditions.h"
 #include "flags.h"
-#include "hard-reg-set.h"
 #include "output.h"
 #include "except.h"
 #include "function.h"
@@ -223,6 +223,7 @@ static int alter_cond (rtx);
 static int final_addr_vec_align (rtx);
 #endif
 static int align_fuzz (rtx, rtx, int, unsigned);
+static void collect_fn_hard_reg_usage (void);
 
 /* Initialize data in final at the beginning of a compilation.  */
 
@@ -4426,6 +4427,7 @@ rest_of_handle_final (void)
   assemble_start_function (current_function_decl, fnname);
   final_start_function (get_insns (), asm_out_file, optimize);
   final (get_insns (), asm_out_file, optimize);
+  collect_fn_hard_reg_usage ();
   final_end_function ();
 
   /* The IA-64 ".handlerdata" directive must be issued before the ".endp"
@@ -4724,3 +4726,116 @@ make_pass_clean_state (gcc::context *ctxt)
 {
   return new pass_clean_state (ctxt);
 }
+
+/* Collect hard register usage for the current function.  */
+
+static void
+collect_fn_hard_reg_usage (void)
+{
+  rtx insn;
+  int i;
+  struct cgraph_node *node;
+
+  if (!flag_use_caller_save
+  || !targetm.call_fusage_contains_non_callee_clobbers)
+return;
+
+  node = cgraph_get_node (current_function_decl);
+  gcc_assert (node != NULL);
+
+  for (insn = get_insns (); insn != NULL_RTX; insn = next_insn (insn))
+{
+  HARD_REG_SET insn_used_regs;
+
+  if (!NONDEBUG_INSN_P (insn))
+	continue;
+
+  find_all_hard_reg_sets (insn, &insn_used_regs, false);
+
+  if (CALL_P (insn)
+	  && !get_call_reg_set_usage (insn, &insn_used_regs, call_used_reg_set))
+	{
+	  CLEAR_HARD_REG_SET (node->function_used_regs);
+	  return;
+	}
+
+  IOR_HARD_REG_SET (node->function_used_regs, insn_used_regs);
+}
+
+  /* Be conservative - mark fixed and global registers as used.  */
+  IOR_HARD_REG_SET (node->function_used_regs, fixed_reg_set);
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+if (global_regs[i])
+  SET_HARD_REG_BIT (node->function_used_regs, i);
+
+#ifdef STACK_REGS
+  /* Handle STACK_REGS conservatively, since the df-framework does not
+ provide accurate information for them.  */
+
+  for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
+SET_HARD_REG_BIT (node->function_used_regs, i);
+#endif
+
+  node->function_used_regs_valid = 1;
+}
+
+/* Get the declaration of the function called by INSN.  */
+
+static tree
+get_call_fndecl (rtx insn)
+{
+  rtx note, datum;
+
+  if (!flag_use_caller_save)
+return NULL_TREE;
+
+  note = find_reg_note (insn, REG_CALL_DECL, NULL_RTX);
+  if (note == NULL_RTX)
+return NULL_

Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.

2014-04-26 Thread Eric Botcazou
> > 2014-03-21  James Greenhalgh  
> > 
> > * calls.c (initialize_argument_information): Always treat
> > PUSH_ARGS_REVERSED as 1, simplify code accordingly.
> > (expand_call): Likewise.
> > (emit_library_call_calue_1): Likewise.
> > * expr.c (PUSH_ARGS_REVERSED): Do not define.
> > (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify
> > code accordingly.
> 
> This is fine for the trunk at this point.

Are you sure that it's not a correctness issue for some targets though?

-- 
Eric Botcazou


Re: [SPARC] Fix PR target/60941

2014-04-26 Thread Eric Botcazou
> > Not clear to me, (2U << i) should be zero if the shift count is masked.
> 
> 2U << 31 is undefined behavior on those targets.

Precisely not, or else we are not talking about the same notion of masking.

-- 
Eric Botcazou


Re: GCC's -fsplit-stack disturbing Mach's vm_allocate

2014-04-26 Thread Justus Winter
Quoting Svante Signell (2014-04-26 13:59:57)
> > For reference, here are my notes about one of these crashes (Svante,
> > is this still current?):
> 
> Yes it is, thanks for your help so far. Is the rpctrace bug you
> mentioned that the wrong ports are reported?
> 
> > ~~~ snip ~~~
> > [...]
> > task130(pid1182)->mach_port_deallocate (pn{ 23}) = 0 
> > 
> > Here, we deallocate the port.  Note how the port name (pn?) says 23,
> > even though it's clearly port 158 that is getting deallocated [...]

Reading the source revealed that 'pn{ 23}' makes it explicit that this
is a port name, not a right.  And indeed:

routine mach_port_deallocate(
task: ipc_space_t;
name: mach_port_name_t);

For port rights, which we know are valid in the tracee, we translate
the name so that the name used by the tracee is displayed in the log.
A comment in the source correctly states:

/* Note that MACH_MSG_TYPE_PORT_NAME does not indicate a port right.
   It indicates a port name, i.e. just an integer--and we don't know
   what task that port name is meaningful in.  If it's meaningful in a
   traced task, then it refers to our intercepting port rather than
   the original port anyway.  */

We could try to be clever here, at least for the reference counting
RPC.

Justus


Re: [SPARC] Fix PR target/60941

2014-04-26 Thread Mikael Pettersson
Eric Botcazou writes:
 > > > Not clear to me, (2U << i) should be zero if the shift count is masked.
 > > 
 > > 2U << 31 is undefined behavior on those targets.
 > 
 > Precisely not, or else we are not talking about the same notion of masking.

I believe Jakub is referring to the following in the C standard:

"Bitwise shift operators
...
Semantics
...  If the value of the right operand ... is greater than or equal to the
width of the promoted left operand, the behavior is undefined."

So on 16-bit int systems you can't portably shift 2U by more than 15.

(I'm not worried about the HW blowing up, but GCC itself has a long history
of exploiting no-undefinedness assumptions.)

/Mikael


[PATCH] doc/install.texi: Parallel profiledbootstrap is supported on all maintained releases.

2014-04-26 Thread Markus Trippelsdorf
Parallel profiledbootstrap is supported on all maintained releases. So
just remove a misleading outdated sentence, that states the opposite,
from doc/install.texi. 

OK for trunk?

Thanks.

2014-04-26  Markus Trippelsdorf  

* doc/install.texi (Building with profile feedback): Remove
outdated sentence.

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 0b2b36575775..7851b0061b30 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2544,8 +2544,7 @@ Finally a @code{stagefeedback} compiler is built using 
the information collected
 
 Unlike standard bootstrap, several additional restrictions apply.  The
 compiler used to build @code{stage1} needs to support a 64-bit integral type.
-It is recommended to only use GCC for this.  Also parallel make is currently
-not supported since collisions in profile collecting may occur.
+It is recommended to only use GCC for this.
 
 @html
 
-- 
Markus


Re: [SPARC] Fix PR target/60941

2014-04-26 Thread Jakub Jelinek
On Sat, Apr 26, 2014 at 03:30:25PM +0200, Eric Botcazou wrote:
> > > Not clear to me, (2U << i) should be zero if the shift count is masked.
> > 
> > 2U << 31 is undefined behavior on those targets.
> 
> Precisely not, or else we are not talking about the same notion of masking.

Eh, C99, 6.5.7/3:
"If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the
behavior is undefined."

So, if you have int16 target, where unsigned int is 16-bit and UINT_MAX 65535,
then shift count must be >= 0 and < 16, therefore, 2U << 31 is undefined
behavior.

Jakub


Re: [Patch 1/3] Use manual regex algorithm switching

2014-04-26 Thread Jonathan Wakely

On 25/04/14 20:50 -0400, Tim Shen wrote:


* include/bits/regex.tcc (__regex_algo_impl<>): Remove
_GLIBCXX_REGEX_DFS_QUANTIFIERS_LIMIT and use
_GLIBCXX_REGEX_USE_THOMPSON_NFA instead.
* include/bits/regex_automaton.h: Remove quantifier counting variable.
* include/bits/regex_automaton.tcc (_State_base::_M_dot):
Adjust debug NFA dump.


This patch is OK, thanks very much.



Re: [COMMITTED] Fix debug/60438 -- i686 stack vs fp operations

2014-04-26 Thread Tom de Vries

OK if bootstrap succeeds?


With testing of the bootstrap build of the patch, I ran into the following 
regression compared to a reference bootstrap build without the patch:

...
FAIL: g++.dg/tsan/cond_race.C  -O2  output pattern test, is ==3087==WARNING: 
Program is run with unlimited stack size, which wouldn't work with Threa\

dSanitizer.
==3087==Re-execing with stack size limited to 33554432 bytes.
==
WARNING: ThreadSanitizer: heap-use-after-free (pid=3087)
  Read of size 8 at 0x7d18efc8 by thread T1:
#0 pthread_cond_signal 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.i\

nc:2266 (libtsan.so.0+0x00039b21)
#1 thr(void*) 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:20 
(cond_race.exe+0x1033\

)
  Previous write of size 8 at 0x7d18efc8 by main thread:
#0 operator delete(void*) 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:592 
(libtsan.so.0+0\

x000494b0)
#1 main 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:34 
(cond_race.exe+0x0ea0)

  Location is heap block of size 96 at 0x7d18efa0 allocated by main thread:
#0 operator new(unsigned long) 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:560 
(libtsan.s\

o.0+0x000496f2)
#1 main 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:25 
(cond_race.exe+0x0e12)

  Thread T1 (tid=3101, running) created by main thread at:
#0 pthread_create 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/libsanitizer/tsan/tsan_interceptors.cc:877 
(libtsan.so.0+0x000\

47c23)
#1 main 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:29 
(cond_race.exe+0x0e5a)
SUMMARY: ThreadSanitizer: heap-use-after-free 
/home/vries/gcc_versions/data/test-fix-expand-ldexp/with/src/gcc/testsuite/g++.dg/tsan/cond_race.C:20 
t\

hr(void*)
==
ThreadSanitizer: reported 1 warnings
, should match ThreadSanitizer: data race.*pthread_cond_signal.*
...

I've found the same failure here: 
http://gcc.gnu.org/ml/gcc-testresults/2014-01/msg00127.html, so I'm assuming 
it's a spurious failure.


I've committed to trunk and 4.9.

Thanks,
- Tom


Re: [Patch 2/3] Make regex's infinite loop detection accurate

2014-04-26 Thread Jonathan Wakely

On 25/04/14 18:03 -0400, Tim Shen wrote:


* include/bits/regex_executor.h: Add _M_rep_count to track how
many times this repeat node are visited.


"is visited" not "are visited"


@@ -151,6 +156,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  // character.
  std::unique_ptr>> _M_match_queue;
  // Used in BFS, indicating that which state is already visited.
+  std::vector>   _M_rep_count;


We don't need the std:: qualifiers here, but it doesn't do any harm.


  std::unique_ptr> _M_visited;
  _FlagT_M_flags;
  // To record current solution.
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc 
b/libstdc++-v3/include/bits/regex_executor.tcc
index 68a5e04..bb10a72 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -161,6 +161,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return false;
}

+  // __rep_count records how many times (__rep_count.second)
+  // this node are visited under certain input iterator
+  // (__rep_count.first). This prevent the executor from entering
+  // infinite loop by refusing to continue when it's already been
+  // visited more than twice. It's `twice` instead of `once` because
+  // we need to spare one more time for potential group capture.
+  template
+  template
+void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
+_M_rep_once_more(_StateIdT __i)
+{
+  const auto& __state = _M_nfa[__i];
+  auto& __rep_count = _M_rep_count[__i];
+  if (__rep_count.second == 0 || __rep_count.first != _M_current)
+   {
+ auto back = __rep_count;


This variable should be uglified.


+ __rep_count.first = _M_current;
+ __rep_count.second = 1;


Maybe a dumb question (I don't understand a lot of the regex code!)
but is it correct to set this to 1 in the case where
__rep_count.first != _M_current ?  Could that result in the count
going downwards from 2 to 1? (Maybe that's correct?)




Re: [SPARC] Fix PR target/60941

2014-04-26 Thread Eric Botcazou
> I believe Jakub is referring to the following in the C standard:
> 
> "Bitwise shift operators
> ...
> Semantics
> ...  If the value of the right operand ... is greater than or equal to the
> width of the promoted left operand, the behavior is undefined."
> 
> So on 16-bit int systems you can't portably shift 2U by more than 15.

Sure, but we're talking about targets for which the shift count is masked, not 
about the C standard.

-- 
Eric Botcazou


Re: [SPARC] Fix PR target/60941

2014-04-26 Thread Eric Botcazou
> So, if you have int16 target, where unsigned int is 16-bit and UINT_MAX
> 65535, then shift count must be >= 0 and < 16, therefore, 2U << 31 is
> undefined behavior.

Well, if the shift count is masked by the target, if will be masked according 
to the width of the register and the result will nevertheless be zero.

-- 
Eric Botcazou


Re: [PATCH] doc/install.texi: Parallel profiledbootstrap is supported on all maintained releases.

2014-04-26 Thread Gerald Pfeifer
On Sat, 26 Apr 2014, Markus Trippelsdorf wrote:
> Parallel profiledbootstrap is supported on all maintained releases. So
> just remove a misleading outdated sentence, that states the opposite,
> from doc/install.texi. 
> 
> OK for trunk?

Okay.  If this also applies to GCC 4.9 (which I think is the case),
you may want to apply it to the 4.9 branch as well.

Gerald


Re: Update doc/gimple.texi to reflect change from union to class hierarchy

2014-04-26 Thread Gerald Pfeifer
On Fri, 25 Apr 2014, David Malcolm wrote:
> Successfully generates HTML, info and pdf via appropriate make
> invocations; example of resulting HTML can be seen at the bottom of:
> 
> http://dmalcolm.fedorapeople.org/gcc/2014-04-25/Tuple-representation.html
> 
> The diagram is split over pages 178-180 of the generated PDF:
> 
> http://dmalcolm.fedorapeople.org/gcc/2014-04-25/gccint.pdf

When I just checked that PDF did not (yet?) have that table.

> but I don't think there's a good way to avoid page breaks there.

Agreed.

> Bootstrap in progress [do pure doc fixes require a bootstrap?]

Not generally, no.

> OK for trunk, assuming bootstraps? (and eventually for the 4.9 branch,
> after a few days?)

Yes and yes.

(I did not review the technical accuracy of the diagram in detail, but 
(a) micromanagement is not helpful and (b) the current documentation 
is plain wrong.)


The one thing that confused me a bit at first was the line
  +- gimple_statement_base
where I then thought all the others should move their indentation
left by four or so columns.  Which is not what you are trying to
state in fact.  How about omitting the "+-" from that one?

Gerald


Re: Reduce -flto -fprofile-generate memory use

2014-04-26 Thread Gerald Pfeifer
On Thu, 17 Apr 2014, Jan Hubicka wrote:
> + * opts.c (common_handle_option): Disable -fipa-reference coorectly
> + with -fuse-profile.

coorectly -> correctly

Gerald


[C PATCH] proposal to add new warning -Wsizeof-array-argument

2014-04-26 Thread Prathamesh Kulkarni
Hi,
Shall it a good idea to add new warning -Wsizeof-array-argument that
warns when sizeof is applied on parameter declared as an array ?
Similar to clang's -Wsizeof-array-argument:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20110613/042812.html
This was also reported as PR6940:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6940

I have attached a patch that adds the warning to C front-end.
I implemented it by adding a new member BOOL_BITFIELD is_array_parm to
tree_parm_decl. Not sure if that's a good approach.
Also should it be enabled by -Wall or -Wextra or only by
-Wsizeof-array-argument ?
Currently I have made it enabled by -Wall along with -Wsizeof-array-argument.

Bootstrapped on x86_64-unknown-linux-gnu.

[gcc]
* tree-core.h (tree_parm_decl): Add new member BOOL_BITFIELD is_array_parm
* tree.h (SET_PARM_DECL_IS_ARRAY): New macro.
(PARM_DECL_ARRAY_P): New macro.

[gcc/c]
* c-decl.c (push_parm_decl): Call SET_PARM_DECL_IS_ARRAY.
* c-typeck.c (c_expr_sizeof_expr): Add check for sizeof-array-argument warning.

[gcc/c-family]
* c.opt (-Wsizeof-array-argument): New option.

[gcc/testsuite/gcc.dg]
* sizeof-array-argument.c: New test-case.

Thanks and Regards,
Prathamesh
Index: gcc/tree-core.h
===
--- gcc/tree-core.h	(revision 209800)
+++ gcc/tree-core.h	(working copy)
@@ -1411,6 +1411,7 @@ struct GTY(()) tree_const_decl {
 struct GTY(()) tree_parm_decl {
   struct tree_decl_with_rtl common;
   rtx incoming_rtl;
+  BOOL_BITFIELD is_array_parm;
 };
 
 struct GTY(()) tree_decl_with_vis {
Index: gcc/tree.h
===
--- gcc/tree.h	(revision 209800)
+++ gcc/tree.h	(working copy)
@@ -1742,6 +1742,7 @@ extern void protected_set_expr_location
 #define TYPE_LANG_SPECIFIC(NODE) \
   (TYPE_CHECK (NODE)->type_with_lang_specific.lang_specific)
 
+
 #define TYPE_VALUES(NODE) (ENUMERAL_TYPE_CHECK (NODE)->type_non_common.values)
 #define TYPE_DOMAIN(NODE) (ARRAY_TYPE_CHECK (NODE)->type_non_common.values)
 #define TYPE_FIELDS(NODE) \
@@ -2258,6 +2259,12 @@ extern void decl_value_expr_insert (tree
 #define DECL_INCOMING_RTL(NODE) \
   (PARM_DECL_CHECK (NODE)->parm_decl.incoming_rtl)
 
+#define SET_PARM_DECL_IS_ARRAY(NODE, val) \
+  (PARM_DECL_CHECK (NODE)->parm_decl.is_array_parm = (val))
+
+#define PARM_DECL_ARRAY_P(NODE) \
+  (PARM_DECL_CHECK (NODE)->parm_decl.is_array_parm != 0)
+
 /* Nonzero for a given ..._DECL node means that no warnings should be
generated just because this node is unused.  */
 #define DECL_IN_SYSTEM_HEADER(NODE) \
Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c	(revision 209800)
+++ gcc/c/c-decl.c	(working copy)
@@ -4626,13 +4626,14 @@ push_parm_decl (const struct c_parm *par
 {
   tree attrs = parm->attrs;
   tree decl;
+  int is_array;
 
   decl = grokdeclarator (parm->declarator, parm->specs, PARM, false, NULL,
 			 &attrs, expr, NULL, DEPRECATED_NORMAL);
   decl_attributes (&decl, attrs, 0);
-
+  is_array = parm->declarator->kind == cdk_array;
+  SET_PARM_DECL_IS_ARRAY (decl, is_array); 
   decl = pushdecl (decl);
-
   finish_decl (decl, input_location, NULL_TREE, NULL_TREE, NULL_TREE);
 }
 
Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c	(revision 209800)
+++ gcc/c/c-typeck.c	(working copy)
@@ -2730,6 +2730,12 @@ c_expr_sizeof_expr (location_t loc, stru
   else
 {
   bool expr_const_operands = true;
+  if (warn_sizeof_array_argument && TREE_CODE (expr.value) == PARM_DECL && PARM_DECL_ARRAY_P (expr.value))
+	{
+	  warning_at (loc, 0, "sizeof on array parameter %<%s%> shall return size of %qT",
+		  IDENTIFIER_POINTER (DECL_NAME (expr.value)), expr.original_type); 
+	  inform (DECL_SOURCE_LOCATION (expr.value), "declared here");
+	}
   tree folded_expr = c_fully_fold (expr.value, require_constant_value,
    &expr_const_operands);
   ret.value = c_sizeof (loc, TREE_TYPE (folded_expr));
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 209800)
+++ gcc/c-family/c.opt	(working copy)
@@ -509,6 +509,9 @@ Warn about missing fields in struct init
 Wsizeof-pointer-memaccess
 C ObjC C++ ObjC++ Var(warn_sizeof_pointer_memaccess) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 
+Wsizeof-array-argument
+C Var(warn_sizeof_array_argument) Warning LangEnabledBy(C,Wall)
+
 Wsuggest-attribute=format
 C ObjC C++ ObjC++ Var(warn_suggest_attribute_format) Warning
 Warn about functions which might be candidates for format attributes
Index: gcc/testsuite/gcc.dg/sizeof-array-argument.c
===
--- gcc/testsuite/gcc.dg/sizeof-array-argument.c	(revision 0)
+++ gcc/testsuite/gcc.dg/sizeof-array-argument.c	(working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-W

Re: [DOC PATCH] Rewrite docs for inline asm

2014-04-26 Thread Gerald Pfeifer
The perfect is the enemy of the good.  

>From all I have seen and heard, this rewrite is a clear improvement 
over the status quo.  So I am going to review and approve it wearing
my doc maintainer hat, deferring to and relying on Andrew and Richard 
and their deep expertise on the technical side.

Please take my comments below into account for an updated patch, and 
once Andrew and Richard have signed of, this is then good to commit.

(Patches of this size really are tough to review.  That surely has
contributed to the delay getting this in the tree most significantly.
In fact, I fell asleep over my notebook thrice reviewing this last
evening, though traveling from US West coast to Central Europe without
sleep during the flight probably did not exactly help, either. :-)

On Fri, 25 Apr 2014, James Greenhalgh wrote:
>> +@menu
>> +* Basic Asm::  Inline assembler with no operands.
>> +* Extended Asm::   Inline assembler with operands.
>> +* Constraints::Constraints for asm operands

Should this be "Asm operands" based on the other references here
or, better @code{asm} in all cases here?

>> +* Explicit Reg Vars::  Defining variables residing in specified registers.

Should this be "Register" instead of just "Reg"?
And "Variables" instead of just "Vars"?

>> +* Size of an asm:: How GCC calculates the size of an asm block.

@code{asm} ?

>> +@subsection Basic Asm - Assembler Instructions with No Operands

--- instead of -

>> +To create headers compatible with ISO C, write @code{__asm__} instead of 
>> +@code{asm} (@pxref{Alternate Keywords}).

Here it's just ISO C, whereas in case of __inline__ you had ISO C90.
Would it make sense to just use ISO C throughout?

>> +By definition, a Basic @code{asm} statement is one with no operands. 
>> +@code{asm} statements that contain one or more colons (used to delineate 
>> +operands) are considered to be Extended (for example, @code{asm("int $3")} 
>> +is Basic, and @code{asm("int $3" : )} is Extended). @xref{Extended Asm}.

At this point the reader does not yet know the concept of those
colons, does she?  So that can be seen as a bit confusing.

>> +@subsubheading Parameters
>> +@emph{AssemblerInstructions}
>> +@*
>> +This is a literal string that specifies the assembler code. The string can 

Just "assembler code", without "the", would be my take, but let's see
what native speakers so.

>> +The GCC compiler does not parse the assembler instructions themselves and 

"GCC does".  GCC stands for GNU Compiler Collection, and GNU Compiler
Collection Compiler may be a bit confusing. ;-)

>> +You may place multiple assembler instructions together in a single 
>> @code{asm} 
>> +string, separated by the characters normally used in assembly code for the 
>> +system. 

Might "by the same characters normally...the target system" be more clear?

>> +A combination that works in most places is a newline to break the 
>> +line, plus a tab character to move to the instruction field (written 
>> +as "\n\t").

Will everyone know what an instruction field is?  I'm not sure it's
that common of a term.

>> +Do not expect a sequence of @code{asm} statements to remain perfectly 
>> +consecutive after compilation. If certain instructions need to remain 
>> +consecutive in the output, put them in a single multi-instruction asm 
>> +statement. Note that GCC's optimizer can move @code{asm} statements 
>> +relative to other code, including across jumps.

optimizers (plural)

>> +GCC's optimizer does not know about these jumps, and therefore cannot take 
>> +account of them when deciding how to optimize. Jumps from @code{asm} to C 
>> +labels are only supported in Extended @code{asm}.

How about just saying "GCC does not know..:"?

>> +@subsubheading Remarks
>> +Using Extended @code{asm} will typically produce smaller, safer, and more 
>> +efficient code, and in most cases it is a better solution.

Why?

>> When writing inline assembly language outside C functions, however, 
>> you must use Basic @code{asm}. Extended @code{asm} statements have to 
>> be inside a C function.

Might that be "outside of C functions"?  Native speakers?

The way this is phrased sounds a bit negative.  Is this really a
drawback?

>> +Under certain circumstances, GCC may duplicate (or remove duplicates of) 
>> your 
>> +asm code as part of optimization. This can lead to unexpected duplicate 
>> +symbol errors during compilation if your asm code defines symbols or labels.

"as part of optimization" -> "when optimizing"

@code{asm}

>> +Safely accessing C data and calling functions from Basic @code{asm} is more 
>> +complex than it may appear. To access C data, it is better to use Extended 
>> +@code{asm}.

Here we are, this looks like (part of) the answer to my question
"Why?" above. :-)

>> +Since GCC does not parse the AssemblerInstructions, it has no visibility of 
 ^
Something wrong here.

>> +any symbols it references. This may result in th

Re: [PATCH 01/89] Const-correctness fixes for some gimple accessors

2014-04-26 Thread Gerald Pfeifer
On Mon, 21 Apr 2014, David Malcolm wrote:
> It was pointed out to me off-list that this patch series lacks
> documentation changes.  I'm working on fixing that, though am not sure I
> want to fill everyone inboxes with an updated set of patches yet.
> Should I send a combined patch for the documentation changes?  (I can
> break it up and merge it into the individual changes on commit, or if
> these need editing).
> 
> In any case I fixed up the corresponding entries in gcc/doc/gimple.texi,
> double-checked the bootstrap/regtest/HTML generation of this one (on top
> of r209545), and committed it to trunk as r209548.  I'm attaching what I
> actually committed.

You certainly can consider any doc changes that mirror code changes
as pre-approved.

Based on what Jeff said, about changes like this one (const-correctness)
going in right away, perhaps get these off your table together with their
associated documentation aspects?  And then do one doc change for all the
rest -- or individual ones, I don't have a strong preference.

Thanks for working on this!

Gerald


Re: [Patch 2/3] Make regex's infinite loop detection accurate

2014-04-26 Thread Tim Shen
On Sat, Apr 26, 2014 at 1:00 PM, Jonathan Wakely  wrote:
> Maybe a dumb question (I don't understand a lot of the regex code!)
> but is it correct to set this to 1 in the case where
> __rep_count.first != _M_current ?  Could that result in the count
> going downwards from 2 to 1? (Maybe that's correct?)

As I said in the comment, __rep_count records how many times
(__rep_count.second) this node is visited *under certain input
iterator* (__rep_count.first).

That is to say, if the current iterator (_M_current) is not changing,
but the executor keeps visiting this node, times of visit needs to be
recorded. Once the _M_current changed, the count should be reset to 0.
We simply set it to 1 for the "set to 0 then increase by 1"
operations.

The idea behind this is that, for example, regex "(?:)*" will form a
self-epsilon-circle in the NFA graph. While it's executed, an infinite
loop may be entered because this circle doesn't consumes any input
(aka, the current iterator is not changing). So for given node, we
need to prevent from entering infinite loop by checking the iterator
since last visited. If the iterator didn't change, we shall refuse to
follow the edge-on-the-circle.

There's one more question: we shouldn't simply refuse to follow the
edge that last visited under the same iterator, because visiting some
node (begin capture, end capture) on the circle has side effects. To
make sure these side effects happen, one more time of reentering is
permitted. That is why we set the counter and permitted entering the
node for at most 2 times.

About the overall regex code, I think I can compose a blog entry to
explain it, if necessary?


-- 
Regards,
Tim Shen
commit b188832f9bd85c71ef70693859e721974163c24b
Author: tim 
Date:   Fri Apr 25 01:50:11 2014 -0400

2014-04-25  Tim Shen  

* include/bits/regex_executor.h: Add _M_rep_count to track how
many times this repeat node are visited.
* include/bits/regex_executor.tcc (_Executor<>::_M_rep_once_more,
_Executor<>::_M_dfs): Use _M_rep_count to prevent entering
infinite loop.

diff --git a/libstdc++-v3/include/bits/regex_executor.h 
b/libstdc++-v3/include/bits/regex_executor.h
index 708c78e..c110b88 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -73,6 +73,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_results(__results),
   _M_match_queue(__dfs_mode ? nullptr
 : new vector>()),
+  _M_rep_count(_M_nfa.size()),
   _M_visited(__dfs_mode ? nullptr : new vector(_M_nfa.size())),
   _M_flags((__flags & regex_constants::match_prev_avail)
   ? (__flags
@@ -104,6 +105,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 private:
   template
void
+   _M_rep_once_more(_StateIdT);
+
+  template
+   void
_M_dfs(_StateIdT __start);
 
   template
@@ -149,9 +154,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _ResultsVec&  _M_results;
   // Used in BFS, saving states that need to be considered for the next
   // character.
-  std::unique_ptr>> _M_match_queue;
+  unique_ptr>>  _M_match_queue;
   // Used in BFS, indicating that which state is already visited.
-  std::unique_ptr> _M_visited;
+  vector>_M_rep_count;
+  unique_ptr>  _M_visited;
   _FlagT_M_flags;
   // To record current solution.
   _StateIdT _M_start_state;
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc 
b/libstdc++-v3/include/bits/regex_executor.tcc
index 68a5e04..0daf211 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -161,6 +161,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return false;
 }
 
+  // __rep_count records how many times (__rep_count.second)
+  // this node is visited under certain input iterator
+  // (__rep_count.first). This prevent the executor from entering
+  // infinite loop by refusing to continue when it's already been
+  // visited more than twice. It's `twice` instead of `once` because
+  // we need to spare one more time for potential group capture.
+  template
+  template
+void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
+_M_rep_once_more(_StateIdT __i)
+{
+  const auto& __state = _M_nfa[__i];
+  auto& __rep_count = _M_rep_count[__i];
+  if (__rep_count.second == 0 || __rep_count.first != _M_current)
+   {
+ auto __back = __rep_count;
+ __rep_count.first = _M_current;
+ __rep_count.second = 1;
+ _M_dfs<__match_mode>(__state._M_alt);
+ __rep_count = __back;
+   }
+  else
+   {
+ if (__rep_count.second < 2)
+   {
+ __rep_count.second++;
+ _M_dfs<__match_mode>(_