date:20170601

[testsuite]MIPS remove duplicate div-x test

2017-06-01 Thread Paul Hua

Hi,

There are duplicate testcases in the gcc.target/mips directory:

div-5.c is the same as div-9.c.
div-6.c is the same as div-10.c.
div-7.c is the same as div-11.c.
div-8.c is the same as div-12.c.

Is this deliberate?

If not, the attached patch fixes this.


Paul.

***ChangeLog***

2017-06-01  Chenghua Xu

Remove duplicate div-x testcases.
* gcc.target/mips/div-9.c: Delete.
* gcc.target/mips/div-10.c: Ditto.
* gcc.target/mips/div-11.c: Ditto.
* gcc.target/mips/div-12.c: Ditto.
diff --git a/gcc/testsuite/gcc.target/mips/div-10.c b/gcc/testsuite/gcc.target/mips/div-10.c
deleted file mode 100644
index fb8953d..000
--- a/gcc/testsuite/gcc.target/mips/div-10.c
+++ /dev/null
@@ -1,12 +0,0 @@
-/* { dg-options "-mgp64 (-mips16)" } */
-/* { dg-final { scan-assembler "\tdivu\t" } } */
-/* { dg-final { scan-assembler "\tmflo\t" } } */
-/* { dg-final { scan-assembler-not "\tmfhi\t" } } */
-
-typedef unsigned int SI __attribute__((mode(SI)));
-
-MIPS16 SI
-f (SI x, SI y)
-{
-  return x / y;
-}
diff --git a/gcc/testsuite/gcc.target/mips/div-11.c b/gcc/testsuite/gcc.target/mips/div-11.c
deleted file mode 100644
index ff12929..000
--- a/gcc/testsuite/gcc.target/mips/div-11.c
+++ /dev/null
@@ -1,12 +0,0 @@
-/* { dg-options "-mgp64 (-mips16)" } */
-/* { dg-final { scan-assembler "\tdiv\t" } } */
-/* { dg-final { scan-assembler-not "\tmflo\t" } } */
-/* { dg-final { scan-assembler "\tmfhi\t" } } */
-
-typedef int SI __attribute__((mode(SI)));
-
-MIPS16 SI
-f (SI x, SI y)
-{
-  return x % y;
-}
diff --git a/gcc/testsuite/gcc.target/mips/div-12.c b/gcc/testsuite/gcc.target/mips/div-12.c
deleted file mode 100644
index 57866ce..000
--- a/gcc/testsuite/gcc.target/mips/div-12.c
+++ /dev/null
@@ -1,12 +0,0 @@
-/* { dg-options "-mgp64 (-mips16)" } */
-/* { dg-final { scan-assembler "\tdivu\t" } } */
-/* { dg-final { scan-assembler-not "\tmflo\t" } } */
-/* { dg-final { scan-assembler "\tmfhi\t" } } */
-
-typedef unsigned int SI __attribute__((mode(SI)));
-
-MIPS16 SI
-f (SI x, SI y)
-{
-  return x % y;
-}
diff --git a/gcc/testsuite/gcc.target/mips/div-9.c b/gcc/testsuite/gcc.target/mips/div-9.c
deleted file mode 100644
index 294cc7f..000
--- a/gcc/testsuite/gcc.target/mips/div-9.c
+++ /dev/null
@@ -1,12 +0,0 @@
-/* { dg-options "-mgp64 (-mips16)" } */
-/* { dg-final { scan-assembler "\tdiv\t" } } */
-/* { dg-final { scan-assembler "\tmflo\t" } } */
-/* { dg-final { scan-assembler-not "\tmfhi\t" } } */
-
-typedef int SI __attribute__((mode(SI)));
-
-MIPS16 SI
-f (SI x, SI y)
-{
-  return x / y;
-}

Re: [PATCH v2, rs6000] Fold vector absolutes in GIMPLE

2017-06-01 Thread Richard Biener

On Wed, May 31, 2017 at 9:38 PM, Will Schmidt  wrote:
> Hi,
>
> Add support for early expansion of vector absolute built-ins.
>
> [V2] Per reviews and feedback, skip the early folding for
> integral types based on a check against TYPE_OVERFLOW_WRAPS(arg0).
>
> Added test variants to exercise the -fwrapv option during
> this folding.
>
> OK for trunk?  (bootstraps running, pending review).

Looks good to me now.
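
For anyone following the V2 change, here is a minimal scalar sketch of why the integral case is gated on TYPE_OVERFLOW_WRAPS.  It assumes that one signed-char lane of vec_abs behaves like max (x, 0 - x) with a modulo-256 subtract, which is what the vsububm/vmaxsb sequence expected by the new tests suggests.  The function name wrap_abs8 is made up for illustration and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one signed-char lane: negate modulo 256
   (as a modulo subtract would) and take the signed maximum.  */
static int8_t
wrap_abs8 (int8_t x)
{
  /* 0 - x computed in unsigned arithmetic, so it wraps instead of
     overflowing.  */
  uint8_t neg_bits = (uint8_t) (0u - (uint8_t) x);
  /* Converting 128 back to int8_t is implementation-defined in ISO C;
     GCC documents it as wrapping, giving -128.  */
  int8_t neg = (int8_t) neg_bits;
  /* Away from the minimum value this is ordinary negation.  */
  assert (x == INT8_MIN || neg == (int8_t) -x);
  return x > neg ? x : neg;
}
```

With this model wrap_abs8 (-128) stays -128, a result that only makes sense under wrapping semantics, so folding to ABS_EXPR without -fwrapv could let later passes assume a non-negative result that the instruction sequence does not guarantee.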

> [gcc]
>
> 2017-05-31  Will Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
> for early expansion of vector absolute builtins.
>
> [gcc/testsuite]
>
> 2017-05-31  Will Schmidt  
>
> * gcc.target/powerpc/fold-vec-abs-char.c: New.
> * gcc.target/powerpc/fold-vec-abs-floatdouble.c: New.
> * gcc.target/powerpc/fold-vec-abs-int.c: New.
> * gcc.target/powerpc/fold-vec-abs-longlong.c: New.
> * gcc.target/powerpc/fold-vec-abs-short.c: New.
> * gcc.target/powerpc/fold-vec-abs-char-fwrapv.c: New.
> * gcc.target/powerpc/fold-vec-abs-int-fwrapv.c: New.
> * gcc.target/powerpc/fold-vec-abs-longlong-fwrapv.c: New.
> * gcc.target/powerpc/fold-vec-abs-short-fwrapv.c: New.
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index dac673c..46d281a 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -17333,6 +17333,24 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gsi_replace (gsi, g, true);
> return true;
>}
> +/* flavors of vec_abs. */
> +case ALTIVEC_BUILTIN_ABS_V16QI:
> +case ALTIVEC_BUILTIN_ABS_V8HI:
> +case ALTIVEC_BUILTIN_ABS_V4SI:
> +case ALTIVEC_BUILTIN_ABS_V4SF:
> +case P8V_BUILTIN_ABS_V2DI:
> +case VSX_BUILTIN_XVABSDP:
> +  {
> +   arg0 = gimple_call_arg (stmt, 0);
> +   if ( INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE(arg0)))
> +   && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE(arg0))))
> + return false;
> +   lhs = gimple_call_lhs (stmt);
> +   gimple *g = gimple_build_assign (lhs, ABS_EXPR, arg0);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_replace (gsi, g, true);
> +   return true;
> +  }
>  default:
>break;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c
> new file mode 100644
> index 000..739f06e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char-fwrapv.c
> @@ -0,0 +1,18 @@
> +/* Verify that overloaded built-ins for vec_abs with char
> +   inputs produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-maltivec -O2 -fwrapv" } */
> +
> +#include 
> +
> +vector signed char
> +test2 (vector signed char x)
> +{
> +  return vec_abs (x);
> +}
> +
> +/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
> +/* { dg-final { scan-assembler-times "vsububm" 1 } } */
> +/* { dg-final { scan-assembler-times "vmaxsb" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
> new file mode 100644
> index 000..239c919
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-char.c
> @@ -0,0 +1,18 @@
> +/* Verify that overloaded built-ins for vec_abs with char
> +   inputs produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-maltivec -O2" } */
> +
> +#include 
> +
> +vector signed char
> +test2 (vector signed char x)
> +{
> +  return vec_abs (x);
> +}
> +
> +/* { dg-final { scan-assembler-times "vspltisw|vxor" 1 } } */
> +/* { dg-final { scan-assembler-times "vsububm" 1 } } */
> +/* { dg-final { scan-assembler-times "vmaxsb" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c
> new file mode 100644
> index 000..1a08618
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-floatdouble.c
> @@ -0,0 +1,23 @@
> +/* Verify that overloaded built-ins for vec_abs with float and
> +   double inputs for VSX produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-mvsx -O2" } */
> +
> +#include 
> +
> +vector float
> +test1 (vector float x)
> +{
> +  return vec_abs (x);
> +}
> +
> +vector double
> +test2 (vector double x)
> +{
> +  return vec_abs (x);
> +}
> +
> +/* { dg-final { scan-assembler-times "xvabssp" 1 } } */
> +/* { dg-final { scan-assembler-times "xvabsdp" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
> new file mode 100644
> index 000..34dead4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-abs-int-fwrapv.c
> @@ -0,0 +1,18 @@

Re: [PATCH, rs6000] Fold vector shifts in GIMPLE

2017-06-01 Thread Richard Biener

On Wed, May 31, 2017 at 10:01 PM, Will Schmidt
 wrote:
> Hi,
>
> Add support for early expansion of vector shifts.  Including
> vec_sl (shift left), vec_sr (shift right), vec_sra (shift
> right algebraic), vec_rl (rotate left).
> Part of this includes adding the vector shift right instructions to
> the list of those instructions having an unsigned second argument.
>
> The VSR (vector shift right) folding is a bit more complex than
> the others. This is due to requiring arg0 be unsigned for an algebraic
> shift before the gimple RSHIFT_EXPR assignment is built.

Jakub, do we sanitize that undefinedness of left shifts of negative values
and/or overflow of left shift of nonnegative values?

Will, how is that defined in the intrinsics operation?  It might need similar
treatment as the abs case.

[I'd rather make the negative left shift case implementation defined
given C and C++ standards do not agree to 100% AFAIK]

Richard.
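
To make the concern concrete, here is a scalar sketch of one 32-bit lane of a vector shift left that stays defined for negative inputs and oversized counts.  It assumes the hardware uses only the low 5 bits of each lane's count (my reading of the ISA, not something stated in this thread), and lane_shl32 is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one word lane: reduce the count modulo the
   element width and shift the bit pattern as unsigned.  */
static int32_t
lane_shl32 (int32_t x, uint32_t count)
{
  uint32_t n = count & 31u;          /* only the low 5 bits are used */
  assert (n < 32);                   /* so the shift count is in range */
  uint32_t bits = (uint32_t) x << n; /* unsigned shift: no UB on overflow */
  /* Conversion back to int32_t is implementation-defined in ISO C;
     GCC documents it as modulo 2^32.  */
  return (int32_t) bits;
}
```

A plain signed LSHIFT_EXPR on arg0 would not carry these guarantees at the language level, which is why the shift may need the same kind of care as the abs case.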

> [gcc]
>
> 2017-05-26  Will Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
> for early expansion of vector shifts (sl,sr,sra,rl).
> (builtin_function_type): Add vector shift right instructions
> to the unsigned argument list.
>
> [gcc/testsuite]
>
> 2017-05-26  Will Schmidt  
>
> * testsuite/gcc.target/powerpc/fold-vec-shift-char.c: New.
> * testsuite/gcc.target/powerpc/fold-vec-shift-int.c: New.
> * testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c: New.
> * testsuite/gcc.target/powerpc/fold-vec-shift-short.c: New.
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 8adbc06..6ee0bfd 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -17408,6 +17408,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gsi_replace (gsi, g, true);
> return true;
>}
> +/* Flavors of vec_rotate_left . */
> +case ALTIVEC_BUILTIN_VRLB:
> +case ALTIVEC_BUILTIN_VRLH:
> +case ALTIVEC_BUILTIN_VRLW:
> +case P8V_BUILTIN_VRLD:
> +  {
> +   arg0 = gimple_call_arg (stmt, 0);
> +   arg1 = gimple_call_arg (stmt, 1);
> +   lhs = gimple_call_lhs (stmt);
> +   gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_replace (gsi, g, true);
> +   return true;
> +  }
> +  /* Flavors of vector shift right algebraic.  vec_sra{b,h,w} -> vsra{b,h,w}. */
> +case ALTIVEC_BUILTIN_VSRAB:
> +case ALTIVEC_BUILTIN_VSRAH:
> +case ALTIVEC_BUILTIN_VSRAW:
> +case P8V_BUILTIN_VSRAD:
> +  {
> +   arg0 = gimple_call_arg (stmt, 0);
> +   arg1 = gimple_call_arg (stmt, 1);
> +   lhs = gimple_call_lhs (stmt);
> +   gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_replace (gsi, g, true);
> +   return true;
> +  }
> +   /* Flavors of vector shift left.  builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
> +case ALTIVEC_BUILTIN_VSLB:
> +case ALTIVEC_BUILTIN_VSLH:
> +case ALTIVEC_BUILTIN_VSLW:
> +case P8V_BUILTIN_VSLD:
> +  {
> +   arg0 = gimple_call_arg (stmt, 0);
> +   arg1 = gimple_call_arg (stmt, 1);
> +   lhs = gimple_call_lhs (stmt);
> +   gimple *g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_replace (gsi, g, true);
> +   return true;
> +  }
> +/* Flavors of vector shift right. */
> +case ALTIVEC_BUILTIN_VSRB:
> +case ALTIVEC_BUILTIN_VSRH:
> +case ALTIVEC_BUILTIN_VSRW:
> +case P8V_BUILTIN_VSRD:
> +  {
> +   arg0 = gimple_call_arg (stmt, 0);
> +   arg1 = gimple_call_arg (stmt, 1);
> +   lhs = gimple_call_lhs (stmt);
> +   gimple *g;
> +   /* convert arg0 to unsigned */
> +   arg0 = convert(unsigned_type_for(TREE_TYPE(arg0)),arg0);
> +   tree arg0_uns = create_tmp_reg_or_ssa_name(unsigned_type_for(TREE_TYPE(arg0)));
> +   g = gimple_build_assign(arg0_uns,arg0);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +   /* convert lhs to unsigned and do the shift.  */
> +   tree lhs_uns = create_tmp_reg_or_ssa_name(unsigned_type_for(TREE_TYPE(lhs)));
> +   g = gimple_build_assign (lhs_uns, RSHIFT_EXPR, arg0_uns, arg1);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +   /* convert lhs back to a signed type for the return. */
> +   lhs_uns = convert(signed_type_for(TREE_TYPE(lhs)),lhs_uns);
> +   g = gimple_build_assign(lhs,lhs_uns);
> +   gimple_set_location (g, gimple_location (stmt));
> +   gsi_replace (gsi, g, true);
> +   return true;
> +  }
>  default:
>break;
>  }
> @@ -19128,6 +19198,14 @@ builtin_function_type (machine_mod

Re: [PATCH v2] Implement no_sanitize function attribute

2017-06-01 Thread Bernhard Reutner-Fischer

On 31 May 2017 14:25:09 CEST, "Martin Liška"  wrote:

>I've got written that on my TODO list. Will work on that some time in
>the stage1.

BTW, may I ask you to put it below https://gcc.gnu.org/PR65534 (the tailcall
resp. IPA-ICF thing)? :-)

Many TIA and cheers,

Re: Handle unpropagated assignments in SLP

2017-06-01 Thread Richard Biener

On Thu, Jun 1, 2017 at 8:45 AM, Richard Sandiford
 wrote:
> Some of the SVE patches extend SLP to predicated operations created by
> ifcvt.  However, ifcvt currently forces the mask into a temporary:
>
> mask = ifc_temp_var (TREE_TYPE (mask), mask, &gsi);
>
> and at the moment SLP doesn't handle simple assignments like:
>
>SSA_NAME = SSA_NAME
>SSA_NAME = 
>
> (It does of course handle:
>
>SSA_NAME = SSA_NAME op SSA_NAME
>SSA_NAME = SSA_NAME op )
>
> I realise copy propagation should usually ensure that these simple
> assignments don't occur, but normal loop vectorisation handles them
> just fine, and SLP does too once we get over the initial validity check.
> I thought this patch might be useful even if we decide that we don't want
> ifcvt to create a temporary mask in such cases.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Thanks,
> Richard
>
>
> 2017-06-01  Richard Sandiford  
>
> gcc/
> * tree-vect-slp.c (vect_build_slp_tree_1): Allow mixtures of SSA
> names and constants, without treating them as separate operations.
> Explicitly reject stores.
>
> gcc/testsuite/
> * gcc.dg/vect/slp-temp-1.c: New test.
> * gcc.dg/vect/slp-temp-2.c: Likewise.
> * gcc.dg/vect/slp-temp-3.c: Likewise.
>
> Index: gcc/tree-vect-slp.c
> ===
> --- gcc/tree-vect-slp.c 2017-05-18 07:51:12.387750673 +0100
> +++ gcc/tree-vect-slp.c 2017-06-01 07:21:44.094320070 +0100
> @@ -671,6 +671,13 @@ vect_build_slp_tree_1 (vec_info *vinfo,
>first_op1 = gimple_assign_rhs2 (stmt);
>  }
> }
> +  else if ((TREE_CODE_CLASS (rhs_code) == tcc_constant
> +   || rhs_code == SSA_NAME)
> +  && (TREE_CODE_CLASS (first_stmt_code) == tcc_constant
> +  || first_stmt_code == SSA_NAME))
> +   /* Merging two simple rvalues is OK and doesn't count as two
> +  operations.  */
> +   ;


But this doesn't help in the case one stmt has the copy/constant
propagated and one not ...

>else
> {
>   if (first_stmt_code != rhs_code
> @@ -800,11 +807,14 @@ vect_build_slp_tree_1 (vec_info *vinfo,
> }
>
>   /* Not memory operation.  */
> - if (TREE_CODE_CLASS (rhs_code) != tcc_binary
> - && TREE_CODE_CLASS (rhs_code) != tcc_unary
> - && TREE_CODE_CLASS (rhs_code) != tcc_expression
> - && TREE_CODE_CLASS (rhs_code) != tcc_comparison
> - && rhs_code != CALL_EXPR)
> + if (REFERENCE_CLASS_P (lhs)
> + || (TREE_CODE_CLASS (rhs_code) != tcc_binary
> + && TREE_CODE_CLASS (rhs_code) != tcc_unary
> + && TREE_CODE_CLASS (rhs_code) != tcc_expression
> + && TREE_CODE_CLASS (rhs_code) != tcc_comparison
> + && TREE_CODE_CLASS (rhs_code) != tcc_constant
> + && rhs_code != CALL_EXPR
> + && rhs_code != SSA_NAME))

I think this whole block is dead code ... we can't ever visit stores and the
only case the rhs_code == tcc_reference check above would miss is
plain decls (but non-indexed decls should be rejected by dataref analysis
already).

I think a better fix is to ensure we do not have non-copy/constant propagated
IL fed into the vectorizer.

Richard.
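
For reference, the slp-temp-1.c GIMPLE test quoted below corresponds roughly to the following plain C.  This is a hand-written approximation, not generated from the test, and store_pairs is a made-up name.  The temp_0/temp_1 copies are the "SSA_NAME = constant" assignments that copy/constant propagation would normally remove; the test keeps them by passing -fno-tree-copy-prop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain-C equivalent of f () in slp-temp-1.c.  Like the
   GIMPLE version, the body runs once before n is checked.  */
static void
store_pairs (int *x, int n)
{
  assert (x != NULL);
  int i = 0;
  do
    {
      int temp_0 = 1;        /* simple assignment of a constant */
      int temp_1 = 3;        /* likewise for the second group member */
      x[i * 2] = temp_0;     /* the two stores form the SLP group */
      x[i * 2 + 1] = temp_1;
      ++i;
    }
  while (n > i);
}
```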

> {
>   if (dump_enabled_p ())
> {
> Index: gcc/testsuite/gcc.dg/vect/slp-temp-1.c
> ===
> --- /dev/null   2017-06-01 07:09:35.344016119 +0100
> +++ gcc/testsuite/gcc.dg/vect/slp-temp-1.c  2017-06-01 07:39:24.406603119 +0100
> @@ -0,0 +1,71 @@
> +/* { dg-do run { target { lp64 || ilp32 } } } */
> +/* { dg-additional-options "-fgimple -fno-tree-copy-prop" } */
> +
> +void __GIMPLE (startwith ("loop"))
> +f (int *x, int n)
> +{
> +  int i_in;
> +  int i_out;
> +  int double_i;
> +
> +  long unsigned int index_0;
> +  long unsigned int offset_0;
> +  int *addr_0;
> +  int temp_0;
> +
> +  long unsigned int index_1;
> +  long unsigned int offset_1;
> +  int *addr_1;
> +  int temp_1;
> +
> + entry:
> +  goto loop;
> +
> + loop:
> +  i_in = __PHI (entry: 0, latch: i_out);
> +  double_i = i_in * 2;
> +
> +  index_0 = (long unsigned int) double_i;
> +  offset_0 = index_0 * 4ul;
> +  addr_0 = x_1(D) + offset_0;
> +  temp_0 = 1;
> +  *addr_0 = temp_0;
> +
> +  index_1 = index_0 + 1ul;
> +  offset_1 = index_1 * 4ul;
> +  addr_1 = x_1(D) + offset_1;
> +  temp_1 = 3;
> +  *addr_1 = temp_1;
> +
> +  i_out = i_in + 1;
> +  if (n_2(D) > i_out)
> +goto latch;
> +  else
> +goto exit;
> +
> + latch:
> +  goto loop;
> +
> + exit:
> +  return;
> +}
> +
> +#define N 1024
> +
> +int
> +main (void)
> +{
> +  int a[N * 2];
> +  f (a, N);
> +  for (int i = 0; i < N; ++i)
> +{
> +  if (a[i * 2] != 1
> + || a[i * 2 + 1] != 3)
> +   __builtin_abort ();
> +

Re: [PATCH, rs6000] Fold vector shifts in GIMPLE

2017-06-01 Thread Jakub Jelinek

On Thu, Jun 01, 2017 at 09:48:37AM +0200, Richard Biener wrote:
> On Wed, May 31, 2017 at 10:01 PM, Will Schmidt
>  wrote:
> > Hi,
> >
> > Add support for early expansion of vector shifts.  Including
> > vec_sl (shift left), vec_sr (shift right), vec_sra (shift
> > right algebraic), vec_rl (rotate left).
> > Part of this includes adding the vector shift right instructions to
> > the list of those instructions having an unsigned second argument.
> >
> > The VSR (vector shift right) folding is a bit more complex than
> > the others. This is due to requiring arg0 be unsigned for an algebraic
> > shift before the gimple RSHIFT_EXPR assignment is built.
> 
> Jakub, do we sanitize that undefinedness of left shifts of negative values
> and/or overflow of left shift of nonnegative values?

We don't yet, see PR77823.  All I managed to do before stage1 was over
was instrumentation of signed integer arithmetic overflow on vectors;
division, shifts etc. are tasks for, maybe, this stage1.

That said, shift instrumentation in particular is done early because every
FE has different rules, and so if it is coming from target builtins that are
folded into something, it wouldn't be instrumented anyway.

Jakub

RE: [PATCH 6/7] [ARC] Prevent moving stores to the frame before the stack adjustment.

2017-06-01 Thread Claudiu Zissulescu

> Given the description the code looks fine.  It would be nice to see
> more of a _why_ in the commit message.  I'm guessing this is either
> something related to signal handling, or debugging... I don't see why
> this would be needed for functional correctness.
> 

The issue is how we generate a function prologue when we use
-fno-omit-frame-pointer, for example:

[snip]
mov_s   fp,sp   ; frame pointer is set here
[snip]
st  r1,[fp,-24] ; frame pointer is used here
[snip]
sub_s   sp,sp,0x20  ; stack pointer adjusted

So any interrupt taken between the `st` and `sub` instructions leads to
faulty behaviour: the interrupt routine runs with an sp that has not yet
been adjusted and can overwrite the value stored by the `st` instruction.
Adding a scheduler barrier therefore forces the compiler to emit the `sub`
instruction before the store.
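
The hazard can be reproduced with a toy model in plain C.  This is only a sketch under simplifying assumptions: the "stack" is a 64-byte array, the interrupt handler is assumed to scribble over the 32 bytes just below sp, and all names (toy_stack, toy_interrupt, ...) are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64, FRAME_BYTES = 32 };

typedef struct
{
  uint8_t mem[STACK_BYTES];
  int sp; /* byte offset of the stack pointer; the stack grows down */
} toy_stack;

/* Model an interrupt handler that uses 32 bytes below the current sp.  */
static void
toy_interrupt (toy_stack *s)
{
  assert (s->sp >= FRAME_BYTES);
  memset (&s->mem[s->sp - FRAME_BYTES], 0xEE, FRAME_BYTES);
}

/* Store scheduled before the stack adjustment (the problematic order).  */
static uint32_t
prologue_store_first (uint32_t r1)
{
  toy_stack s = { { 0 }, STACK_BYTES };
  int fp = s.sp;                      /* mov_s fp,sp */
  memcpy (&s.mem[fp - 24], &r1, 4);   /* st r1,[fp,-24] */
  toy_interrupt (&s);                 /* interrupt before sp is adjusted */
  s.sp -= FRAME_BYTES;                /* sub_s sp,sp,0x20 */
  uint32_t out;
  memcpy (&out, &s.mem[fp - 24], 4);
  return out;
}

/* Stack adjusted first, then the store (the order the barrier enforces).  */
static uint32_t
prologue_adjust_first (uint32_t r1)
{
  toy_stack s = { { 0 }, STACK_BYTES };
  int fp = s.sp;                      /* mov_s fp,sp */
  s.sp -= FRAME_BYTES;                /* sub_s sp,sp,0x20 */
  memcpy (&s.mem[fp - 24], &r1, 4);   /* st r1,[fp,-24] */
  toy_interrupt (&s);                 /* handler now works below the frame */
  uint32_t out;
  memcpy (&out, &s.mem[fp - 24], 4);
  return out;
}
```

In the first ordering the stored value is overwritten by the handler; in the second it survives because the slot is already inside the allocated frame.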

Thanks,
Claudiu

Re: [PATCH] DWARF: for variants, produce unsigned discr. when debug type is unsigned

2017-06-01 Thread Pierre-Marie de Rodat


On 05/31/2017 10:34 PM, Jason Merrill wrote:

OK.

Jason


Committed. Thank you!

--
Pierre-Marie de Rodat

Support $SYSROOT for = in -I etc.

2017-06-01 Thread Rainer Orth

GNU ld recently gained support for a $SYSROOT token in -L to denote the
sysroot prefix (similar to $ORIGIN in rpath), to be more intuitive than
the current use of '=' for that purpose:

PR ld/21251
Support $SYSROOT in ld -L and INPUT command
https://sourceware.org/bugzilla/show_bug.cgi?id=21251

gcc's -I etc. options currently also support '=' for that purpose only,
and it seems sensible to allow $SYSROOT here for the same reasons.

The following patch implements just that, bootstrapped on
i386-pc-solaris2.12 and sparc-sun-solaris2.12, only manually tested
since currently there are no --sysroot tests whatsoever in the
testsuite.
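
As a standalone illustration of the substitution (not the code from the patch), the prefix handling can be sketched like this.  The helper name expand_sysroot_prefix is made up, and it uses malloc rather than GCC's concat so that it compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of DIR in which a
   leading "=" or "$SYSROOT" is replaced by SYSROOT.  Paths without
   either prefix are copied unchanged.  Returns NULL if out of memory.  */
static char *
expand_sysroot_prefix (const char *sysroot, const char *dir)
{
  static const char token[] = "$SYSROOT";
  const size_t token_len = sizeof token - 1;
  const char *prefix = sysroot;
  size_t skip = 0;

  assert (sysroot != NULL && dir != NULL);

  if (dir[0] == '=')
    skip = 1;
  else if (strncmp (dir, token, token_len) == 0)
    skip = token_len;
  else
    prefix = ""; /* no marker: leave the path alone */

  char *out = malloc (strlen (prefix) + strlen (dir + skip) + 1);
  if (out == NULL)
    return NULL;
  strcpy (out, prefix);
  strcat (out, dir + skip);
  return out;
}
```

One deliberate difference from the patch: the sketch uses else-if, so a path is expanded at most once, whereas the patch tests for "$SYSROOT" on the name that may already have had "=" replaced.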

Ok for mainline?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-05-31  Rainer Orth  

* incpath.c (add_sysroot_to_chain): Allow for $SYSROOT prefix.
* doc/cppdiropts.texi (-I @var{dir}): Document it.

# HG changeset patch
# Parent  3c4ef3ebcd067bdeea49ba77eab3ed9b650e0d58
Support $SYSROOT for = in -I etc.

diff --git a/gcc/doc/cppdiropts.texi b/gcc/doc/cppdiropts.texi
--- a/gcc/doc/cppdiropts.texi
+++ b/gcc/doc/cppdiropts.texi
@@ -22,8 +22,9 @@ for header files during preprocessing.
 @ifset cppmanual
 @xref{Search Path}.
 @end ifset
-If @var{dir} begins with @samp{=}, then the @samp{=} is replaced
-by the sysroot prefix; see @option{--sysroot} and @option{-isysroot}.
+If @var{dir} begins with @samp{=} or @code{$SYSROOT}, then the @samp{=}
+or @code{$SYSROOT} is replaced by the sysroot prefix; see
+@option{--sysroot} and @option{-isysroot}.
 
 Directories specified with @option{-iquote} apply only to the quote 
 form of the directive, @code{@w{#include "@var{file}"}}.
diff --git a/gcc/incpath.c b/gcc/incpath.c
--- a/gcc/incpath.c
+++ b/gcc/incpath.c
@@ -314,7 +314,7 @@ remove_duplicates (cpp_reader *pfile, st
 }
 
 /* Add SYSROOT to any user-supplied paths in CHAIN starting with
-   "=".  */
+   "=" or "$SYSROOT".  */
 
 static void
 add_sysroot_to_chain (const char *sysroot, int chain)
@@ -322,8 +322,15 @@ add_sysroot_to_chain (const char *sysroo
   struct cpp_dir *p;
 
   for (p = heads[chain]; p != NULL; p = p->next)
-if (p->name[0] == '=' && p->user_supplied_p)
-  p->name = concat (sysroot, p->name + 1, NULL);
+{
+  if (p->user_supplied_p)
+	{
+	  if (p->name[0] == '=')
+	p->name = concat (sysroot, p->name + 1, NULL);
+	  if (strncmp (p->name, "$SYSROOT", strlen ("$SYSROOT")) == 0)
+	p->name = concat (sysroot, p->name + strlen ("$SYSROOT"), NULL);
+	}
+}
 }
 
 /* Merge the four include chains together in the order quote, bracket,

[OBVIOUS][SPARC] Fix a couple of insn alternatives missing a type attribute

2017-06-01 Thread Jose E. Marchesi


Installed the patch below to trunk as obvious.  It fixes a couple of
insn alternatives missing a type attribute in sparc.md.

Salud!

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 248773)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2017-06-01  Jose E. Marchesi  
+
+   * config/sparc/sparc.md (*zero_extendsidi2_insn_sp64): Set insn
+   type for movstouw.
+   (*sign_extendsidi2_insn): Likewise for movstosw.
+
 2017-06-01  Pierre-Marie de Rodat  
 
* dwarf2out.c (get_discr_value): Call the get_debug_type hook on
Index: gcc/config/sparc/sparc.md
===
--- gcc/config/sparc/sparc.md   (revision 248773)
+++ gcc/config/sparc/sparc.md   (working copy)
@@ -3014,7 +3014,7 @@
srl\t%1, 0, %0
lduw\t%1, %0
movstouw\t%1, %0"
-  [(set_attr "type" "shift,load,*")
+  [(set_attr "type" "shift,load,vismv")
(set_attr "cpu_feature" "*,*,vis3")
(set_attr "v3pipe" "*,*,true")])
 
@@ -3329,7 +3329,7 @@
   sra\t%1, 0, %0
   ldsw\t%1, %0
   movstosw\t%1, %0"
-  [(set_attr "type" "shift,sload,*")
+  [(set_attr "type" "shift,sload,vismv")
(set_attr "us3load_type" "*,3cycle,*")
(set_attr "cpu_feature" "*,*,vis3")
(set_attr "v3pipe" "*,*,true")])

Re: [gcn][patch] Add -mgpu option and plumb in assembler/linker

2017-06-01 Thread Thomas Schwinge

Hi!

Sorry for the late reply.

On Fri, 28 Apr 2017 18:06:39 +0100, Andrew Stubbs  wrote:
> 3. Add -mgpu option and corresponding --with-gpu. I've deliberately used 
> "gpu" instead of "cpu" because I want offloading compilers to be able to 
> say "-mcpu=foo -foffload=-mgpu=bar", or even have the host compiler just 
> understand -mgpu and DTRT.

I'm not sure I understand your last statement, or the intentions behind
it.

How would the host compiler (be able to) understand (or, disambiguate)
"-mgpu=[...]" in the (default) case of several offloading targets having
been configured?  I think it holds that "-m[...]" etc. must/can always
only apply to the current target (or "host", in "offloading speak").

And then, I don't have any strong opinion, but I don't see why a new
"-mgpu" option is preferable to using the existing "-march" etc. in
"-foffload=[...]".  For example, you can already now do things like
(exemplary):

-march=x86_64 -foffload=-march=generic -foffload=nvptx-none=-march=cc_50 -foffload=gcn=-march=carrizo
^ target-specific
              ^ offload-target-specific, unless overridden...
                                       ^ ... here...
                                                                         ^ ..., and here

Likewise for the new "--with-gpu=[...]" vs. the existing
"--with-arch=[...]", where again I would, unless there is a specific
reason (that I didn't understand here), default to using the existing
option names instead of introducing new ones.


Grüße
 Thomas


> commit 5058457b0fa07865b366832828e74a53e5bd2964
> Author: Andrew Stubbs 
> Date:   Fri Apr 28 14:37:25 2017 +0100
> 
> Add -mgpu
> 
> 2017-04-28  Andrew Stubbs  
> 
>   gcc/
>   * config.gcc (amdgcn): Remove --with-arch and --with-tune.
>   Add --with-gpu, and set default to "carrizo"
>   (add_defaults): Add "gpu".
>   * config/gcn/gcn-opts.h: New file.
>   * config/gcn/gcn.c (output_file_start): Switch to HSACO version
>   2 and auto-detection of GPU type (from -mcpu).
>   (gcn_arch, gcn_tune): Remove.
>   * config/gcn/gcn.h: Include gcn-opts.h.
>   (enum processor_type): Move to gcn-opts.h.
>   (LIBGCC_SPEC, ASM_SPEC, LINK_SPEC): Define.
>   (gcn_arch, gcn_tune): Remove.
>   (OPTION_DEFAULT_SPECS): Remove "arch" and "tune"; add "gpu".
>   * config/gcn/gcn.opt: Include gcn-opts.h.
>   (gpu_type): New Enum.
>   (mgpu): New option.
> 
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 4a77b66..b1df533 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -3901,20 +3901,20 @@ case "${target}" in
>   ;;
>  
>   amdgcn-*-*)
> - supported_defaults="arch tune"
> + supported_defaults="gpu"
>  
> - for which in arch tune; do
> - eval "val=\$with_$which"
> - case ${val} in
> - "" | fiji)
> - # OK
> - ;;
> - *)
> - echo "Unknown cpu used in --with-$which=$val." 
> 1>&2
> - exit 1
> - ;;
> - esac
> - done
> + case "$with_gpu" in
> + "")
> + with_gpu=carrizo
> + ;;
> + carrizo | fiji)
> + # OK
> + ;;
> + *)
> + echo "Unknown gpu used in --with-gpu=$val." 1>&2
> + exit 1
> + ;;
> + esac
>   ;;
>  
>   hppa*-*-*)
> @@ -4646,7 +4646,7 @@ case ${target} in
>  esac
>  
>  t=
> -all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls lxc1-sxc1 madd4"
> +all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls lxc1-sxc1 madd4 gpu"
>  for option in $all_defaults
>  do
>   eval "val=\$with_"`echo $option | sed s/-/_/g`
> diff --git a/gcc/config/gcn/gcn-opts.h b/gcc/config/gcn/gcn-opts.h
> new file mode 100644
> index 000..d0586d6
> --- /dev/null
> +++ b/gcc/config/gcn/gcn-opts.h
> @@ -0,0 +1,27 @@
> +/* Copyright (C) 2016-2017 Free Software Foundation, Inc.
> +
> +   This file is free software; you can redistribute it and/or modify it under
> +   the terms of the GNU General Public License as published by the Free
> +   Software Foundation; either version 3 of the License, or (at your option)
> +   any later version.
> +
> +   This file is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +   for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   .  */

[testsuite, committed] Require label_values for some test-cases

2017-06-01 Thread Tom de Vries


Hi,

this patch adds dg-require-effective-target label_values for some 
test-cases.


Committed as obvious.
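
For context, the kind of code these tests contain relies on the GNU C "labels as values" extension (taking a label's address with && and jumping with goto *), which not every target supports.  A minimal sketch of a test-like file guarded by the directive, with a made-up function run_ops that is not taken from any of the tests listed below:

```c
/* { dg-do compile } */
/* { dg-require-effective-target label_values } */

#include <assert.h>

/* Hypothetical toy interpreter: op 0 = halt, 1 = increment, 2 = double.
   Dispatch goes through a table of label addresses.  */
static int
run_ops (const unsigned char *ops)
{
  static void *const dispatch[] = { &&op_halt, &&op_inc, &&op_dbl };
  int acc = 0;

  for (;;)
    {
      unsigned char op = *ops++;
      assert (op < 3);
      goto *dispatch[op]; /* computed goto through a label value */

    op_inc:
      acc += 1;
      continue;
    op_dbl:
      acc *= 2;
      continue;
    op_halt:
      return acc;
    }
}
```

On targets without label_values support the directive makes the harness report such a test as unsupported instead of failing.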

Thanks,
- Tom

Require label_values for some test-cases

2017-06-01  Tom de Vries  

	* c-c++-common/pr43395.c: Add dg-require-effective-target label_values.
	* gcc.c-torture/compile/asmgoto-1.c: Same.
	* gcc.dg/2707-1.c: Same.
	* gcc.dg/pr38700.c: Same.
	* gcc.dg/pr70169.c: Same.
	* gcc.dg/pr80112.c: Same.
	* gcc.dg/torture/pr51071-2.c: Same.
	* gcc.dg/torture/pr51071.c: Same.
	* gcc.dg/tree-ssa/alias-34.c: Same.

---
 gcc/testsuite/c-c++-common/pr43395.c| 1 +
 gcc/testsuite/gcc.c-torture/compile/asmgoto-1.c | 2 ++
 gcc/testsuite/gcc.dg/2707-1.c   | 1 +
 gcc/testsuite/gcc.dg/pr38700.c  | 1 +
 gcc/testsuite/gcc.dg/pr70169.c  | 1 +
 gcc/testsuite/gcc.dg/pr80112.c  | 1 +
 gcc/testsuite/gcc.dg/torture/pr51071-2.c| 1 +
 gcc/testsuite/gcc.dg/torture/pr51071.c  | 1 +
 gcc/testsuite/gcc.dg/tree-ssa/alias-34.c| 1 +
 9 files changed, 10 insertions(+)

diff --git a/gcc/testsuite/c-c++-common/pr43395.c b/gcc/testsuite/c-c++-common/pr43395.c
index f672c8c..d060ae2 100644
--- a/gcc/testsuite/c-c++-common/pr43395.c
+++ b/gcc/testsuite/c-c++-common/pr43395.c
@@ -1,5 +1,6 @@
 /* PR c/43395 */
 /* { dg-do compile } */
+/* { dg-require-effective-target label_values } */
 
 void *
 foo (void)
diff --git a/gcc/testsuite/gcc.c-torture/compile/asmgoto-1.c b/gcc/testsuite/gcc.c-torture/compile/asmgoto-1.c
index cc34610..5029699 100644
--- a/gcc/testsuite/gcc.c-torture/compile/asmgoto-1.c
+++ b/gcc/testsuite/gcc.c-torture/compile/asmgoto-1.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target label_values } */
+
 void fn (void);
 
 void
diff --git a/gcc/testsuite/gcc.dg/2707-1.c b/gcc/testsuite/gcc.dg/2707-1.c
index 5328dfa..85a3315 100644
--- a/gcc/testsuite/gcc.dg/2707-1.c
+++ b/gcc/testsuite/gcc.dg/2707-1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -Wall" } */
+/* { dg-require-effective-target label_values } */
 
 extern void foo(void *here);
 extern inline void bar(void)
diff --git a/gcc/testsuite/gcc.dg/pr38700.c b/gcc/testsuite/gcc.dg/pr38700.c
index ebece7f..8b7cbc8 100644
--- a/gcc/testsuite/gcc.dg/pr38700.c
+++ b/gcc/testsuite/gcc.dg/pr38700.c
@@ -1,6 +1,7 @@
 /* PR c/38700 */
 /* { dg-do compile } */
 /* { dg-options "-O0" } */
+/* { dg-require-effective-target label_values } */
 
 int
 foo ()
diff --git a/gcc/testsuite/gcc.dg/pr70169.c b/gcc/testsuite/gcc.dg/pr70169.c
index 41381e7..56e72f3 100644
--- a/gcc/testsuite/gcc.dg/pr70169.c
+++ b/gcc/testsuite/gcc.dg/pr70169.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-strict-aliasing -fno-tree-dce" } */
 /* { dg-skip-if "Program and data reside in different address spaces" { "avr-*-*" } } */
+/* { dg-require-effective-target label_values } */
 
 int printf (const char *, ...); 
 
diff --git a/gcc/testsuite/gcc.dg/pr80112.c b/gcc/testsuite/gcc.dg/pr80112.c
index 7c78aae..8d6ef3c 100644
--- a/gcc/testsuite/gcc.dg/pr80112.c
+++ b/gcc/testsuite/gcc.dg/pr80112.c
@@ -1,6 +1,7 @@
 /* PR rtl-optimization/80112 */
 /* { dg-do compile } */
 /* { dg-options "-Os -fmodulo-sched" } */
+/* { dg-require-effective-target label_values } */
 
 void **a;
 
diff --git a/gcc/testsuite/gcc.dg/torture/pr51071-2.c b/gcc/testsuite/gcc.dg/torture/pr51071-2.c
index ccf3d81..638b4b8 100644
--- a/gcc/testsuite/gcc.dg/torture/pr51071-2.c
+++ b/gcc/testsuite/gcc.dg/torture/pr51071-2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-fno-delete-null-pointer-checks" } */
+/* { dg-require-effective-target label_values } */
 
 __extension__ typedef __UINTPTR_TYPE__ uintptr_t;
 
diff --git a/gcc/testsuite/gcc.dg/torture/pr51071.c b/gcc/testsuite/gcc.dg/torture/pr51071.c
index 99af958..ad83dcc 100644
--- a/gcc/testsuite/gcc.dg/torture/pr51071.c
+++ b/gcc/testsuite/gcc.dg/torture/pr51071.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target label_values } */
 
 void foo (void);
 void bar (void *);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alias-34.c b/gcc/testsuite/gcc.dg/tree-ssa/alias-34.c
index 5738fea..363d121 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/alias-34.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/alias-34.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fno-strict-aliasing -fdump-tree-optimized" } */
+/* { dg-require-effective-target label_values } */
 
 void foo (int b)
 {

[PATCH] DWARF: add DW_AT_location for global decls with DECL_VALUE_EXPR

2017-06-01 Thread Pierre-Marie de Rodat

Hi,

In GNAT, we materialize renamings that cannot be described in standard
DWARF as synthetic variables that describe how to fetch the renamed
object.  Look for "___XR" in gcc/ada/exp_dbug.ads for more details about
this convention.

In order to have a location for these variables in the debug info (GDB
requires it not to discard the variable) but also to avoid allocating
runtime space for them, we make these variables hold a DECL_VALUE_EXPR
tree.  However, since GCC 7, the DWARF back-end no longer generates a
DW_AT_location attribute for those.  This patch is an attempt to restore
this attribute.

Bootstrapped and reg-tested on x86_64-linux.  Also, I see a ~150-byte
increase in the size of cc1, cc1plus and gnat1 (each of these is ~200MB
large).  Ok to commit?  Thank you in advance!

gcc/

* dwarf2out.c (dwarf2out_late_global_decl): Add locations for
symbols that hold a DECL_VALUE_EXPR.

gcc/testsuite/

* debug12.adb, debug12.ads: New testcase.
---
 gcc/dwarf2out.c   | 5 +++--
 gcc/testsuite/gnat.dg/debug12.adb | 9 +
 gcc/testsuite/gnat.dg/debug12.ads | 8 
 3 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/debug12.adb
 create mode 100644 gcc/testsuite/gnat.dg/debug12.ads

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 5ff45eb4efd..013c902bc89 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -25526,9 +25526,10 @@ dwarf2out_late_global_decl (tree decl)
{
  /* We get called via the symtab code invoking late_global_decl
 for symbols that are optimized out.  Do not add locations
-for those.  */
+for those, except if they have a DECL_VALUE_EXPR, in which case
+they are relevant for debuggers.  */
  varpool_node *node = varpool_node::get (decl);
- if (! node || ! node->definition)
+ if ((! node || ! node->definition) && ! DECL_HAS_VALUE_EXPR_P (decl))
tree_add_const_value_attribute_for_decl (die, decl);
  else
add_location_or_const_value_attribute (die, decl, false);
diff --git a/gcc/testsuite/gnat.dg/debug12.adb b/gcc/testsuite/gnat.dg/debug12.adb
new file mode 100644
index 000..07175968703
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug12.adb
@@ -0,0 +1,9 @@
+--  { dg-options "-cargs -gdwarf-4 -fdebug-types-section -dA -margs" }
+--  { dg-final { scan-assembler-times "DW_AT_location" 4 } }
+
+package body Debug12 is
+   function Get_A2 return Boolean is
+   begin
+  return A2;
+   end Get_A2;
+end Debug12;
diff --git a/gcc/testsuite/gnat.dg/debug12.ads b/gcc/testsuite/gnat.dg/debug12.ads
new file mode 100644
index 000..dbc5896cc73
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/debug12.ads
@@ -0,0 +1,8 @@
+package Debug12 is
+   type Bit_Array is array (Positive range <>) of Boolean
+  with Pack;
+   A  : Bit_Array := (1 .. 10 => False);
+   A2 : Boolean renames A (2);
+
+   function Get_A2 return Boolean;
+end Debug12;
-- 
2.13.0

Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-06-01 Thread Dominique d'Humières


> On 31 May 2017 at 21:03, Nicolas Koenig wrote:
> 
> Hello Dominique,
> 
> attached is the next try, this time without stupidities (I hope). Both test 
> cases you posted don't ICE anymore.
> 
> Ok for trunk?
> 
> Nicolas
> 

Preliminary tests look OK, full testing in progress.

Thanks,

Dominique

RE: [PATCH 6/7] [ARC] Prevent moving stores to the frame before the stack adjustment.

2017-06-01 Thread Claudiu Zissulescu

 
> Given the description the code looks fine.  It would be nice to see
> more of a _why_ in the commit message.  I'm guessing this is either
> something related to signal handling, or debugging... I don't see why
> this would be needed for functional correctness.

Committed with additional comments.

Thank you, Claudiu

RE: [PATCH 7/7] [ARC] Test against frame_pointer_needed in arc_can_eliminate.

2017-06-01 Thread Claudiu Zissulescu

 
> Looks good,
> 

Committed, thank you,
Claudiu

RE: [PATCH 5/7] [ARC] Update (non)commutative_binary_comparison patterns.

2017-06-01 Thread Claudiu Zissulescu

> Looks good, thanks,
> Andrew

Committed, thank you,
Claudiu

RE: [PATCH 4/7] [ARC] Change predicate movv2hi to avoid scaled addresses.

2017-06-01 Thread Claudiu Zissulescu

> Seems reasonable.
> 

Committed, thank you,
Claudiu

RE: [PATCH 3/7] [ARC] Allow r30 to be used by the reg-alloc.

2017-06-01 Thread Claudiu Zissulescu

> Looks good, thanks,
> 
Committed, thank you,
Claudiu

RE: [PATCH 2/7] [ARC] Avoid use of hard registers before reg-alloc.

2017-06-01 Thread Claudiu Zissulescu

> Looks good, thanks,

Committed, thank you,
Claudiu

RE: [PATCH 1/7] [ARC] Make mulsi for A700 pattern commutative.

2017-06-01 Thread Claudiu Zissulescu

> Looks good thanks,

Committed, thank you,
Claudiu

Re: [PATCH, GCC/ARM/gcc-7-branch] Backport PR71607

2017-06-01 Thread Prakhar Bahuguna

On 01/06/2017 07:15:47, Richard Sandiford wrote:
> Prakhar Bahuguna  writes:
> > On 31/05/2017 14:11:43, Richard Sandiford wrote:
> >> Prakhar Bahuguna  writes:
> >> > On 31/05/2017 09:19:40, Richard Sandiford wrote:
> >> >> const_ints are supposed to be stored in sign-extended form, so a 32-bit
> >> >> integer with the MSB set should be 0x8000|x instead of
> >> >> 0x8000|x.  It's a bug if you have one where that isn't true.
> >> >> 
> >> >> In the patch it looks like this could come from:
> >> >> ...these two splits, where the GEN_INTs should probably be:
> >> >> 
> >> >>   gen_int_mode (..., SImode);
> >> >> 
> >> >> instead.
> >> >
> >> > Hi Richard, thanks for the tip. Is there a test case that could produce 
> >> > an
> >> > incorrect result? I've attempted to create one using negative doubles and
> >> > floats but haven't succeeded.
> >> 
> >> Just to check, are you testing with --enable-checking=yes,rtl?
> >> 
> >> When the values you tried were split, did you get the sign-extended form
> >> or the zero-extended form?
> >> 
> >> Thanks,
> >> Richard
> >
> > I've now rebuilt with --enable-checking=yes,rtl and it appears that the 
> > split
> > values are being correctly sign-extended in the rtl and appear correctly in 
> > the
> > assembly.
> >
> > However, if you believe it is safer to use gen_int_mode(), I'll respin the
> > patch accordingly.
> 
> Yeah, I think it would be safer.  But if they were already correctly
> sign-extended, then what did you mean by:
> 
>   Also the pattern for splitting 32-bit immediates had to be changed, it
>   was not accepting unsigned 32-bit unsigned integers with the MSB
>   set. I believe const_int_operand expects the mode of the operand to be
>   set to VOIDmode and not SImode. I have only changed it in the patterns
>   that were affecting this code, though I suggest looking into changing
>   it in the rest of the ARM backend.
> 
> Thanks,
> Richard

This part of the patch was written by Andre. After checking with him, it seems
that some of the confusion arises from the comment on real_to_target() which
states "There are always 32 bits in each long, no matter the size of the host
long". While this may imply the value is zero-extended on hosts with wider
longs, it seems like the value is always correctly sign-extended and thus
gen_int_mode() should be unnecessary.

As for why VOIDmode is used with the values cast to int, there was a reason
it had to be done this way to get it working, but that reason has long been
forgotten. I only have the code and this message to rely on.
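
For readers following the sign-extension point above, here is a minimal sketch of what "stored in sign-extended form" means for a 32-bit value held in a 64-bit host integer. The helper `sign_extend_32` is a hypothetical name for illustration only; it is not a GCC function and does not reproduce how gen_int_mode() is implemented.

```cpp
#include <cstdint>

// Hypothetical helper, not part of GCC: return the sign-extended 64-bit
// form of a value whose mode is 32 bits wide.
std::int64_t sign_extend_32(std::uint64_t value)
{
  const std::uint64_t low = value & 0xffffffffu;       // keep the low 32 bits
  const std::uint64_t sign = std::uint64_t(1) << 31;  // bit 31 is the MSB
  // (low ^ sign) - sign copies bit 31 into bits 32..63.
  return static_cast<std::int64_t>((low ^ sign) - sign);
}
```

With this helper, a zero-extended input with the MSB set and an already sign-extended input map to the same negative value, which is the canonical form the thread is discussing.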

-- 

Prakhar Bahuguna

[Ada] Fix PR ada/80921

2017-06-01 Thread Eric Botcazou

We have apparently never tried to build shared libraries in cross builds.

Tested on x86_64-suse-linux, applied on mainline and 7 & 6 branches.


2017-06-01  Eric Botcazou  

PR ada/80921
* configure.ac (default_gnatlib_target): Remove bogus condition.
(have_getipinfo): Tweak.
* configure: Regenerate.

-- 
Eric Botcazou
Index: configure.ac
===
--- configure.ac	(revision 248552)
+++ configure.ac	(working copy)
@@ -127,9 +127,7 @@ AC_PROG_AWK
 AC_PROG_LN_S
 
 # Determine what to build for 'gnatlib'
-if test $build = $target \
-   && test ${enable_shared} = yes ; then
-  # Note that build=target is almost certainly the wrong test; FIXME
+if test ${enable_shared} = yes; then
   default_gnatlib_target="gnatlib-shared"
 else
   default_gnatlib_target="gnatlib-plain"
@@ -138,15 +136,16 @@ AC_SUBST([default_gnatlib_target])
 
 # Check for _Unwind_GetIPInfo
 GCC_CHECK_UNWIND_GETIPINFO
-have_getipinfo=
 if test x$have_unwind_getipinfo = xyes; then
   have_getipinfo=-DHAVE_GETIPINFO
+else
+  have_getipinfo=
 fi
-AC_SUBST(have_getipinfo)
+AC_SUBST([have_getipinfo])
 
 # Check for <sys/capability.h>
 AC_CHECK_HEADER([sys/capability.h], have_capability=-DHAVE_CAPABILITY, have_capability=)
-AC_SUBST(have_capability)
+AC_SUBST([have_capability])
 
 # Determine what GCC version number to use in filesystem paths.
 GCC_BASE_VER

[Committed] S/390: Don't fetch the return address early with ooo

2017-06-01 Thread Andreas Krebbel

We used to load the return address slot some time in advance.  This
helped older machines resolve the data dependencies in time.
However, it is pointless on out-of-order CPUs, so this patch disables
it there.

Regression tested on s390x. Committed to mainline.

Bye,

-Andreas-

gcc/ChangeLog:

2017-06-01  Andreas Krebbel  

* config/s390/s390.c (s390_emit_epilogue): Disable early return
address fetch for z10 or later.
---
 gcc/config/s390/s390.c | 63 +-
 1 file changed, 32 insertions(+), 31 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 7be22d9..eb94237 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -11410,38 +11410,39 @@ s390_emit_epilogue (bool sibcall)
gen_rtx_REG (Pmode, i), cfa_restores);
}
 
-  if (! sibcall)
-   {
- /* Fetch return address from stack before load multiple,
-this will do good for scheduling.
-
-Only do this if we already decided that r14 needs to be
-saved to a stack slot. (And not just because r14 happens to
-be in between two GPRs which need saving.)  Otherwise it
-would be difficult to take that decision back in
-s390_optimize_prologue.  */
- if (cfun_gpr_save_slot (RETURN_REGNUM) == SAVE_SLOT_STACK)
-   {
- int return_regnum = find_unused_clobbered_reg();
- if (!return_regnum)
-   return_regnum = 4;
- return_reg = gen_rtx_REG (Pmode, return_regnum);
-
- addr = plus_constant (Pmode, frame_pointer,
-   offset + cfun_frame_layout.gprs_offset
-   + (RETURN_REGNUM
-  - cfun_frame_layout.first_save_gpr_slot)
-   * UNITS_PER_LONG);
- addr = gen_rtx_MEM (Pmode, addr);
- set_mem_alias_set (addr, get_frame_alias_set ());
- emit_move_insn (return_reg, addr);
+  /* Fetch return address from stack before load multiple,
+this will do good for scheduling.
+
+Only do this if we already decided that r14 needs to be
+saved to a stack slot. (And not just because r14 happens to
+be in between two GPRs which need saving.)  Otherwise it
+would be difficult to take that decision back in
+s390_optimize_prologue.
+
+This optimization is only helpful on in-order machines.  */
+  if (! sibcall
+ && cfun_gpr_save_slot (RETURN_REGNUM) == SAVE_SLOT_STACK
+ && s390_tune <= PROCESSOR_2097_Z10)
+   {
+ int return_regnum = find_unused_clobbered_reg();
+ if (!return_regnum)
+   return_regnum = 4;
+ return_reg = gen_rtx_REG (Pmode, return_regnum);
+
+ addr = plus_constant (Pmode, frame_pointer,
+   offset + cfun_frame_layout.gprs_offset
+   + (RETURN_REGNUM
+  - cfun_frame_layout.first_save_gpr_slot)
+   * UNITS_PER_LONG);
+ addr = gen_rtx_MEM (Pmode, addr);
+ set_mem_alias_set (addr, get_frame_alias_set ());
+ emit_move_insn (return_reg, addr);
 
- /* Once we did that optimization we have to make sure
-s390_optimize_prologue does not try to remove the
-store of r14 since we will not be able to find the
-load issued here.  */
- cfun_frame_layout.save_return_addr_p = true;
-   }
+ /* Once we did that optimization we have to make sure
+s390_optimize_prologue does not try to remove the store
+of r14 since we will not be able to find the load issued
+here.  */
+ cfun_frame_layout.save_return_addr_p = true;
}
 
   insn = restore_gprs (frame_pointer,
-- 
2.9.1

[PR 80898] Propagate grp_write from disqualified SRA candidates

2017-06-01 Thread Martin Jambor

Hi,

when I wrote the lazy setting of grp_write flag early this year, I
made a mistake when thinking about what to do about SRA candidates
that were disqualified but form a RHS of an assignment link which was
to be used to set grp_write of the LHS when appropriate.  The code
expects that the RHS accesses form an access tree, but given that some
are rejected exactly because such a tree cannot be built, it does not
work.

The solution is to move dealing with disqualified RHSs to the
assignment link processing.  The patch below checks RHS and if it is
disqualified, marks the corresponding LHS as containing data.  As the
second testcase shows, that information must be then also propagated
downwards (this is not necessary in the normal propagation case
because there propagate_subaccesses_across_link will already do that
more elaborately) as well as upwards.
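
To illustrate the downward propagation described above, here is a rough sketch of marking a whole access subtree as written and queueing each node for further propagation. `Access`, its `queued` flag and `mark_subtree_written` are hypothetical, heavily simplified stand-ins; this is not the code from the patch and omits everything else that tree-sra.c tracks.

```cpp
#include <deque>

// Hypothetical, simplified stand-in for tree-sra.c's struct access.
struct Access
{
  bool grp_write = false;          // subtree may contain written data
  bool queued = false;             // already on the work queue
  Access* first_child = nullptr;
  Access* next_sibling = nullptr;
};

// Mark ACC and every access below it as written, and put each one on
// WORK once so that later propagation revisits it.
void mark_subtree_written(Access* acc, std::deque<Access*>& work)
{
  acc->grp_write = true;
  if (!acc->queued)
    {
      acc->queued = true;
      work.push_back(acc);
    }
  for (Access* c = acc->first_child; c; c = c->next_sibling)
    mark_subtree_written(c, work);
}
```

The `queued` flag in the sketch only exists to keep a node from being pushed twice; how the real pass decides what to re-queue is not shown here.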

Bootstrapped and tested on x86_64-linux without any issues. OK for
trunk?

Thanks,

Martin


2017-06-01  Martin Jambor  

PR tree-optimization/80898
* tree-sra.c (process_subtree_disqualification): Removed.
(disqualify_candidate): Do not call
process_subtree_disqualification.
(subtree_mark_written_and_enqueue): New function.
(propagate_all_subaccesses): Set grp_write of LHS subtree if the
RHS has been disqualified and re-queue LHS if necessary.  Apart
from that, ignore disqualified RHS.

testsuite/
* gcc.dg/tree-ssa/pr80898.c: New test.
* gcc.dg/tree-ssa/pr80898-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c | 71 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr80898.c   | 20 +
 gcc/tree-sra.c| 56 +++-
 3 files changed, 126 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr80898.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c
new file mode 100644
index 000..cb4799c5ced
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c
@@ -0,0 +1,71 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+struct S0
+{
+  unsigned a : 15;
+  int b;
+  int c;
+};
+
+struct S1
+{
+  struct S0 s0;
+  int e;
+};
+
+struct Z
+{
+  char c;
+  int z;
+} __attribute__((packed));
+
+union U
+{
+  struct S1 s1;
+  struct Z z;
+};
+
+
+int __attribute__((noinline, noclone))
+return_zero (void)
+{
+  return 0;
+}
+
+volatile union U gu;
+struct S0 gs;
+
+int __attribute__((noinline, noclone))
+check_outcome ()
+{
+  if (gs.a != 6
+  || gs.b != 8)
+__builtin_abort ();
+}
+
+int
+main (int argc, char *argv[])
+{
+  union U u;
+  struct S1 m;
+  struct S0 l;
+
+  if (return_zero ())
+u.z.z = 2;
+  else
+{
+  u.s1.s0.a = 6;
+  u.s1.s0.b = 8;
+  u.s1.e = 2;
+
+  m = u.s1;
+  m.s0.c = 0;
+  l = m.s0;
+  gs = l;
+}
+
+  gu = u;
+  check_outcome ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr80898.c b/gcc/testsuite/gcc.dg/tree-ssa/pr80898.c
new file mode 100644
index 000..ed88f2cbd1a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr80898.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+struct S0 {
+  int f0 : 24;
+  int f1;
+  int f74;
+} a, *c = &a;
+struct S0 fn1() {
+  struct S0 b = {4, 3};
+  return b;
+}
+
+int main() {
+  *c = fn1();
+
+  if (a.f1 != 3)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 6a8a0a4a427..f25818f4481 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -694,21 +694,9 @@ static bool constant_decl_p (tree decl)
   return VAR_P (decl) && DECL_IN_CONSTANT_POOL (decl);
 }
 
-
-/* Mark LHS of assign links out of ACCESS and its children as written to.  */
-
-static void
-process_subtree_disqualification (struct access *access)
-{
-  struct access *child;
-  for (struct assign_link *link = access->first_link; link; link = link->next)
-link->lacc->grp_write = true;
-  for (child = access->first_child; child; child = child->next_sibling)
-process_subtree_disqualification (child);
-}
-
 /* Remove DECL from candidates for SRA and write REASON to the dump file if
there is one.  */
+
 static void
 disqualify_candidate (tree decl, const char *reason)
 {
@@ -723,13 +711,6 @@ disqualify_candidate (tree decl, const char *reason)
   print_generic_expr (dump_file, decl);
   fprintf (dump_file, " - %s\n", reason);
 }
-
-  struct access *access = get_first_repr_for_decl (decl);
-  while (access)
-{
-  process_subtree_disqualification (access);
-  access = access->next_grp;
-}
 }
 
 /* Return true iff the type contains a field or an element which does not allow
@@ -2679,6 +2660,26 @@ propagate_subaccesses_across_link (struct access *lacc, struct access *racc)
   return ret;
 }
 
+/* Beginning with ACCESS, traverse its whole access subtree and mark all
+   sub-trees a

Re: [PR 80898] Propagate grp_write from disqualified SRA candidates

2017-06-01 Thread Richard Biener

On Thu, 1 Jun 2017, Martin Jambor wrote:

> Hi,
> 
> when I wrote the lazy setting of grp_write flag early this year, I
> made a mistake when thinking about what to do about SRA candidates
> that were disqualified but form a RHS of an assignment link which was
> to be used to set grp_write of the LHS when appropriate.  The code
> expects that the RHS accesses form an access tree, but given that some
> are rejected exactly because such a tree cannot be built, it does not
> work.
> 
> The solution is to move dealing with disqualified RHSs to the
> assignment link processing.  The patch below checks RHS and if it is
> disqualified, marks the corresponding LHS as containing data.  As the
> second testcase shows, that information must be then also propagated
> downwards (this is not necessary in the normal propagation case
> because there propagate_subaccesses_across_link will already do that
> more elaborately) as well as upwards.
> 
> Bootstrapped and tested on x86_64-linux without any issues. OK for
> trunk?

Ok.

Thanks,
Richard.

> 
> Thanks,
> 
> Martin
> 
> 
> 2017-06-01  Martin Jambor  
> 
>   PR tree-optimization/80898
>   * tree-sra.c (process_subtree_disqualification): Removed.
>   (disqualify_candidate): Do not call
>   process_subtree_disqualification.
>   (subtree_mark_written_and_enqueue): New function.
>   (propagate_all_subaccesses): Set grp_write of LHS subtree if the
>   RHS has been disqualified and re-queue LHS if necessary.  Apart
>   from that, ignore disqualified RHS.
> 
> testsuite/
>   * gcc.dg/tree-ssa/pr80898.c: New test.
>   * gcc.dg/tree-ssa/pr80898-2.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c | 71 
> +++
>  gcc/testsuite/gcc.dg/tree-ssa/pr80898.c   | 20 +
>  gcc/tree-sra.c| 56 +++-
>  3 files changed, 126 insertions(+), 21 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr80898.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c
> new file mode 100644
> index 000..cb4799c5ced
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr80898-2.c
> @@ -0,0 +1,71 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +struct S0
> +{
> +  unsigned a : 15;
> +  int b;
> +  int c;
> +};
> +
> +struct S1
> +{
> +  struct S0 s0;
> +  int e;
> +};
> +
> +struct Z
> +{
> +  char c;
> +  int z;
> +} __attribute__((packed));
> +
> +union U
> +{
> +  struct S1 s1;
> +  struct Z z;
> +};
> +
> +
> +int __attribute__((noinline, noclone))
> +return_zero (void)
> +{
> +  return 0;
> +}
> +
> +volatile union U gu;
> +struct S0 gs;
> +
> +int __attribute__((noinline, noclone))
> +check_outcome ()
> +{
> +  if (gs.a != 6
> +  || gs.b != 8)
> +__builtin_abort ();
> +}
> +
> +int
> +main (int argc, char *argv[])
> +{
> +  union U u;
> +  struct S1 m;
> +  struct S0 l;
> +
> +  if (return_zero ())
> +u.z.z = 2;
> +  else
> +{
> +  u.s1.s0.a = 6;
> +  u.s1.s0.b = 8;
> +  u.s1.e = 2;
> +
> +  m = u.s1;
> +  m.s0.c = 0;
> +  l = m.s0;
> +  gs = l;
> +}
> +
> +  gu = u;
> +  check_outcome ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr80898.c b/gcc/testsuite/gcc.dg/tree-ssa/pr80898.c
> new file mode 100644
> index 000..ed88f2cbd1a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr80898.c
> @@ -0,0 +1,20 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +struct S0 {
> +  int f0 : 24;
> +  int f1;
> +  int f74;
> +} a, *c = &a;
> +struct S0 fn1() {
> +  struct S0 b = {4, 3};
> +  return b;
> +}
> +
> +int main() {
> +  *c = fn1();
> +
> +  if (a.f1 != 3)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 6a8a0a4a427..f25818f4481 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -694,21 +694,9 @@ static bool constant_decl_p (tree decl)
>return VAR_P (decl) && DECL_IN_CONSTANT_POOL (decl);
>  }
>  
> -
> -/* Mark LHS of assign links out of ACCESS and its children as written to.  */
> -
> -static void
> -process_subtree_disqualification (struct access *access)
> -{
> -  struct access *child;
> -  for (struct assign_link *link = access->first_link; link; link = link->next)
> -link->lacc->grp_write = true;
> -  for (child = access->first_child; child; child = child->next_sibling)
> -process_subtree_disqualification (child);
> -}
> -
>  /* Remove DECL from candidates for SRA and write REASON to the dump file if
> there is one.  */
> +
>  static void
>  disqualify_candidate (tree decl, const char *reason)
>  {
> @@ -723,13 +711,6 @@ disqualify_candidate (tree decl, const char *reason)
>print_generic_expr (dump_file, decl);
>fprintf (dump_file, " - %s\n", reason);
>  }
> -
> -  struct access *access = get_first_repr_for_d

Re: Default std::vector default and move constructor

2017-06-01 Thread Jonathan Wakely


On 31/05/17 22:28 +0200, François Dumont wrote:
Unless I made a mistake it revealed that restoring explicit call to 
_Bit_alloc_type() in default constructor was not enough. G++ doesn't 
transform it into a value-init if needed. I don't know if it is a 
compiler bug but I had to do just like presented in the Standard to 
achieve the expected behavior.


That really shouldn't be necessary (see below).

This value-init is specific to post-C++11 right ? Maybe I could remove 
the useless explicit call to _Bit_alloc_type() in pre-C++11 mode ?


No, because C++03 also requires the allocator to be value-initialized.


Now I wonder if I really introduced a regression in rb_tree...


Yes, I think you did. Could you try to verify that using the new
default_init_allocator?



+  struct _Bvector_impl
+   : public _Bit_alloc_type, public _Bvector_impl_data
+   {
+   public:
+#if __cplusplus >= 201103L
+ _Bvector_impl()
+   noexcept( noexcept(_Bit_alloc_type())
+ && noexcept(_Bvector_impl(declval())) )


This second condition is not needed, because that constructor should
be noexcept (see below).


+ : _Bvector_impl(_Bit_alloc_type())


This should not be necessary...


+ { }
+#else
  _Bvector_impl()
-   : _Bit_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage()
+ : _Bit_alloc_type()
  { }
+#endif


I would expect the constructor to look like this:

  _Bvector_impl()
  _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type()) )
 : _Bit_alloc_type()
 { }

What happens when you do that?



  _Bvector_impl(const _Bit_alloc_type& __a)
-   : _Bit_alloc_type(__a), _M_start(), _M_finish(), _M_end_of_storage()
+_GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type(__a)) )


Copying the allocator is not allowed to throw. You can use simply
_GLIBCXX_NOEXCEPT here.



+void test01()
+{
+  typedef default_init_allocator alloc_type;
+  typedef std::vector test_type;
+
+  test_type v1;
+  v1.push_back(T());
+
+  VERIFY( !v1.empty() );
+  VERIFY( !v1.get_allocator().state );


This is unlikely to ever fail, because the stack is probably full of
zeros anyway. Did you confirm whether the test fails without your
fixes to value-initialize the allocator?

One possible way to make it fail would be to construct the
vector using placement new, into a buffer filled with non-zero
values. (Valgrind or a sanitizer should also tell us, but we can't
rely on them in the testsuite).
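
The placement-new idea above could be sketched roughly as follows. `Probe` and `Holder` are hypothetical stand-ins for the allocator and the container internals, not libstdc++ types; the point is only that constructing on top of poisoned storage makes a missing value-initialization visible instead of relying on a stack that happens to contain zeros.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical stand-in for an allocator-like type with a trivial default
// constructor, so only value-initialization zeroes `state`.
struct Probe
{
  int state;
};

// Hypothetical holder that value-initializes its member, as the container
// is expected to do with its allocator.
struct Holder
{
  Probe p;
  Holder() : p() { }
};

// Construct a Holder on top of non-zero bytes and report p.state.
int state_after_construction()
{
  alignas(Holder) unsigned char buf[sizeof(Holder)];
  std::memset(buf, 0xFF, sizeof buf);   // poison the storage
  Holder* h = new (buf) Holder;         // placement new into the poisoned bytes
  int s = h->p.state;
  h->~Holder();
  return s;
}
```

If `Holder` dropped the `p()` initializer, `state` would be left indeterminate and the check would no longer be guaranteed to pass, which is the situation the test is meant to catch.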

Re: [v3] Fix cross compilation to Solaris

2017-06-01 Thread Jonathan Wakely


On 30/05/17 15:10 +0200, Rainer Orth wrote:

I recently tried a cross-build from sparc-sun-solaris2.12 to
i386-pc-solaris2.12 (with cross-binutils and gas, but the native ld
which has been a cross-linker for quite some time).  The build failed in
libstdc++-v3 like this:

/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++/new_opa.cc:62:46: error: 
‘void* aligned_alloc(std::size_t, std::size_t)’ conflicts with a previous 
declaration
aligned_alloc (std::size_t al, std::size_t sz)
 ^
In file included from /var/gcc/sysroot/i386/usr/include/stdlib.h:13:0,
from 
/var/gcc/cross/i386-pc-solaris2.12/obj/gcc-8.0.0-20170516/12-gcc-gas/i386-pc-solaris2.12/libstdc++-v3/include/cstdlib:75,
from 
/var/gcc/cross/i386-pc-solaris2.12/obj/gcc-8.0.0-20170516/12-gcc-gas/i386-pc-solaris2.12/libstdc++-v3/include/stdlib.h:36,
from 
/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++/new_opa.cc:27:
/var/gcc/sysroot/i386/usr/include/iso/stdlib_c99.h:79:14: note: previous 
declaration ‘void* std::aligned_alloc(std::size_t, std::size_t)’
extern void *aligned_alloc(size_t, size_t);
 ^
/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++/new_opa.cc: In function 
‘void* operator new(std::size_t, std::align_val_t)’:
/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++/new_opa.cc:103:57: error: call of 
overloaded ‘aligned_alloc(std::size_t&, std::size_t&)’ is ambiguous
  while (__builtin_expect ((p = aligned_alloc (align, sz)) == 0, false))
^
In file included from /var/gcc/sysroot/i386/usr/include/stdlib.h:13:0,
from 
/var/gcc/cross/i386-pc-solaris2.12/obj/gcc-8.0.0-20170516/12-gcc-gas/i386-pc-solaris2.12/libstdc++-v3/include/cstdlib:75,
from 
/var/gcc/cross/i386-pc-solaris2.12/obj/gcc-8.0.0-20170516/12-gcc-gas/i386-pc-solaris2.12/libstdc++-v3/include/stdlib.h:36,
from 
/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++/new_opa.cc:27:
/var/gcc/sysroot/i386/usr/include/iso/stdlib_c99.h:79:14: note: candidate: 
‘void* std::aligned_alloc(std::size_t, std::size_t)’
extern void *aligned_alloc(size_t, size_t);
 ^
/vol/gcc/src/hg/trunk/local/libstdc++-v3/libsupc++/new_opa.cc:62:1: note: 
candidate: ‘void* aligned_alloc(std::size_t, std::size_t)’
aligned_alloc (std::size_t al, std::size_t sz)
^
make[4]: *** [Makefile:936: new_opa.lo] Error 1

It turns out that currently crossconfig.m4 has a static list of
AC_DEFINEs for *-solaris*, which is obviously incomplete compared to the
target's features.  Instead of manually fixing this, it seems the way to
go is follow the lead of the Linux etc. targets and just perform the
link tests which are skipped in configure.ac for the !GLIBCXX_IS_NATIVE
case since they *do* work reliably in this case.  This is just what this
patch does: treat Solaris like the Linux targets and remove the
hardcoded feature list from crossconfig.m4.

This way, config.h is identical between a native build and the cross
above, with the exception of HAVE_SETENV which is equally guarded with
GLIBCXX_IS_NATIVE in acinclude.m4 (GLIBCXX_CONFIGURE_TESTSUITE).  Maybe
it's time to somehow refine the GLIBCXX_IS_NATIVE check to allow cross
configurations that *can* perform link tests to run them?


Sounds like a good idea, although I don't know how to do that.



With the patch, the cross-build succeeded without issues.

Ok for mainline?


OK, thanks.

[PATCH 2/7] [ARC] Define ADDITIONAL_REGISTER_NAMES.

2017-06-01 Thread Claudiu Zissulescu

This macro is needed for the additional register names to be accepted by the
-ffixed- option and by inline asm.

gcc/
2017-01-09  Claudiu Zissulescu  

* config/arc/arc.h (ADDITIONAL_REGISTER_NAMES): Define.
---
 gcc/config/arc/arc.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 16d5319..585e98c 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -1262,6 +1262,13 @@ extern char rname56[], rname57[], rname58[], rname59[];
   "lp_start", "lp_end" \
 }
 
+#define ADDITIONAL_REGISTER_NAMES  \
+{  \
+  {"ilink",  29},  \
+  {"r29",29},  \
+  {"r30",30}   \
+}
+
 /* Entry to the insn conditionalizer.  */
 #define FINAL_PRESCAN_INSN(INSN, OPVEC, NOPERANDS) \
   arc_final_prescan_insn (INSN, OPVEC, NOPERANDS)
-- 
1.9.1

[PATCH 0/7] [ARC] Bug fixing, add support for naked functions

2017-06-01 Thread Claudiu Zissulescu

From: claziss 

Hi,

The first patch adds support for 'naked' functions to the ARC compiler; a
number of tests are also added.

The second patch defines additional register name aliases to be
handled by inline asm code or -ffixed register commands.

The following two patches are in the context of improving LRA support
for the ARC backend. Thus, one patch cleans up the ARC tests, and the
second one removes the use of subregs during expand, which was a bad
idea anyway.

The next patch enables indexed loads and automodify ld/st
instructions for the elf target. These options already existed, but I
made them the default for the elf target for the time being.

The PIC patch refactors the PIC implementation, because we are
increasingly seeing errors caused by the old implementation. New tests
are added as well.

The last patch deprecates the mexpand-adddi option; this can also be
seen in the context of LRA, where emitting subregs during expand
is not a very good thing to do.

Please let me know if you have any questions.

Thank you,
Claudiu

Claudiu Zissulescu (6):
  [ARC] Add support for naked functions.
  [ARC] Define ADDITIONAL_REGISTER_NAMES.
  [ARC] [LRA] Fix tests asm constraints.
  [ARC] [LRA] Avoid emitting COND_EXEC during expand.
  [ARC] Enable indexed loads for elf targets.
  [ARC] Consolidate PIC implementation.

claziss (1):
  [ARC] Deprecate mexpand-adddi option.

 gcc/config/arc/arc-protos.h  |   8 +-
 gcc/config/arc/arc.c | 315 ---
 gcc/config/arc/arc.h |  58 +++--
 gcc/config/arc/arc.md|  90 +++
 gcc/config/arc/arc.opt   |   6 +-
 gcc/config/arc/constraints.md|   6 +-
 gcc/config/arc/elf.h |   8 +
 gcc/config/arc/linux.h   |   8 +
 gcc/doc/invoke.texi  |   2 +-
 gcc/testsuite/gcc.target/arc/mulsi3_highpart-1.c |   2 +-
 gcc/testsuite/gcc.target/arc/mulsi3_highpart-2.c |   2 +-
 gcc/testsuite/gcc.target/arc/naked-1.c   |  18 ++
 gcc/testsuite/gcc.target/arc/naked-2.c   |  26 ++
 gcc/testsuite/gcc.target/arc/pic-1.c |  11 +
 gcc/testsuite/gcc.target/arc/pr9000674901.c  |  58 +
 gcc/testsuite/gcc.target/arc/pr9001191897.c  |  10 +
 16 files changed, 397 insertions(+), 231 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/naked-1.c
 create mode 100644 gcc/testsuite/gcc.target/arc/naked-2.c
 create mode 100644 gcc/testsuite/gcc.target/arc/pic-1.c
 create mode 100644 gcc/testsuite/gcc.target/arc/pr9000674901.c
 create mode 100644 gcc/testsuite/gcc.target/arc/pr9001191897.c

-- 
1.9.1

[PATCH 1/7] [ARC] Add support for naked functions.

2017-06-01 Thread Claudiu Zissulescu

gcc/
2016-12-13  Claudiu Zissulescu  
Andrew Burgess  

* config/arc/arc-protos.h (arc_compute_function_type): Change prototype.
(arc_return_address_register): New function.
* config/arc/arc.c (arc_handle_fndecl_attribute): New function.
(arc_handle_fndecl_attribute): Add naked attribute.
(TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS): Define.
(TARGET_WARN_FUNC_RETURN): Likewise.
(arc_allocate_stack_slots_for_args): New function.
(arc_warn_func_return): Likewise.
(machine_function): Change type fn_type.
(arc_compute_function_type): Consider new naked function type,
change function return type.
(arc_must_save_register): Adapt to handle new
arc_compute_function_type's return type.
(arc_expand_prologue): Likewise.
(arc_expand_epilogue): Likewise.
(arc_return_address_regs): Delete.
(arc_return_address_register): New function.
(arc_epilogue_uses): Use above function.
* config/arc/arc.h (arc_return_address_regs): Delete prototype.
(arc_function_type): Change encoding, add naked type.
(ARC_INTERRUPT_P): Change to handle the new encoding.
(ARC_FAST_INTERRUPT_P): Likewise.
(ARC_NORMAL_P): Define.
(ARC_NAKED_P): Likewise.
(arc_compute_function_type): Delete prototype.
* config/arc/arc.md (in_ret_delay_slot): Use
arc_return_address_register function.
(simple_return): Likewise.
(p_return_i): Likewise.

gcc/testsuite
2016-12-13  Claudiu Zissulescu  
Andrew Burgess  

* gcc.target/arc/naked-1.c: New file.
* gcc.target/arc/naked-2.c: Likewise.
---
 gcc/config/arc/arc-protos.h|   6 +-
 gcc/config/arc/arc.c   | 165 -
 gcc/config/arc/arc.h   |  40 +---
 gcc/config/arc/arc.md  |  10 +-
 gcc/testsuite/gcc.target/arc/naked-1.c |  18 
 gcc/testsuite/gcc.target/arc/naked-2.c |  26 ++
 6 files changed, 197 insertions(+), 68 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/naked-1.c
 create mode 100644 gcc/testsuite/gcc.target/arc/naked-2.c

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index 4ff8e9b..b436dbe 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -45,12 +45,10 @@ extern void arc_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 extern void arc_split_compare_and_swap (rtx *);
 extern void arc_expand_compare_and_swap (rtx *);
 extern bool compact_memory_operand_p (rtx, machine_mode, bool, bool);
+extern int arc_return_address_register (unsigned int);
+extern unsigned int arc_compute_function_type (struct function *);
 #endif /* RTX_CODE */
 
-#ifdef TREE_CODE
-extern enum arc_function_type arc_compute_function_type (struct function *);
-#endif /* TREE_CODE */
-
 extern bool arc_ccfsm_branch_deleted_p (void);
 extern void arc_ccfsm_record_branch_deleted (void);
 
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index a65fc3a..7dfc68e 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -211,6 +211,7 @@ static int rgf_banked_register_count;
 static int get_arc_condition_code (rtx);
 
 static tree arc_handle_interrupt_attribute (tree *, tree, tree, int, bool *);
+static tree arc_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
 
 /* Initialized arc_attribute_table to NULL since arc doesnot have any
machine specific supported attributes.  */
@@ -229,6 +230,9 @@ const struct attribute_spec arc_attribute_table[] =
   /* And these functions are always known to reside within the 21 bit
  addressing range of blcc.  */
   { "short_call",   0, 0, false, true,  true,  NULL, false },
+  /* Function which are not having the prologue and epilogue generated
+ by the compiler.  */
+  { "naked", 0, 0, true, false, false, arc_handle_fndecl_attribute, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 static int arc_comp_type_attributes (const_tree, const_tree);
@@ -513,6 +517,12 @@ static void arc_finalize_pic (void);
 #define TARGET_DIFFERENT_ADDR_DISPLACEMENT_P hook_bool_void_true
 #define TARGET_SPILL_CLASS arc_spill_class
 
+#undef TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS
+#define TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS arc_allocate_stack_slots_for_args
+
+#undef TARGET_WARN_FUNC_RETURN
+#define TARGET_WARN_FUNC_RETURN arc_warn_func_return
+
 #include "target-def.h"
 
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -1859,6 +1869,42 @@ arc_handle_interrupt_attribute (tree *, tree name, tree args, int,
   return NULL_TREE;
 }
 
+static tree
+arc_handle_fndecl_attribute (tree *node, tree name, tree args ATTRIBUTE_UNUSED,
+int flags ATTRIBUTE_UNUSED, bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+{
+  warning (OPT_Wattributes, "%qE attribute only applies to functions",
+  name);
+  *no_add_attrs

[PATCH 3/7] [ARC] [LRA] Fix tests asm constraints.

2017-06-01 Thread Claudiu Zissulescu

LRA doesn't like the 'X' constraint as used in our tests, remove it.
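
For readers unfamiliar with the idiom these tests rely on, here is a minimal
sketch of what the `id` helper does once the 'X' alternative is removed. Only
`id` mirrors the patch; `mulhigh` is a hypothetical stand-in for the code
under test and is not taken from the testcases.

```cpp
#include <cassert>

// Identity function: the empty asm claims to read and modify `i` in a
// general register ("+r"), so the optimizer cannot constant-fold through it.
static int id(int i)
{
  asm("" : "+r"(i));
  return i;
}

// Hypothetical stand-in for the code under test: the high 32 bits of the
// signed 64-bit product of two ints.
static int mulhigh(int a, int b)
{
  return static_cast<int>((static_cast<long long>(a) * b) >> 32);
}
```

Calling `mulhigh (id (x), id (y))` keeps the multiplication in the generated
code even when `x` and `y` are compile-time constants, which is presumably
why the tests route their operands through `id`.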

gcc/testsuite
2017-01-09  Claudiu Zissulescu  

* gcc.target/arc/mulsi3_highpart-1.c: Remove 'X' constraint.
* gcc.target/arc/mulsi3_highpart-2.c: Likewise.
---
 gcc/testsuite/gcc.target/arc/mulsi3_highpart-1.c | 2 +-
 gcc/testsuite/gcc.target/arc/mulsi3_highpart-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arc/mulsi3_highpart-1.c b/gcc/testsuite/gcc.target/arc/mulsi3_highpart-1.c
index 57cb95b..5fd6c36 100644
--- a/gcc/testsuite/gcc.target/arc/mulsi3_highpart-1.c
+++ b/gcc/testsuite/gcc.target/arc/mulsi3_highpart-1.c
@@ -7,7 +7,7 @@
 static int
 id (int i)
 {
-  asm ("": "+Xr" (i));
+  asm ("": "+r" (i));
   return i;
 }
 
diff --git a/gcc/testsuite/gcc.target/arc/mulsi3_highpart-2.c b/gcc/testsuite/gcc.target/arc/mulsi3_highpart-2.c
index 287d96d..6ec4bc5 100644
--- a/gcc/testsuite/gcc.target/arc/mulsi3_highpart-2.c
+++ b/gcc/testsuite/gcc.target/arc/mulsi3_highpart-2.c
@@ -9,7 +9,7 @@
 static int
 id (int i)
 {
-  asm ("": "+Xr" (i));
+  asm ("": "+r" (i));
   return i;
 }
 
-- 
1.9.1

[PATCH 4/7] [ARC] [LRA] Avoid emitting COND_EXEC during expand.

2017-06-01 Thread Claudiu Zissulescu

Emitting COND_EXEC rtxes during expand does not always work.

gcc/
2017-01-10  Claudiu Zissulescu  

* config/arc/arc.md (clzsi2): Expand to an arc_clzsi2 instruction
that also clobbers the CC register. The old expand code is moved
to ...
(*arc_clzsi2): ... here.
(ctzsi2): Expand to an arc_ctzsi2 instruction that also clobbers
the CC register. The old expand code is moved to ...
(arc_ctzsi2): ... here.
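
As a rough illustration of the arithmetic behind the ctzsi2 split, the sketch
below counts trailing zeros from the mask `(x - 1) & ~x`. This assumes the
`addsi3`/`bic_f_zn` pair in the diff computes exactly that mask (bic taken as
"and with complement"). The function name `ctz32_sketch` is hypothetical, and
`__builtin_clz` stands in for the norm-based sequence, which is not
reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros of a 32-bit value; returns 32 for x == 0.
unsigned ctz32_sketch(std::uint32_t x)
{
  // Ones exactly in the positions of x's trailing zeros.
  const std::uint32_t mask = (x - 1u) & ~x;
  if (mask == 0)
    return 0;                 // x is odd: no trailing zeros
  // The mask has (32 - clz) low bits set, which is the trailing-zero count.
  return 32u - static_cast<unsigned>(__builtin_clz(mask));
}
```

The explicit `mask == 0` branch exists only because `__builtin_clz (0)` is
undefined; the real pattern handles the corner cases with conditionally
executed instructions after reload instead.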
---
 gcc/config/arc/arc.md | 41 ++---
 1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 39bcc26..928feb1 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -4533,9 +4533,21 @@
(set_attr "type" "two_cycle_core,two_cycle_core")])
 
 (define_expand "clzsi2"
-  [(set (match_operand:SI 0 "dest_reg_operand" "")
-   (clz:SI (match_operand:SI 1 "register_operand" "")))]
+  [(parallel
+[(set (match_operand:SI 0 "register_operand" "")
+ (clz:SI (match_operand:SI 1 "register_operand" "")))
+ (clobber (match_dup 2))])]
+  "TARGET_NORM"
+  "operands[2] = gen_rtx_REG (CC_ZNmode, CC_REG);")
+
+(define_insn_and_split "*arc_clzsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (clz:SI (match_operand:SI 1 "register_operand" "r")))
+   (clobber (reg:CC_ZN CC_REG))]
   "TARGET_NORM"
+  "#"
+  "reload_completed"
+  [(const_int 0)]
 {
   emit_insn (gen_norm_f (operands[0], operands[1]));
   emit_insn
@@ -4552,9 +4564,23 @@
 })
 
 (define_expand "ctzsi2"
-  [(set (match_operand:SI 0 "register_operand" "")
-   (ctz:SI (match_operand:SI 1 "register_operand" "")))]
+  [(match_operand:SI 0 "register_operand" "")
+   (match_operand:SI 1 "register_operand" "")]
   "TARGET_NORM"
+  "
+  emit_insn (gen_arc_ctzsi2 (operands[0], operands[1]));
+  DONE;
+")
+
+(define_insn_and_split "arc_ctzsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (ctz:SI (match_operand:SI 1 "register_operand" "r")))
+   (clobber (reg:CC_ZN CC_REG))
+   (clobber (match_scratch:SI 2 "=&r"))]
+  "TARGET_NORM"
+  "#"
+  "reload_completed"
+  [(const_int 0)]
 {
   rtx temp = operands[0];
 
@@ -4562,10 +4588,10 @@
   || (REGNO (temp) < FIRST_PSEUDO_REGISTER
  && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS],
 REGNO (temp))))
-temp = gen_reg_rtx (SImode);
+temp = operands[2];
   emit_insn (gen_addsi3 (temp, operands[1], constm1_rtx));
   emit_insn (gen_bic_f_zn (temp, temp, operands[1]));
-  emit_insn (gen_clrsbsi2 (temp, temp));
+  emit_insn (gen_clrsbsi2 (operands[0], temp));
   emit_insn
 (gen_rtx_COND_EXEC
   (VOIDmode,
@@ -4575,7 +4601,8 @@
 (gen_rtx_COND_EXEC
   (VOIDmode,
gen_rtx_GE (VOIDmode, gen_rtx_REG (CC_ZNmode, CC_REG), const0_rtx),
-   gen_rtx_SET (operands[0], gen_rtx_MINUS (SImode, GEN_INT (31), temp))));
+   gen_rtx_SET (operands[0], gen_rtx_MINUS (SImode, GEN_INT (31),
+   operands[0]))));
   DONE;
 })
 
-- 
1.9.1

[PATCH 6/7] [ARC] Deprecate mexpand-adddi option.

2017-06-01 Thread Claudiu Zissulescu

From: claziss 

Emitting subregs in the expand is not a good idea. Deprecate this
option.
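
For context, the expansion being removed splits a 64-bit addition into an
add of the low words that sets carry (add.f) followed by an add-with-carry
of the high words (adc). A minimal sketch of that arithmetic, with the
hypothetical names `Halves` and `add64_by_halves`, might look like this:

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit words, as on a 32-bit target.
struct Halves
{
  std::uint32_t lo;
  std::uint32_t hi;
};

Halves add64_by_halves(Halves a, Halves b)
{
  Halves r;
  r.lo = a.lo + b.lo;                         // add.f: low words, sets carry
  const std::uint32_t carry = r.lo < a.lo;    // carry out of the low add
  r.hi = a.hi + b.hi + carry;                 // adc: high words plus carry
  return r;
}
```

The sketch only shows the value computation; the problem named above is
about how the word halves are represented as subregs at expand time, which a
C++ model does not capture.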

gcc/
2017-04-26  Claudiu Zissulescu  

* config/arc/arc.md (adddi3): Remove support for mexpand-adddi
option.
(subdi3): Likewise.
* config/arc/arc.opt (mexpand-adddi): Deprecate it.
* doc/invoke.texi (mexpand-adddi): Update text.
---
 gcc/config/arc/arc.md  | 39 +--
 gcc/config/arc/arc.opt |  2 +-
 gcc/doc/invoke.texi|  2 +-
 3 files changed, 3 insertions(+), 40 deletions(-)

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 928feb1..f595da7 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -2649,30 +2649,7 @@
(match_operand:DI 2 "nonmemory_operand" "")))
  (clobber (reg:CC CC_REG))])]
   ""
-{
-  if (TARGET_EXPAND_ADDDI)
-{
-  rtx l0 = gen_lowpart (SImode, operands[0]);
-  rtx h0 = disi_highpart (operands[0]);
-  rtx l1 = gen_lowpart (SImode, operands[1]);
-  rtx h1 = disi_highpart (operands[1]);
-  rtx l2 = gen_lowpart (SImode, operands[2]);
-  rtx h2 = disi_highpart (operands[2]);
-  rtx cc_c = gen_rtx_REG (CC_Cmode, CC_REG);
-
-  if (CONST_INT_P (h2) && INTVAL (h2) < 0 && SIGNED_INT12 (INTVAL (h2)))
-   {
- emit_insn (gen_sub_f (l0, l1, gen_int_mode (-INTVAL (l2), SImode)));
- emit_insn (gen_sbc (h0, h1,
- gen_int_mode (-INTVAL (h2) - (l1 != 0), SImode),
- cc_c));
- DONE;
-   }
-  emit_insn (gen_add_f (l0, l1, l2));
-  emit_insn (gen_adc (h0, h1, h2));
-  DONE;
-}
-})
+{})
 
 ; This assumes that there can be no strictly partial overlap between
 ; operands[1] and operands[2].
@@ -2911,20 +2888,6 @@
 {
   if (!register_operand (operands[2], DImode))
 operands[1] = force_reg (DImode, operands[1]);
-  if (TARGET_EXPAND_ADDDI)
-{
-  rtx l0 = gen_lowpart (SImode, operands[0]);
-  rtx h0 = disi_highpart (operands[0]);
-  rtx l1 = gen_lowpart (SImode, operands[1]);
-  rtx h1 = disi_highpart (operands[1]);
-  rtx l2 = gen_lowpart (SImode, operands[2]);
-  rtx h2 = disi_highpart (operands[2]);
-  rtx cc_c = gen_rtx_REG (CC_Cmode, CC_REG);
-
-  emit_insn (gen_sub_f (l0, l1, l2));
-  emit_insn (gen_sbc (h0, h1, h2, cc_c));
-  DONE;
-}
 })
 
 (define_insn_and_split "subdi3_i"
diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
index ed2b827..ad2df26 100644
--- a/gcc/config/arc/arc.opt
+++ b/gcc/config/arc/arc.opt
@@ -328,7 +328,7 @@ Target Var(TARGET_Q_CLASS)
 Enable 'q' instruction alternatives.
 
 mexpand-adddi
-Target Var(TARGET_EXPAND_ADDDI)
+Target Warn(%qs is deprecated)
 Expand adddi3 and subdi3 at rtl generation time into add.f / adc etc.
 
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 59563aa..b6cf4ce 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14823,7 +14823,7 @@ Enable pre-reload use of the @code{cbranchsi} pattern.
 @item -mexpand-adddi
 @opindex mexpand-adddi
 Expand @code{adddi3} and @code{subdi3} at RTL generation time into
-@code{add.f}, @code{adc} etc.
+@code{add.f}, @code{adc} etc.  This option is deprecated.
 
 @item -mindexed-loads
 @opindex mindexed-loads
-- 
1.9.1

[PATCH 5/7] [ARC] Enable indexed loads for ELF targets.

2017-06-01 Thread Claudiu Zissulescu

gcc/
2017-02-28  Claudiu Zissulescu  

* config/arc/arc.opt (mindexed-loads): Use initial value
TARGET_INDEXED_LOADS_DEFAULT.
(mauto-modify-reg): Use initial value
TARGET_AUTO_MODIFY_REG_DEFAULT.
* config/arc/elf.h (TARGET_INDEXED_LOADS_DEFAULT): Define.
(TARGET_AUTO_MODIFY_REG_DEFAULT): Likewise.
* config/arc/linux.h (TARGET_INDEXED_LOADS_DEFAULT): Define.
(TARGET_AUTO_MODIFY_REG_DEFAULT): Likewise.
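
The mechanism described by this ChangeLog can be modelled roughly as
follows: a target header supplies the default, the option variable is
initialized from it, and an explicit command-line flag still overrides it.
This is only a sketch; `SKETCH_LINUX_TARGET`, `Options` and `parse_options`
are hypothetical names, not part of GCC.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the per-target headers: elf.h defines the default as 1,
// linux.h as 0. SKETCH_LINUX_TARGET is invented for this sketch.
#ifdef SKETCH_LINUX_TARGET
#define TARGET_INDEXED_LOADS_DEFAULT 0
#else
#define TARGET_INDEXED_LOADS_DEFAULT 1
#endif

struct Options
{
  // Plays the role of Var(TARGET_INDEXED_LOADS) Init(..._DEFAULT).
  int indexed_loads = TARGET_INDEXED_LOADS_DEFAULT;
};

// An explicit flag on the command line wins over the target default.
Options parse_options(int argc, const char *const *argv)
{
  Options opts;
  for (int i = 0; i < argc; ++i)
    {
      if (std::strcmp(argv[i], "-mindexed-loads") == 0)
        opts.indexed_loads = 1;
      else if (std::strcmp(argv[i], "-mno-indexed-loads") == 0)
        opts.indexed_loads = 0;
    }
  return opts;
}
```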
---
 gcc/config/arc/arc.opt | 4 ++--
 gcc/config/arc/elf.h   | 8 
 gcc/config/arc/linux.h | 8 
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt
index f01a2ff..ed2b827 100644
--- a/gcc/config/arc/arc.opt
+++ b/gcc/config/arc/arc.opt
@@ -270,11 +270,11 @@ Target RejectNegative Var(arc_tune, TUNE_ARC700_4_2_XMAC)
 Tune for ARC700 R4.2 Cpu with XMAC block.
 
 mindexed-loads
-Target Var(TARGET_INDEXED_LOADS)
+Target Var(TARGET_INDEXED_LOADS) Init(TARGET_INDEXED_LOADS_DEFAULT)
 Enable the use of indexed loads.
 
 mauto-modify-reg
-Target Var(TARGET_AUTO_MODIFY_REG)
+Target Var(TARGET_AUTO_MODIFY_REG) Init(TARGET_AUTO_MODIFY_REG_DEFAULT)
 Enable the use of pre/post modify with register displacement.
 
 mmul32x16
diff --git a/gcc/config/arc/elf.h b/gcc/config/arc/elf.h
index c5794f8..43f3408 100644
--- a/gcc/config/arc/elf.h
+++ b/gcc/config/arc/elf.h
@@ -58,3 +58,11 @@ along with GCC; see the file COPYING3.  If not see
 /* Bare-metal toolchains do not need a thread pointer register.  */
 #undef TARGET_ARC_TP_REGNO_DEFAULT
 #define TARGET_ARC_TP_REGNO_DEFAULT -1
+
+/* Indexed loads are default.  */
+#undef TARGET_INDEXED_LOADS_DEFAULT
+#define TARGET_INDEXED_LOADS_DEFAULT 1
+
+/* Pre/post modify with register displacement are default.  */
+#undef TARGET_AUTO_MODIFY_REG_DEFAULT
+#define TARGET_AUTO_MODIFY_REG_DEFAULT 1
diff --git a/gcc/config/arc/linux.h b/gcc/config/arc/linux.h
index 83e5a1d..d8e0063 100644
--- a/gcc/config/arc/linux.h
+++ b/gcc/config/arc/linux.h
@@ -83,3 +83,11 @@ along with GCC; see the file COPYING3.  If not see
 #define SUBTARGET_CPP_SPEC "\
%{pthread:-D_REENTRANT} \
 "
+
+/* Indexed loads are default off.  */
+#undef TARGET_INDEXED_LOADS_DEFAULT
+#define TARGET_INDEXED_LOADS_DEFAULT 0
+
+/* Pre/post modify with register displacement are default off.  */
+#undef TARGET_AUTO_MODIFY_REG_DEFAULT
+#define TARGET_AUTO_MODIFY_REG_DEFAULT 0
-- 
1.9.1

[PATCH 7/7] [ARC] Consolidate PIC implementation.

2017-06-01 Thread Claudiu Zissulescu

This patch refactors a number of functions and compiler hooks to use a
single function that checks whether an rtx is suitable for PIC. The removed
functions arc_legitimate_pc_offset_p and arc_legitimate_pic_operand_p are
replaced by calls to arc_legitimate_pic_addr_p, so we have a uniform way of
checking whether an rtx is PIC.

gcc/
2017-02-24  Claudiu Zissulescu  

* config/arc/arc-protos.h (arc_legitimate_pc_offset_p): Remove
proto.
(arc_legitimate_pic_operand_p): Likewise.
* config/arc/arc.c (arc_legitimate_pic_operand_p): Remove
function.
(arc_needs_pcl_p): Likewise.
(arc_legitimate_pc_offset_p): Likewise.
(arc_legitimate_pic_addr_p): Remove LABEL_REF case, as this
function is also used in constraints.md.
(arc_legitimate_constant_p): Use arc_legitimate_pic_addr_p to
validate pic constants. Handle CONST_INT, CONST_DOUBLE, MINUS and
PLUS.  Only return true/false in known cases, otherwise assert.
(arc_legitimate_address_p): Remove arc_legitimate_pic_addr_p as it
is already called in arc_legitimate_constant_p.
* config/arc/arc.h (CONSTANT_ADDRESS_P): Consider also LABEL for
pic addresses.
(LEGITIMATE_PIC_OPERAND_P): Use
arc_raw_symbolic_reference_mentioned_p function.
* config/arc/constraints.md (Cpc): Use arc_legitimate_pic_addr_p
function.
(Cal): Likewise.
(C32): Likewise.

gcc/testsuite
2017-02-24  Claudiu Zissulescu  

* gcc.target/arc/pr9000674901.c: New file.
* gcc.target/arc/pic-1.c: Likewise.
* gcc.target/arc/pr9001191897.c: Likewise.
---
 gcc/config/arc/arc-protos.h |   2 -
 gcc/config/arc/arc.c| 150 +---
 gcc/config/arc/arc.h|  11 +-
 gcc/config/arc/constraints.md   |   6 +-
 gcc/testsuite/gcc.target/arc/pic-1.c|  11 ++
 gcc/testsuite/gcc.target/arc/pr9000674901.c |  58 +++
 gcc/testsuite/gcc.target/arc/pr9001191897.c |  10 ++
 7 files changed, 136 insertions(+), 112 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/pic-1.c
 create mode 100644 gcc/testsuite/gcc.target/arc/pr9000674901.c
 create mode 100644 gcc/testsuite/gcc.target/arc/pr9001191897.c

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index b436dbe..850795a 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -60,10 +60,8 @@ extern rtx arc_return_addr_rtx (int , rtx);
 extern bool check_if_valid_regno_const (rtx *, int);
 extern bool check_if_valid_sleep_operand (rtx *, int);
 extern bool arc_legitimate_constant_p (machine_mode, rtx);
-extern bool arc_legitimate_pc_offset_p (rtx);
 extern bool arc_legitimate_pic_addr_p (rtx);
 extern bool arc_raw_symbolic_reference_mentioned_p (rtx, bool);
-extern bool arc_legitimate_pic_operand_p (rtx);
 extern bool arc_is_longcall_p (rtx);
 extern bool arc_is_shortcall_p (rtx);
 extern bool valid_brcc_with_delay_p (rtx *);
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 7dfc68e..89de6cd 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -249,7 +249,6 @@ static rtx arc_expand_builtin (tree, rtx, rtx, machine_mode, int);
 static int branch_dest (rtx);
 
 static void  arc_output_pic_addr_const (FILE *,  rtx, int);
-bool arc_legitimate_pic_operand_p (rtx);
 static bool arc_function_ok_for_sibcall (tree, tree);
 static rtx arc_function_value (const_tree, const_tree, bool);
 const char * output_shift (rtx *);
@@ -5152,57 +5151,6 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
 }
 }
 
-/* Helper used by arc_legitimate_pc_offset_p.  */
-
-static bool
-arc_needs_pcl_p (rtx x)
-{
-  register const char *fmt;
-  register int i, j;
-
-  if ((GET_CODE (x) == UNSPEC)
-  && (XVECLEN (x, 0) == 1)
-  && (GET_CODE (XVECEXP (x, 0, 0)) == SYMBOL_REF))
-switch (XINT (x, 1))
-  {
-  case ARC_UNSPEC_GOT:
-  case ARC_UNSPEC_GOTOFFPC:
-  case UNSPEC_TLS_GD:
-  case UNSPEC_TLS_IE:
-   return true;
-  default:
-   break;
-  }
-
-  fmt = GET_RTX_FORMAT (GET_CODE (x));
-  for (i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; i--)
-{
-  if (fmt[i] == 'e')
-   {
- if (arc_needs_pcl_p (XEXP (x, i)))
-   return true;
-   }
-  else if (fmt[i] == 'E')
-   for (j = XVECLEN (x, i) - 1; j >= 0; j--)
- if (arc_needs_pcl_p (XVECEXP (x, i, j)))
-   return true;
-}
-
-  return false;
-}
-
-/* Return true if ADDR is an address that needs to be expressed as an
-   explicit sum of pcl + offset.  */
-
-bool
-arc_legitimate_pc_offset_p (rtx addr)
-{
-  if (GET_CODE (addr) != CONST)
-return false;
-
-  return arc_needs_pcl_p (addr);
-}
-
 /* Return true if ADDR is a valid pic address.
A valid pic address on arc should look like
const (unspec (SYMBOL_REF/LABEL) (ARC_UNSPEC_GOTOFF/ARC_UNSPEC_GOT))  */
@@ -5210,8 +5

Re: [PATCH] DWARF: add DW_AT_location for global decls with DECL_VALUE_EXPR

2017-06-01 Thread Richard Biener

On Thu, Jun 1, 2017 at 11:28 AM, Pierre-Marie de Rodat
 wrote:
> Hi,
>
> In GNAT, we materialize renamings that cannot be described in standard
> DWARF as synthetic variables that describe how to fetch the renamed
> object.  Look for "___XR" in gcc/ada/exp_dbug.ads for more details about
> this convention.
>
> In order to have a location for these variables in the debug info (GDB
> requires it not to discard the variable) but also to avoid allocating
> runtime space for them, we make these variable hold a DECL_VALUE_EXPR
> tree.  However, since GCC 7, the DWARF back-end no longer generates a
> DW_AT_location attribute for those.  This patch is an attempt to restore
> this attribute.
>
> Bootstrapped and reg-tested on x86_64-linux.  Also, I have a ~150 bytes
> increase in the size of cc1, cc1plus and gnat1 (each of these is ~200MB
> large).  Ok to commit?  Thank you in advance!

Ok.

Richard.

> gcc/
>
> * dwarf2out.c (dwarf2out_late_global_decl): Add locations for
> symbols that hold a DECL_VALUE_EXPR.
>
> gcc/testsuite/
>
> * debug12.adb, debug12.ads: New testcase.
> ---
>  gcc/dwarf2out.c   | 5 +++--
>  gcc/testsuite/gnat.dg/debug12.adb | 9 +
>  gcc/testsuite/gnat.dg/debug12.ads | 8 
>  3 files changed, 20 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gnat.dg/debug12.adb
>  create mode 100644 gcc/testsuite/gnat.dg/debug12.ads
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 5ff45eb4efd..013c902bc89 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -25526,9 +25526,10 @@ dwarf2out_late_global_decl (tree decl)
> {
>   /* We get called via the symtab code invoking late_global_decl
>  for symbols that are optimized out.  Do not add locations
> -for those.  */
> +for those, except if they have a DECL_VALUE_EXPR, in which case
> +they are relevant for debuggers.  */
>   varpool_node *node = varpool_node::get (decl);
> - if (! node || ! node->definition)
> + if ((! node || ! node->definition) && ! DECL_HAS_VALUE_EXPR_P (decl))
> tree_add_const_value_attribute_for_decl (die, decl);
>   else
> add_location_or_const_value_attribute (die, decl, false);
> diff --git a/gcc/testsuite/gnat.dg/debug12.adb b/gcc/testsuite/gnat.dg/debug12.adb
> new file mode 100644
> index 000..07175968703
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/debug12.adb
> @@ -0,0 +1,9 @@
> +--  { dg-options "-cargs -gdwarf-4 -fdebug-types-section -dA -margs" }
> +--  { dg-final { scan-assembler-times "DW_AT_location" 4 } }
> +
> +package body Debug12 is
> +   function Get_A2 return Boolean is
> +   begin
> +  return A2;
> +   end Get_A2;
> +end Debug12;
> diff --git a/gcc/testsuite/gnat.dg/debug12.ads b/gcc/testsuite/gnat.dg/debug12.ads
> new file mode 100644
> index 000..dbc5896cc73
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/debug12.ads
> @@ -0,0 +1,8 @@
> +package Debug12 is
> +   type Bit_Array is array (Positive range <>) of Boolean
> +  with Pack;
> +   A  : Bit_Array := (1 .. 10 => False);
> +   A2 : Boolean renames A (2);
> +
> +   function Get_A2 return Boolean;
> +end Debug12;
> --
> 2.13.0
>

Re: [PATCH][AArch64] Allow const0_rtx operand for atomic compare-exchange patterns

2017-06-01 Thread Kyrill Tkachov


Ping.

Thanks,
Kyrill

On 08/05/17 11:59, Kyrill Tkachov wrote:

Ping.

Thanks,
Kyrill

On 24/04/17 10:37, Kyrill Tkachov wrote:

Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01648.html

Thanks,
Kyrill

On 28/02/17 12:29, Kyrill Tkachov wrote:

Hi all,

For the testcase in this patch we currently generate:
foo:
mov w1, 0
ldaxr   w2, [x0]
cmp w2, 3
bne .L2
stxrw3, w1, [x0]
cmp w3, 0
.L2:
csetw0, eq
ret

Note that the STXR could have been storing the WZR register instead of
moving zero into w1.
This is due to overly strict predicates and constraints in the store
exclusive pattern and the atomic compare exchange expanders and splitters.
This simple patch fixes that in the patterns concerned and with it we can
generate:
foo:
ldaxr   w1, [x0]
cmp w1, 3
bne .L2
stxrw2, wzr, [x0]
cmp w2, 0
.L2:
csetw0, eq
ret
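
The new testcase itself is not shown in this message. A source function of
roughly the following shape would exercise the pattern: a compare-exchange
whose desired value is the constant zero. The name `cas_to_zero` is
hypothetical, and a strong compare-exchange is used here so that the result
is deterministic, whereas the single LDAXR/STXR pair above suggests the real
test uses the weak form.

```cpp
#include <cassert>

// If *p equals 3, atomically replace it with 0 and return true;
// otherwise leave *p alone and return false.
bool cas_to_zero(int *p)
{
  int expected = 3;
  return __atomic_compare_exchange_n(p, &expected, 0, /*weak=*/false,
                                     __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}
```

Because the stored value is a literal zero, a relaxed predicate lets the
compiler feed the zero register straight to the store-exclusive instruction
instead of materializing 0 in a scratch register first.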


Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

Thanks,
Kyrill

2017-02-28  Kyrylo Tkachov  

* config/aarch64/atomics.md (atomic_compare_and_swap expander):
Use aarch64_reg_or_zero predicate for operand 4.
(aarch64_compare_and_swap define_insn_and_split):
Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
(aarch64_store_exclusive): Likewise for operand 2.

2017-02-28  Kyrylo Tkachov  

* gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.

Re: [PATCH] DWARF: add DW_AT_location for global decls with DECL_VALUE_EXPR

2017-06-01 Thread Pierre-Marie de Rodat


On 06/01/2017 03:53 PM, Richard Biener wrote:

Ok.

Richard.

Committed. Thank you, Richard!

--
Pierre-Marie de Rodat

Re: [PATCH,DWARF,v2] AIX dwarf2out label fix PING

2017-06-01 Thread David Edelsohn

Ping

https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01440.html

Thanks, David

Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-06-01 Thread Dominique d'Humières


> On 1 June 2017, at 11:30, Dominique d'Humières  wrote:
> 
> 
>> On 31 May 2017, at 21:03, Nicolas Koenig  wrote:
>> 
>> Hello Dominique,
>> 
>> attached is the next try, this time without stupidities (I hope). Both test 
>> cases you posted don't ICE anymore.
>> 
>> Ok for trunk?
>> 
>> Nicolas
>> 
> 
> Preliminary tests look OK, full testing in progress.
> 
> Thanks,
> 
> Dominique
> 

I see

FAIL: gfortran.dg/deferred_character_2.f90   -O1  execution test
FAIL: gfortran.dg/deferred_character_2.f90   -O2  execution test
FAIL: gfortran.dg/deferred_character_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/deferred_character_2.f90   -O3 -g  execution test
FAIL: gfortran.dg/deferred_character_2.f90   -Os  execution test

Dominique

Re: [v3] Fix cross compilation to Solaris

2017-06-01 Thread Rainer Orth

Hi Jonathan,

>>This way, config.h is identical between a native build and the cross
>>above, with the exception of HAVE_SETENV which is equally guarded with
>>GLIBCXX_IS_NATIVE in acinclude.m4 (GLIBCXX_CONFIGURE_TESTSUITE).  Maybe
>>it's time to somehow refine the GLIBCXX_IS_NATIVE check to allow cross
>>configurations that *can* perform link tests to run them?
>
> Sounds like a good idea, although I don't know how to do that.

I'm not sure either, since I couldn't easily find the origin of that
variable.  The earliest ChangeLog entry mentioning it is

2003-08-17  Phil Edwards  

[...]
* configure.ac (GLIBCXX_IS_NATIVE):  Determine earlier and re-order.
Comment out the conditionals for CANADIAN and GLIBCXX_BUILD_LIBMATH
(currently unused).  Strip the fake-VPATH shell fragment from
automake-generated rules, if present.

and it's already present in r70194 for configure.ac, the earliest after
the rename from configure.in.  No idea what happened to earlier history
before that rename: svn should be able to cope with that, I thought.

The !$GLIBCXX_IS_NATIVE branch in configure.ac explains

  # This lets us hard-code the functionality we know we'll have in the cross
  # target environment.  "Let" is a sugar-coated word placed on an especially
  # dull and tedious hack, actually.
  #
  # Here's why GLIBCXX_CHECK_MATH_SUPPORT, and other autoconf macros
  # that involve linking, can't be used:
  #"cannot open sim-crt0.o"
  #"cannot open crt0.o"
  # etc.  All this is because there currently exists no unified, consistent
  # way for top level CC information to be passed down to target directories:
  # newlib includes, newlib linking info, libgloss versus newlib crt0.o, etc.
  # When all of that is done, all of this hokey, excessive AC_DEFINE junk for
  # crosses can be removed.

which suggests this is primarily an issue for builds done in a unified
tree, with gcc, binutils, newlib etc. all thrown in at once, i.e. for
embedded targets.

ISTM that one should just be able to actually *do* a link test of an
empty main, see if it works and decide from there if link tests are
possible or not.

Me only very rarely doing crosses at all, and then mostly only building
cc1/cc1plus, am certainly not a good person to try this, though ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [Patch, fortran] PR35339 Optimize implied do loops in io statements

2017-06-01 Thread Dominique d'Humières


> On 1 June 2017, at 16:19, Dominique d'Humières  wrote:
> 
> I see
> 
> FAIL: gfortran.dg/deferred_character_2.f90   -O1  execution test
> FAIL: gfortran.dg/deferred_character_2.f90   -O2  execution test
> FAIL: gfortran.dg/deferred_character_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL: gfortran.dg/deferred_character_2.f90   -O3 -g  execution test
> FAIL: gfortran.dg/deferred_character_2.f90   -Os  execution test
> 
> Dominique

Reduced test

PROGRAM hello

IMPLICIT NONE

CHARACTER(LEN=:),DIMENSION(:),ALLOCATABLE :: array_lineas
CHARACTER(LEN=:),DIMENSION(:),ALLOCATABLE :: array_copia
character (3), dimension (2) :: array_fijo = ["abc","def"]
character (100) :: buffer
INTEGER :: largo , cant_lineas , i

write (buffer, "(2a3)") array_fijo

largo = LEN (array_fijo)

cant_lineas = size (array_fijo, 1)

ALLOCATE(CHARACTER(LEN=largo) :: array_lineas(cant_lineas))

READ(buffer,"(2a3)") (array_lineas(i),i=1,cant_lineas)

print *, array_lineas
print *, array_fijo
 if (any (array_lineas .ne. array_fijo)) call abort

END PROGRAM

Dominique

Re: [Patch] Forward triviality in variant

2017-06-01 Thread Jonathan Wakely


On 30/05/17 02:16 -0700, Tim Shen via libstdc++ wrote:

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index b9824a5182c..f81b815af09 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -290,6 +290,53 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __ref_cast<_Tp>(__t));
}

+  template
+struct _Traits
+{
+  static constexpr bool is_default_constructible_v =
+  is_default_constructible_v::type>;
+  static constexpr bool is_copy_constructible_v =
+  __and_...>::value;
+  static constexpr bool is_move_constructible_v =
+  __and_...>::value;
+  static constexpr bool is_copy_assignable_v =
+  is_copy_constructible_v && is_move_constructible_v
+  && __and_...>::value;
+  static constexpr bool is_move_assignable_v =
+  is_move_constructible_v
+  && __and_...>::value;


It seems strange to me that these ones end with _v but the following
ones don't. Could we make them all have no _v suffix?


+  static constexpr bool is_dtor_trivial =
+  __and_...>::value;
+  static constexpr bool is_copy_ctor_trivial =
+  __and_...>::value;
+  static constexpr bool is_move_ctor_trivial =
+  __and_...>::value;
+  static constexpr bool is_copy_assign_trivial =
+  is_dtor_trivial
+  && is_copy_ctor_trivial
+  && __and_...>::value;
+  static constexpr bool is_move_assign_trivial =
+  is_dtor_trivial
+  && is_move_ctor_trivial
+  && __and_...>::value;
+
+  static constexpr bool is_default_ctor_noexcept =
+  is_nothrow_default_constructible_v<
+  typename _Nth_type<0, _Types...>::type>;
+  static constexpr bool is_copy_ctor_noexcept =
+  is_copy_ctor_trivial;
+  static constexpr bool is_move_ctor_noexcept =
+  is_move_ctor_trivial
+  || __and_...>::value;
+  static constexpr bool is_copy_assign_noexcept =
+  is_copy_assign_trivial;
+  static constexpr bool is_move_assign_noexcept =
+  is_move_assign_trivial ||
+  (is_move_ctor_noexcept
+   && __and_...>::value);
+};


Does using __and_ for any of those traits reduce the limit on the
number of alternatives in a variant? We switched to using fold
expressions in some contexts to avoid very deep instantiations. I
don't know whether these will hit the same problem, but it looks
like they will.
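
To make the concern concrete, here is a small sketch contrasting the two
styles. `and_rec` is a hypothetical recursive conjunction written for this
example (it is not libstdc++'s `__and_`), and `all_copy_rec` and
`all_copy_fold` are likewise invented names. The recursive form nests one
instantiation per type in the pack, while the C++17 fold expression
evaluates the pack in a single expression.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// Recursive conjunction: instantiation depth grows with the pack size.
template<typename... B>
struct and_rec : std::true_type { };

template<typename B1, typename... Bn>
struct and_rec<B1, Bn...>
  : std::conditional_t<B1::value, and_rec<Bn...>, B1> { };

template<typename... Ts>
constexpr bool all_copy_rec
  = and_rec<std::is_copy_constructible<Ts>...>::value;

// Fold expression: no nested class template instantiations.
template<typename... Ts>
constexpr bool all_copy_fold = (std::is_copy_constructible_v<Ts> && ...);
```

Both forms give the same answer; the difference only shows up as template
instantiation depth when the pack is very long.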




  // Defines members and ctors.
  template
union _Variadic_union { };
@@ -355,6 +402,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  ~_Variant_storage()
  { _M_reset(); }

+  void*
+  _M_storage() const
+  {
+   return const_cast(static_cast(
+   std::addressof(_M_u)));
+  }
+
+  constexpr bool
+  _M_valid() const noexcept
+  {
+   return this->_M_index != __index_type(variant_npos);
+  }
+
  _Variadic_union<_Types...> _M_u;
  using __index_type = __select_index<_Types...>;
  __index_type _M_index;
@@ -374,59 +434,114 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  void _M_reset()
  { _M_index = variant_npos; }

+  void*
+  _M_storage() const
+  {
+   return const_cast(static_cast(
+   std::addressof(_M_u)));
+  }
+
+  constexpr bool
+  _M_valid() const noexcept
+  {
+   return this->_M_index != __index_type(variant_npos);
+  }
+
  _Variadic_union<_Types...> _M_u;
  using __index_type = __select_index<_Types...>;
  __index_type _M_index;
};

-  // Helps SFINAE on special member functions. Otherwise it can live in variant
-  // class.
  template
-struct _Variant_base :
-  _Variant_storage<(std::is_trivially_destructible_v<_Types> && ...),
-   _Types...>
-{
-  using _Storage =
- _Variant_storage<(std::is_trivially_destructible_v<_Types> && ...),
-   _Types...>;
+using _Variant_storage_alias =
+_Variant_storage<_Traits<_Types...>::is_dtor_trivial, _Types...>;

-  constexpr
-  _Variant_base()
-  noexcept(is_nothrow_default_constructible_v<
-variant_alternative_t<0, variant<_Types...>>>)
-  : _Variant_base(in_place_index<0>) { }
+  // The following are (Copy|Move) (ctor|assign) layers for forwarding
+  // triviality and handling non-trivial SMF behaviors.

-  _Variant_base(const _Variant_base& __rhs)
+  template
+struct _Copy_ctor_base : _Variant_storage_alias<_Types...>
+{
+  using _Base = _Variant_storage_alias<_Types...>;
+  using _Base::_Base;
+
+  _Copy_ctor_base(const _Copy_ctor_base& __rhs)
+  noexcept(_Traits<_Types...>::is_copy_ctor_noexcept)
  {
if (__rhs._M_valid())
  {
static constexpr void (*_S_vtable[])(void*, void*) =
  { &__erased_ctor<_Types&, const _Types&>... };
-   _S_vtable[__rhs._M_index](_M_storage(), __rhs._M_storage());
+
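
For readers following the copy-constructor layer in the diff above, the idea of a table of type-erased constructors indexed by the active alternative can be pictured with a small standalone sketch. This is not the libstdc++ code: `erased_copy_ctor` and `copy_alternative` are hypothetical names chosen for illustration, and the sketch assumes C++17.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper: copy-construct a T into raw storage at dst,
// reading the source object through an untyped pointer.
template<typename T>
void erased_copy_ctor(void* dst, const void* src)
{
  ::new (dst) T(*static_cast<const T*>(src));
}

// Hypothetical dispatcher: one function pointer per alternative, so the
// runtime index selects which copy constructor runs.
template<typename... Types>
void copy_alternative(std::size_t index, void* dst, const void* src)
{
  static constexpr void (*table[])(void*, const void*) =
    { &erased_copy_ctor<Types>... };
  table[index](dst, src);
}
```

The caller is responsible for passing suitably aligned storage and an index that matches the object behind `src`; the sketch does no checking of either.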

Re: [PATCH, rs6000] Fold vector shifts in GIMPLE

2017-06-01 Thread Bill Schmidt


> On Jun 1, 2017, at 2:48 AM, Richard Biener  wrote:
> 
> On Wed, May 31, 2017 at 10:01 PM, Will Schmidt
>  wrote:
>> Hi,
>> 
>> Add support for early expansion of vector shifts.  Including
>> vec_sl (shift left), vec_sr (shift right), vec_sra (shift
>> right algebraic), vec_rl (rotate left).
>> Part of this includes adding the vector shift right instructions to
>> the list of those instructions having an unsigned second argument.
>> 
>> The VSR (vector shift right) folding is a bit more complex than
>> the others. This is due to requiring arg0 be unsigned for an algebraic
>> shift before the gimple RSHIFT_EXPR assignment is built.
> 
> Jakub, do we sanitize that undefinedness of left shifts of negative values
> and/or overflow of left shift of nonnegative values?
> 
> Will, how is that defined in the intrinsics operation?  It might need similar
> treatment as the abs case.

Answering for Will -- vec_sl is defined to simply shift bits off the end to the
left and fill with zeros from the right, regardless of whether the source type
is signed or unsigned.  The result type is signed iff the source type is
signed.  So a negative value can become positive as a result of the
operation.

The same is true of vec_rl, which will naturally rotate bits regardless of 
signedness.
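
As an illustration only, the behaviour described above can be modelled for a single signed-char lane in plain C++. The function names are hypothetical, and the reduction of the shift count modulo the element width reflects my reading of the AltiVec manual rather than anything stated in this thread. The work is done on the unsigned bit pattern, so no signed overflow is involved; the final conversion back to a signed type is modular in practice (and guaranteed so from C++20).

```cpp
#include <cstdint>

// Scalar model of one signed-char lane of a vector shift left:
// bits fall off the top, zeros come in from the right.
constexpr std::int8_t shift_left_lane(std::int8_t value, unsigned count)
{
  const unsigned n = count % 8u;  // assumption: count taken modulo lane width
  const std::uint8_t bits = static_cast<std::uint8_t>(value);
  const std::uint8_t shifted = static_cast<std::uint8_t>(bits << n);
  return static_cast<std::int8_t>(shifted);
}

// Scalar model of one signed-char lane of a vector rotate left.
constexpr std::int8_t rotate_left_lane(std::int8_t value, unsigned count)
{
  const unsigned n = count % 8u;
  const unsigned bits = static_cast<std::uint8_t>(value);
  const unsigned rotated = (bits << n) | (bits >> ((8u - n) % 8u));
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(rotated));
}

// 0x81 (-127) shifted left by one is 0x02: a negative input turns positive.
static_assert(shift_left_lane(-127, 1) == 2, "sign bit is shifted out");
// 0x40 (64) shifted left by one is 0x80: a positive input turns negative.
static_assert(shift_left_lane(64, 1) == -128, "bit moves into the sign position");
// 0x81 rotated left by one is 0x03.
static_assert(rotate_left_lane(-127, 1) == 3, "top bit wraps around");
```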

Old but reliable reference:
http://www.nxp.com/assets/documents/data/en/reference-manuals/ALTIVECPIM.pdf

Bill

> 
> [I'd rather make the negative left shift case implementation defined
> given C and C++ standards
> do not agree to 100% AFAIK]
> 
> Richard.
> 
>> [gcc]
>> 
>> 2017-05-26  Will Schmidt  
>> 
>>* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
>>for early expansion of vector shifts (sl,sr,sra,rl).
>>(builtin_function_type): Add vector shift right instructions
>>to the unsigned argument list.
>> 
>> [gcc/testsuite]
>> 
>> 2017-05-26  Will Schmidt  
>> 
>>* testsuite/gcc.target/powerpc/fold-vec-shift-char.c: New.
>>* testsuite/gcc.target/powerpc/fold-vec-shift-int.c: New.
>>* testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c: New.
>>* testsuite/gcc.target/powerpc/fold-vec-shift-short.c: New.
>> 
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 8adbc06..6ee0bfd 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -17408,6 +17408,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator 
>> *gsi)
>>gsi_replace (gsi, g, true);
>>return true;
>>   }
>> +/* Flavors of vec_rotate_left . */
>> +case ALTIVEC_BUILTIN_VRLB:
>> +case ALTIVEC_BUILTIN_VRLH:
>> +case ALTIVEC_BUILTIN_VRLW:
>> +case P8V_BUILTIN_VRLD:
>> +  {
>> +   arg0 = gimple_call_arg (stmt, 0);
>> +   arg1 = gimple_call_arg (stmt, 1);
>> +   lhs = gimple_call_lhs (stmt);
>> +   gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
>> +   gimple_set_location (g, gimple_location (stmt));
>> +   gsi_replace (gsi, g, true);
>> +   return true;
>> +  }
>> +  /* Flavors of vector shift right algebraic.  vec_sra{b,h,w} -> 
>> vsra{b,h,w}. */
>> +case ALTIVEC_BUILTIN_VSRAB:
>> +case ALTIVEC_BUILTIN_VSRAH:
>> +case ALTIVEC_BUILTIN_VSRAW:
>> +case P8V_BUILTIN_VSRAD:
>> +  {
>> +   arg0 = gimple_call_arg (stmt, 0);
>> +   arg1 = gimple_call_arg (stmt, 1);
>> +   lhs = gimple_call_lhs (stmt);
>> +   gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
>> +   gimple_set_location (g, gimple_location (stmt));
>> +   gsi_replace (gsi, g, true);
>> +   return true;
>> +  }
>> +   /* Flavors of vector shift left.  builtin_altivec_vsl{b,h,w} -> 
>> vsl{b,h,w}.  */
>> +case ALTIVEC_BUILTIN_VSLB:
>> +case ALTIVEC_BUILTIN_VSLH:
>> +case ALTIVEC_BUILTIN_VSLW:
>> +case P8V_BUILTIN_VSLD:
>> +  {
>> +   arg0 = gimple_call_arg (stmt, 0);
>> +   arg1 = gimple_call_arg (stmt, 1);
>> +   lhs = gimple_call_lhs (stmt);
>> +   gimple *g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
>> +   gimple_set_location (g, gimple_location (stmt));
>> +   gsi_replace (gsi, g, true);
>> +   return true;
>> +  }
>> +/* Flavors of vector shift right. */
>> +case ALTIVEC_BUILTIN_VSRB:
>> +case ALTIVEC_BUILTIN_VSRH:
>> +case ALTIVEC_BUILTIN_VSRW:
>> +case P8V_BUILTIN_VSRD:
>> +  {
>> +   arg0 = gimple_call_arg (stmt, 0);
>> +   arg1 = gimple_call_arg (stmt, 1);
>> +   lhs = gimple_call_lhs (stmt);
>> +   gimple *g;
>> +   /* convert arg0 to unsigned */
>> +   arg0 = convert(unsigned_type_for(TREE_TYPE(arg0)),arg0);
>> +   tree arg0_uns = 
>> create_tmp_reg_or_ssa_name(unsigned_type_for(TREE_TYPE(arg0)));
>> +   g = gimple_build_assign(arg0_uns,arg0);
>> +   gimple_set_location (g, gimple_location (stmt));
>> +   gsi_insert_before (gsi, g, GSI_SAME_STMT);
>> +   /* convert lhs to unsigned and do the shift.

Re: [Patch] Forward triviality in variant

2017-06-01 Thread Ville Voutilainen

On 1 June 2017 at 18:13, Jonathan Wakely  wrote:
> On 30/05/17 02:16 -0700, Tim Shen via libstdc++ wrote:
>>
>> diff --git a/libstdc++-v3/include/std/variant
>> b/libstdc++-v3/include/std/variant
>> index b9824a5182c..f81b815af09 100644
>> --- a/libstdc++-v3/include/std/variant
>> +++ b/libstdc++-v3/include/std/variant
>> @@ -290,6 +290,53 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>   __ref_cast<_Tp>(__t));
>> }
>>
>> +  template
>> +struct _Traits
>> +{
>> +  static constexpr bool is_default_constructible_v =
>> +  is_default_constructible_v> _Types...>::type>;
>> +  static constexpr bool is_copy_constructible_v =
>> +  __and_...>::value;
>> +  static constexpr bool is_move_constructible_v =
>> +  __and_...>::value;
>> +  static constexpr bool is_copy_assignable_v =
>> +  is_copy_constructible_v && is_move_constructible_v
>> +  && __and_...>::value;
>> +  static constexpr bool is_move_assignable_v =
>> +  is_move_constructible_v
>> +  && __and_...>::value;
>
>
> It seems strange to me that these ones end with _v but the following
> ones don't. Could we make them all have no _v suffix?

Seems to me worth considering to rather make all of them have a _v suffix. :)
>
>> +  static constexpr bool is_dtor_trivial =
>> +  __and_...>::value;


They all seem to be shortcuts for something::value, so it seems to me
logical to have
them all be _v.

Re: [Patch] Forward triviality in variant

2017-06-01 Thread Jonathan Wakely


On 01/06/17 18:21 +0300, Ville Voutilainen wrote:

On 1 June 2017 at 18:13, Jonathan Wakely  wrote:

On 30/05/17 02:16 -0700, Tim Shen via libstdc++ wrote:


diff --git a/libstdc++-v3/include/std/variant
b/libstdc++-v3/include/std/variant
index b9824a5182c..f81b815af09 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -290,6 +290,53 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __ref_cast<_Tp>(__t));
}

+  template
+struct _Traits
+{
+  static constexpr bool is_default_constructible_v =
+  is_default_constructible_v::type>;
+  static constexpr bool is_copy_constructible_v =
+  __and_...>::value;
+  static constexpr bool is_move_constructible_v =
+  __and_...>::value;
+  static constexpr bool is_copy_assignable_v =
+  is_copy_constructible_v && is_move_constructible_v
+  && __and_...>::value;
+  static constexpr bool is_move_assignable_v =
+  is_move_constructible_v
+  && __and_...>::value;



It seems strange to me that these ones end with _v but the following
ones don't. Could we make them all have no _v suffix?


Seems to me worth considering to rather make all of them have a _v suffix. :)



+  static constexpr bool is_dtor_trivial =
+  __and_...>::value;



They all seem to be shortcuts for something::value, so it seems to me
logical to have
them all be _v.


The _v suffixes in the standard are there to distinguish std::foo from
std::foo_v, but we don't have that problem.

__variant::_Traits::foo is a unique name, we don't need the
suffix, it's just noise.
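
A rough sketch of the point, assuming C++17: in namespace std the class template and the variable template live side by side, so the variable needs a distinguishing suffix, whereas members of a private traits struct have no sibling to collide with. `sketch::Traits` and its member names below are hypothetical stand-ins, not the names in the patch.

```cpp
#include <string>
#include <type_traits>

namespace sketch
{
  // Hypothetical stand-in for an internal traits struct. The members are
  // already qualified by the struct name, so no suffix is needed to tell
  // them apart from anything else.
  template<typename... Types>
    struct Traits
    {
      static constexpr bool is_copy_ctor =
        (std::is_copy_constructible_v<Types> && ...);
      static constexpr bool is_dtor_trivial =
        (std::is_trivially_destructible_v<Types> && ...);
    };
}

// In std, both spellings exist at the same scope, hence the _v suffix:
static_assert(std::is_copy_constructible<int>::value
              == std::is_copy_constructible_v<int>);

static_assert(sketch::Traits<int, double>::is_dtor_trivial);
static_assert(!sketch::Traits<int, std::string>::is_dtor_trivial);
static_assert(sketch::Traits<int, std::string>::is_copy_ctor);
```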

Re: [Patch] Forward triviality in variant

2017-06-01 Thread Ville Voutilainen

On 1 June 2017 at 18:29, Jonathan Wakely  wrote:
>> They all seem to be shortcuts for something::value, so it seems to me
>> logical to have
>> them all be _v.
>
>
> The _v suffixes in the standard are there to distinguish std::foo from
> std::foo_v, but we don't have that problem.

Wouldn't necessarily hurt to follow the same naming convention idea as
the standard, but sure, we
don't have that problem, agreed.

[PING**3] [PATCH] Force use of absolute path names for gcov

2017-06-01 Thread Bernd Edlinger

Ping...

On 05/12/17 18:47, Bernd Edlinger wrote:
> Ping...
> 
> On 04/28/17 19:41, Bernd Edlinger wrote:
>> Ping...
>>
>> I attached a rebased patch file, with the doc changes and
>> merge conflicts with trunk of today fixed, but otherwise
>> identical.
>>
>>
>> Thanks
>> Bernd.
>>
>> On 04/21/17 22:26, Bernd Edlinger wrote:
>>>
>>>
>>> On 04/21/17 21:50, Joseph Myers wrote:
>>>> On Fri, 21 Apr 2017, Bernd Edlinger wrote:
>>>>
>>>>> So I would like to add a -fprofile-abs-path option that
>>>>> forces absolute path names in gcno files, which allows gcov
>>>>> to get the true canonicalized source name.
>>>>
>>>> I don't see any actual documentation of this option in the patch (you
>>>> add it to the summary list of options, and mention it in text under the
>>>> documentation of --coverage, but don't have any actual @item
>>>> -fprofile-abs-path / @opindex fprofile-abs-path paragraph with text
>>>> describing what the option does).
>>>>
>>>
>>> Ah yes, thanks.
>>>
>>> So I'll add one more sentence to invoke.texi:
>>>
>>> @@ -10696,6 +10713,12 @@
>>>  generate test coverage data.  Coverage data matches the source files
>>>  more closely if you do not optimize.
>>>
>>> +@item -fprofile-abs-path
>>> +@opindex fprofile-abs-path
>>> +Automatically convert relative source file names to absolute path names
>>> +in the @file{.gcno} files.  This allows @command{gcov} to find the
>>> correct
>>> +sources in projects with multiple directories.
>>> +
>>>  @item -fprofile-dir=@var{path}
>>>  @opindex fprofile-dir
>>>
>>>
>>>
>>>
>>> Bernd.

[PING**2][PATCH][ PR rtl-optimization/79286] Drop may_trap_p exception to testing dominance in update_equiv_regs

2017-06-01 Thread Bernd Edlinger

Ping...

On 05/12/17 18:48, Bernd Edlinger wrote:
> Ping...
> 
> On 04/29/17 09:06, Bernd Edlinger wrote:
>> On 04/28/17 20:46, Jeff Law wrote:
>>> On 04/28/2017 11:27 AM, Bernd Edlinger wrote:
>>>>
>>>> Yes I agree, that is probably not worth it.  So I could try to remove
>>>> the special handling of PIC+const and see what happens.
>>>>
>>>> However the SYMBOL_REF_FUNCTION_P is another story, that part I would
>>>> like to keep: It happens quite often, already w/o -fpic that call
>>>> statements are using SYMBOL_REFs to ordinary (not weak) function
>>>> symbols, and may_trap returns 1 for these call statements which is IMHO
>>>> wrong.
>>> Hmm, thinking more about this, wasn't the original case a PIC reference
>>> for something like &x[BIGNUM]?
>>>
>>> Perhaps we could consider a PIC reference without other arithmetic as
>>> safe.  That would likely pick up the SYMBOL_REF_FUNCTION_P case you want
>>> as well as a good many more PIC references as non-trapping.
>>>
>>
>> Yes, I like this idea.
>>
>> I tried to compile openssl with -m32 -fpic as an example, and counted
>> how often the mem[pic+const] is hit: that was 2353 times, all kinds of
>> object refs.
>>
>> Then I tried your idea, and only 54 unhandled pic refs remained, all of
>> them looking like this:
>>
>> (plus:SI (reg:SI 107)
>> (const:SI (plus:SI (unspec:SI [
>> (symbol_ref:SI ("bf_init") [flags 0x2] > 0x2ac00f7bac60 bf_init>)
>> ] UNSPEC_GOTOFF)
>> (const_int 4164 [0x1044]
>>
>> I believe that is negligible fallout from such a big code base.
>>
>> Although the pic references no longer reach the
>> SYMBOL_REF_FUNCTION_P in this version of the patch, I still see
>> that happening without -fpic option, so I left it as is.
>>
>>
>> Attached is the new version of my patch.
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.

[PING**4] [PATCH, ARM] correctly encode the CC reg data flow

2017-06-01 Thread Bernd Edlinger

Ping...

On 05/12/17 18:49, Bernd Edlinger wrote:
> Ping...
> 
> On 04/29/17 19:21, Bernd Edlinger wrote:
>> Ping...
>>
>> On 04/20/17 20:11, Bernd Edlinger wrote:
>>> Ping...
>>>
>>> for this patch:
>>> https://gcc.gnu.org/ml/gcc-patches/2017-01/msg01351.html
>>>
>>> On 01/18/17 16:36, Bernd Edlinger wrote:
 On 01/13/17 19:28, Bernd Edlinger wrote:
> On 01/13/17 17:10, Bernd Edlinger wrote:
>> On 01/13/17 14:50, Richard Earnshaw (lists) wrote:
>>> On 18/12/16 12:58, Bernd Edlinger wrote:
 Hi,

 this is related to PR77308, the follow-up patch will depend on this
 one.

 When trying to split the *arm_cmpdi_insn and *arm_cmpdi_unsigned
 before reload, a mis-compilation in libgcc function
 __gnu_satfractdasq
 was discovered, see [1] for more details.

 The reason seems to be that when the *arm_cmpdi_insn is directly
 followed by a *arm_cmpdi_unsigned instruction, both are split
 up into this:

[(set (reg:CC CC_REGNUM)
  (compare:CC (match_dup 0) (match_dup 1)))
 (parallel [(set (reg:CC CC_REGNUM)
 (compare:CC (match_dup 3) (match_dup 4)))
(set (match_dup 2)
 (minus:SI (match_dup 5)
  (ltu:SI (reg:CC_C CC_REGNUM) 
 (const_int
 0])]

[(set (reg:CC CC_REGNUM)
  (compare:CC (match_dup 2) (match_dup 3)))
 (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0))
(set (reg:CC CC_REGNUM)
 (compare:CC (match_dup 0) (match_dup 1]

 The problem is that the reg:CC from the *subsi3_carryin_compare
 is not mentioning that the reg:CC is also dependent on the reg:CC
 from before.  Therefore the *arm_cmpsi_insn appears to be
 redundant and thus got removed, because the data values are
 identical.

 I think that applies to a number of similar pattern where data
 flow is happening through the CC reg.

 So this is a kind of correctness issue, and should be fixed
 independently from the optimization issue PR77308.

 Therefore I think the patterns need to specify the true
 value that will be in the CC reg, in order for cse to
 know what the instructions are really doing.
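
To make the dependency concrete, here is a scalar C++ sketch (not RTL, and not part of the patch) of a 64-bit subtraction built from two 32-bit steps. The names are hypothetical. The high-word step consumes the borrow produced by the low-word step, which is the same kind of data flow through the condition-code register that the patterns need to expose. Note that the sketch tracks a borrow directly, whereas the ARM carry flag after a subtract is the inverse of the borrow.

```cpp
#include <cstdint>

struct SubResult
{
  std::uint32_t value;  // low 32 bits of the difference
  bool borrow;          // true when the subtraction wrapped around
};

// One 32-bit subtract step with an incoming borrow.
constexpr SubResult sub_with_borrow(std::uint32_t a, std::uint32_t b,
                                    bool borrow_in)
{
  const std::uint64_t wide = std::uint64_t{a} - b - (borrow_in ? 1u : 0u);
  return { static_cast<std::uint32_t>(wide), (wide >> 32) != 0 };
}

// Correct version: the high step depends on the low step's borrow.
constexpr std::uint64_t sub64(std::uint64_t x, std::uint64_t y)
{
  const SubResult lo = sub_with_borrow(static_cast<std::uint32_t>(x),
                                       static_cast<std::uint32_t>(y), false);
  const SubResult hi = sub_with_borrow(static_cast<std::uint32_t>(x >> 32),
                                       static_cast<std::uint32_t>(y >> 32),
                                       lo.borrow);
  return (std::uint64_t{hi.value} << 32) | lo.value;
}

// Broken version: treats the two steps as independent.
constexpr std::uint64_t sub64_dropping_borrow(std::uint64_t x, std::uint64_t y)
{
  const SubResult lo = sub_with_borrow(static_cast<std::uint32_t>(x),
                                       static_cast<std::uint32_t>(y), false);
  const SubResult hi = sub_with_borrow(static_cast<std::uint32_t>(x >> 32),
                                       static_cast<std::uint32_t>(y >> 32),
                                       false);
  return (std::uint64_t{hi.value} << 32) | lo.value;
}

static_assert(sub64(0x1'0000'0000, 1) == 0xFFFF'FFFF, "borrow propagated");
static_assert(sub64_dropping_borrow(0x1'0000'0000, 1) == 0x1'FFFF'FFFF,
              "wrong result when the dependency is ignored");
```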


 Bootstrapped and reg-tested on arm-linux-gnueabihf.
 Is it OK for trunk?

>>>
>>> I agree you've found a valid problem here, but I have some issues
>>> with
>>> the patch itself.
>>>
>>>
>>> (define_insn_and_split "subdi3_compare1"
>>>   [(set (reg:CC_NCV CC_REGNUM)
>>> (compare:CC_NCV
>>>   (match_operand:DI 1 "register_operand" "r")
>>>   (match_operand:DI 2 "register_operand" "r")))
>>>(set (match_operand:DI 0 "register_operand" "=&r")
>>> (minus:DI (match_dup 1) (match_dup 2)))]
>>>   "TARGET_32BIT"
>>>   "#"
>>>   "&& reload_completed"
>>>   [(parallel [(set (reg:CC CC_REGNUM)
>>>(compare:CC (match_dup 1) (match_dup 2)))
>>>   (set (match_dup 0) (minus:SI (match_dup 1) (match_dup
>>> 2)))])
>>>(parallel [(set (reg:CC_C CC_REGNUM)
>>>(compare:CC_C
>>>  (zero_extend:DI (match_dup 4))
>>>  (plus:DI (zero_extend:DI (match_dup 5))
>>>   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
>>>   (set (match_dup 3)
>>>(minus:SI (minus:SI (match_dup 4) (match_dup 5))
>>>  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0])]
>>>
>>>
>>> This pattern is now no-longer self consistent in that before the
>>> split
>>> the overall result for the condition register is in mode CC_NCV, but
>>> afterwards it is just CC_C.
>>>
>>> I think CC_NCV is correct mode (the N, C and V bits all correctly
>>> reflect the result of the 64-bit comparison), but that then implies
>>> that
>>> the cc mode of subsi3_carryin_compare is incorrect as well and
>>> should in
>>> fact also be CC_NCV.  Thinking about this pattern, I'm inclined to
>>> agree
>>> that CC_NCV is the correct mode for this operation
>>>
>>> I'm not sure if there are other consequences that will fall out from
>>> fixing this (it's possible that we might need a change to
>>> select_cc_mode
>>> as well).
>>>
>>
>> Yes, this is still a bit awkward...
>>
>> The N and V bit will be the correct result for the subdi3_compare1
>> a 64-bit comparison, but zero_extend:DI (match_dup 4) (plus:DI ...)
>> only gets the C bit correct, the expression for N and V is a 
>> different
>> one.
>>
>> It probably works, because the

[PING**4] [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2017-06-01 Thread Bernd Edlinger

Ping...

On 05/12/17 18:49, Bernd Edlinger wrote:
> Ping...
> 
> On 04/29/17 19:45, Bernd Edlinger wrote:
>> Ping...
>>
>> I attached a rebased version since there was a merge conflict in
>> the xordi3 pattern, otherwise the patch is still identical.
>> It splits adddi3, subdi3, anddi3, iordi3, xordi3 and one_cmpldi2
>> early when the target has no neon or iwmmxt.
>>
>>
>> Thanks
>> Bernd.
>>
>>
>>
>> On 11/28/16 20:42, Bernd Edlinger wrote:
>>> On 11/25/16 12:30, Ramana Radhakrishnan wrote:
 On Sun, Nov 6, 2016 at 2:18 PM, Bernd Edlinger
  wrote:
> Hi!
>
> This improves the stack usage on the sha512 test case for the case
> without hardware fpu and without iwmmxt by splitting all di-mode
> patterns right while expanding which is similar to what the
> shift-pattern
> does.  It does nothing in the case iwmmxt and fpu=neon or vfp as
> well as
> thumb1.
>

 I would go further and do this in the absence of Neon, the VFP unit
 being there doesn't help with DImode operations i.e. we do not have 64
 bit integer arithmetic instructions without Neon. The main reason why
 we have the DImode patterns split so late is to give a chance for
 folks who want to do 64 bit arithmetic in Neon a chance to make this
 work as well as support some of the 64 bit Neon intrinsics which IIRC
 map down to these instructions. Doing this just for soft-float doesn't
 improve the default case only. I don't usually test iwmmxt and I'm not
 sure who has the ability to do so, thus keeping this restriction for
 iwMMX is fine.


>>>
>>> Yes I understand, thanks for pointing that out.
>>>
>>> I was not aware that iwmmxt exists at all, but I noticed that most
>>> 64bit expansions work completely differently, and would break if we split
>>> the pattern early.
>>>
>>> I can however only look at the assembler output for iwmmxt, and make
>>> sure that the stack usage does not get worse.
>>>
>>> Thus the new version of the patch keeps only thumb1, neon and iwmmxt as
>>> it is: around 1570 (thumb1), 2300 (neon) and 2200 (iwmmxt) bytes stack
>>> for the test cases, and vfp and soft-float at around 270 bytes stack
>>> usage.
>>>
> It reduces the stack usage from 2300 to near optimal 272 bytes (!).
>
> Note this also splits many ldrd/strd instructions and therefore I will
> post a followup-patch that mitigates this effect by enabling the
> ldrd/strd
> peephole optimization after the necessary reg-testing.
>
>
> Bootstrapped and reg-tested on arm-linux-gnueabihf.

 What do you mean by arm-linux-gnueabihf - when folks say that I
 interpret it as --with-arch=armv7-a --with-float=hard
 --with-fpu=vfpv3-d16 or (--with-fpu=neon).

 If you've really bootstrapped and regtested it on armhf, doesn't this
 patch as it stands have no effect there, i.e. no change?
 arm-linux-gnueabihf usually means to me someone has configured with
 --with-float=hard, so there are no regressions in the hard float ABI
 case,

>>>
>>> I know it proves little.  When I say arm-linux-gnueabihf
>>> I do in fact mean --enable-languages=all,ada,go,obj-c++
>>> --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16
>>> --with-float=hard.
>>>
>>> My main interest in the stack usage is of course not because of linux,
>>> but because of eCos where we have very small task stacks and in fact
>>> no fpu support by the O/S at all, so that patch is exactly what we need.
>>>
>>>
>>> Bootstrapped and reg-tested on arm-linux-gnueabihf
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.

[PING**3] [PATCH, ARM] Further improve stack usage in sha512, part 2 (PR 77308)

2017-06-01 Thread Bernd Edlinger

Ping...

On 05/12/17 18:50, Bernd Edlinger wrote:
> Ping...
> 
> On 04/29/17 19:52, Bernd Edlinger wrote:
>> Ping...
>>
>> I attached the latest version of my patch.
>>
>>
>> Thanks
>> Bernd.
>>
>> On 12/18/16 14:14, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this splits the *arm_negdi2, *arm_cmpdi_insn and *arm_cmpdi_unsigned
>>> also at split1 except for TARGET_NEON and TARGET_IWMMXT.
>>>
>>> In the new test case the stack is reduced to about 270 bytes, except
>>> for neon and iwmmxt, where this does not change anything.
>>>
>>> This patch depends on [1] and [2] before it can be applied.
>>>
>>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
>>>
>>>
>>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02796.html
>>> [2] https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01562.html

[arm-embedded] [PATCH, GCC, ARM/embedded-6/7-branch] Set mode for success result of atomic compare and swap

2017-06-01 Thread Thomas Preudhomme


Hi,

We have decided to apply the following patch to the embedded-6-branch and 
embedded-7-branch to fix a genrecog warning when processing sync.md.


ChangeLog entry is as follows:

2017-05-03  Thomas Preud'homme  

Backport from mainline
2017-05-03  Thomas Preud'homme  

gcc/
* config/arm/iterators.md (CCSI): New mode iterator.
(arch): New mode attribute.
* config/arm/sync.md (atomic_compare_and_swap_1): Rename into ...
(atomic_compare_and_swap_1): This and ...
(atomic_compare_and_swap_1): This.  Use CCSI
code iterator for success result mode.
* config/arm/arm.c (arm_expand_compare_and_swap): Adapt code to use
the corresponding new insn generators.

Best regards,

Thomas

On 03/05/17 10:40, Kyrill Tkachov wrote:

Hi Thomas,

On 03/05/17 10:39, Thomas Preudhomme wrote:

Hi Kyrill,

On 19/04/17 14:34, Kyrill Tkachov wrote:

Hi Thomas,

On 12/04/17 09:59, Thomas Preudhomme wrote:

Hi,

Currently the atomic_compare_and_swap_1 define_insn patterns do not have a
mode set for the destination of the set indicating the success result of the
instruction. This is because the operand can be either a CC_Z register
(for 32-bit targets) or an SI register (for 16-bit Thumb targets). This
results in a lack of checking for the mode.

This commit uses a new CCSI iterator to solve this issue while avoiding
duplication of the patterns. The insn names are kept unique by using
attributes tied to the iterator (SIDI:mode and CCSI:arch) instead of
using the builtin mode attribute. The expander arm_expand_compare_and_swap
is also adapted accordingly.
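
For orientation only: at the source level, the "success result" discussed above corresponds, as far as I understand it, to the boolean returned by a compare-and-exchange operation. The helper name below is hypothetical, and the sketch says nothing about which insn pattern a given target selects.

```cpp
#include <atomic>

// Hypothetical helper: attempt to replace `expected` with `desired`.
// The returned bool is the success flag of the compare-and-swap.
inline bool try_update(std::atomic<int>& cell, int expected, int desired)
{
  return cell.compare_exchange_strong(expected, desired);
}
```

A caller would typically branch on the returned flag, which is why its mode (condition-code register versus general register) matters to the backend.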

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-04-11  Thomas Preud'homme 

* config/arm/iterators.md (CCSI): New mode iterator.
(arch): New mode attribute.
* config/arm/sync.md (atomic_compare_and_swap_1): Rename into ...
(atomic_compare_and_swap_1): This and ...
(atomic_compare_and_swap_1): This.  Use CCSI
code iterator for success result mode.
* config/arm/arm.c (arm_expand_compare_and_swap): Adapt code to use
the corresponding new insn generators.

Testing: arm-none-eabi cross-compiler built successfully for ARMv8-M
Mainline and Baseline without the lack of destination mode warning in
sync.md. The testsuite shows no regressions.



Thanks for fixing these warnings.
The code looks ok to me but
I'd like to make sure that the rest of the arm atomic targets are not adversely
affected,
so please also do a test run for ARMv7-A and ARMv8-A targets.
Also, a bootstrap is required as always.


Hi Kyrill,

Bootstrapped and ran the testsuite for both ARMv7-A and ARMv8-A in both ARM
and Thumb mode without any regression. I've also verified that a number of
atomic related testcases [1][2] get the same code generation for ARMv7-A in
ARM and Thumb mode as well as ARMv8-M Baseline.

[1] For ARMv7-A ARM and Thumb mode, the following testcases were considered:

gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
gcc/testsuite/gcc.dg/atomic-exchange-1.c
gcc/testsuite/gcc.dg/atomic-exchange-2.c
gcc/testsuite/gcc.dg/atomic-exchange-3.c
gcc/testsuite/gcc.dg/atomic-fence.c
gcc/testsuite/gcc.dg/atomic-flag.c
gcc/testsuite/gcc.dg/atomic-generic.c
gcc/testsuite/gcc.dg/atomic-generic-aux.c
gcc/testsuite/gcc.dg/atomic-invalid-2.c
gcc/testsuite/gcc.dg/atomic-load-1.c
gcc/testsuite/gcc.dg/atomic-load-2.c
gcc/testsuite/gcc.dg/atomic-load-3.c
gcc/testsuite/gcc.dg/atomic-lockfree.c
gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
gcc/testsuite/gcc.dg/atomic-noinline.c
gcc/testsuite/gcc.dg/atomic-noinline-aux.c
gcc/testsuite/gcc.dg/atomic-op-1.c
gcc/testsuite/gcc.dg/atomic-op-2.c
gcc/testsuite/gcc.dg/atomic-op-3.c
gcc/testsuite/gcc.dg/atomic-op-6.c
gcc/testsuite/gcc.dg/atomic-store-1.c
gcc/testsuite/gcc.dg/atomic-store-2.c
gcc/testsuite/gcc.dg/atomic-store-3.c
gcc/testsuite/g++.dg/ext/atomic-1.C
gcc/testsuite/g++.dg/ext/atomic-2.C
gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire-1.c
gcc/testsuite/gcc.target/arm/atomic-op-acq_rel-1.c
gcc/testsuite/gcc.target/arm/atomic-op-acquire-1.c
gcc/testsuite/gcc.target/arm/atomic-op-char-1.c
gcc/testsuite/gcc.target/arm/atomic-op-consume-1.c
gcc/testsuite/gcc.target/arm/atomic-op-int-1.c
gcc/testsuite/gcc.target/arm/atomic-op-relaxed-1.c
gcc/testsuite/gcc.target/arm/atomic-op-release-1.c
gcc/testsuite/gcc.target/arm/atomic-op-seq_cst-1.c
gcc/testsuite/gcc.target/arm/atomic-op-short-1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
gcc/testsuite/gcc.target/arm/sync-1.c
gcc/testsuite/gcc.target/arm/synchronize.c
gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
libstdc++-v3/testsu

[PING**2] [PATCH] Implement a warning for bogus sizeof(pointer) / sizeof(pointer[0])

2017-06-01 Thread Bernd Edlinger

Ping...

On 05/12/17 18:55, Bernd Edlinger wrote:
> Ping for the C changes.
> 
> Thanks
> Bernd.
> 
> On 05/03/17 15:14, Jason Merrill wrote:
>> On Tue, May 2, 2017 at 9:26 AM, Bernd Edlinger
>>  wrote:
>>> On 05/01/17 17:54, Jason Merrill wrote:
 On Fri, Apr 28, 2017 at 1:05 PM, Bernd Edlinger
  wrote:
> On 04/28/17 17:29, Martin Sebor wrote:
>> On 04/28/2017 08:12 AM, Bernd Edlinger wrote:
>>>
>>> Do you want me to change the %qT format strings to %T ?
>>
>> Yes, with the surrounding %< and %> the nested directives should
>> use the unquoted forms, otherwise the printer would end up quoting
>> both the whole expression and the type operand.
>>
>> FWIW, to help avoid this mistake, I think this might be something
>> for GCC -Wformat to warn on and the pretty-printer to detect (and
>> ICE on).
>>
>
> Ah, now I understand.  That's pretty advanced.
>
> Here is the modified patch with correct quoting of the expression.
>
> Bootstrap and reg-testing on x86_64-pc-linux-gnu.

> * cp-gimplify.c (cp_fold): Implement the -Wsizeof_pointer_div warning.

 I think this warning belongs in cp_build_binary_op rather than cp_fold.

>>>
>>> Done, as suggested.
>>
>> The pattern in that function is to treat all *_DIV_EXPR the same; I
>> don't think we need to break that pattern with this patch.  So please
>> move the new code after the other DIV case labels.  With that the C++
>> changes are OK.
>>
>> Jason
>>
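
For context, a minimal sketch of the mistake this warning is aimed at, assuming C++17. The function names are hypothetical. Once an array has decayed to a pointer, dividing sizeof the pointer by sizeof the element no longer yields the element count; a template that binds to the array by reference keeps the length.

```cpp
#include <cstddef>

// Safe: the array is passed by reference, so its length N is preserved.
template<typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N])
{
  return N;
}

// Bogus: p is a pointer, so this computes pointer size / element size.
// This is the pattern the proposed warning is meant to flag, and a
// compiler that implements the warning may diagnose this line.
inline std::size_t bogus_length(const int* p)
{
  return sizeof(p) / sizeof(p[0]);
}
```

With `int data[5]`, `array_length(data)` is 5, while `bogus_length(data)` depends only on the platform's pointer and int sizes.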
> 
> 

Re: [Patch] Forward triviality in variant

2017-06-01 Thread Jonathan Wakely


On 01/06/17 18:43 +0300, Ville Voutilainen wrote:

On 1 June 2017 at 18:29, Jonathan Wakely  wrote:

They all seem to be shortcuts for something::value, so it seems to me
logical to have
them all be _v.



The _v suffixes in the standard are there to distinguish std::foo from
std::foo_v, but we don't have that problem.


Wouldn't necessarily hurt to follow the same naming convention idea as
the standard, but sure, we
don't have that problem, agreed.


It's not consistent in the standard:

- numeric_limits::is_specialized
- std::chrono::system_clock::is_steady
- std::atomic::is_always_lock_free

And that's OK, because it would be a silly rule that said all boolean
constants should end in _v, it would just be noise.
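
The three members named above can be put next to a `_v` variable template in a short C++17 sketch. `NamingSurvey` is a hypothetical name; the values of `is_steady` and `is_always_lock_free` are implementation-defined, so only the other two are asserted.

```cpp
#include <atomic>
#include <chrono>
#include <limits>
#include <type_traits>

// Hypothetical collection of boolean constants from the standard library.
// Three are class members without a suffix; one is a _v variable template.
struct NamingSurvey
{
  static constexpr bool limits_specialized =
    std::numeric_limits<int>::is_specialized;
  static constexpr bool clock_steady =
    std::chrono::system_clock::is_steady;
  static constexpr bool atomic_lock_free =
    std::atomic<int>::is_always_lock_free;
  static constexpr bool integral =
    std::is_integral_v<int>;
};

static_assert(NamingSurvey::limits_specialized);
static_assert(NamingSurvey::integral);
```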

[PATCH, GCC, ARM/embedded-6-branch] Add mode to probe_stack set operands

2017-06-01 Thread Thomas Preudhomme


Hi,

We have decided to apply the following patch to the embedded-6-branch to fix a 
genrecog warning when processing arm.md.


2017-06-01  Thomas Preud'homme  

Backport from gcc-7-branch
2016-05-09  Kyrylo Tkachov  

* config/arm/arm.md (probe_stack): Add modes to set source
and destination.

Best regards,

Thomas
diff --git a/gcc/ChangeLog.arm b/gcc/ChangeLog.arm
index 8cde4f43ee65184c316ab4c7e5b78c5bb0c6e7bb..863cf5483bb8ad086573caf59aed7b095b9b6c09 100644
--- a/gcc/ChangeLog.arm
+++ b/gcc/ChangeLog.arm
@@ -1,5 +1,13 @@
 2017-06-01  Thomas Preud'homme  
 
+	Backport from gcc-7-branch
+	2016-05-09  Kyrylo Tkachov  
+
+	* config/arm/arm.md (probe_stack): Add modes to set source
+	and destination.
+
+2017-06-01  Thomas Preud'homme  
+
 	Backport from mainline
 	2017-05-03  Thomas Preud'homme  
 
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 7f6914884b080f6c98d88f134c1379818882a05b..37ef1b3a4ad8d3ae8310a0da2d702499766a6828 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -8220,8 +8220,8 @@
 )
 
 (define_insn "probe_stack"
-  [(set (match_operand 0 "memory_operand" "=m")
-(unspec [(const_int 0)] UNSPEC_PROBE_STACK))]
+  [(set (match_operand:SI 0 "memory_operand" "=m")
+(unspec:SI [(const_int 0)] UNSPEC_PROBE_STACK))]
   "TARGET_32BIT"
   "str%?\\tr0, %0"
   [(set_attr "type" "store1")

[PING][PATCH] [i386] Recompute the frame layout less often

2017-06-01 Thread Bernd Edlinger

Ping...

the latest version of this patch was posted here:
https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01758.html


Thanks
Bernd.

On 05/23/17 16:31, Bernd Edlinger wrote:
> Hi,
> 
> this is the latest version of my patch.
> 
> As already said, it attempts to compute
> the frame layout only when relevant data have
> changed.
> 
> Apologies for doing more clean-up on Daniel's
> patch than absolutely necessary, but ...
> 
> Bootstrap and reg-tested successfully on
> x86_64-pc-linux-gnu with unix\{,-m32\}.
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.

Re: [Patch] Forward triviality in variant

2017-06-01 Thread Ville Voutilainen

On 1 June 2017 at 19:03, Jonathan Wakely  wrote:
> On 01/06/17 18:43 +0300, Ville Voutilainen wrote:
>>
>> On 1 June 2017 at 18:29, Jonathan Wakely  wrote:

>>>> They all seem to be shortcuts for something::value, so it seems to me
>>>> logical to have
>>>> them all be _v.
>>> The _v suffixes in the standard are there to distinguish std::foo from
>>> std::foo_v, but we don't have that problem.
>> Wouldn't necessarily hurt to follow the same naming convention idea as
>> the standard, but sure, we
>> don't have that problem, agreed.
> It's not consistent in the standard:
> - numeric_limits::is_specialized
> - std::chrono::system_clock::is_steady
> - std::atomic::is_always_lock_free
>
> And that's OK, because it would be a silly rule that said all boolean
> constants should end in _v, it would just be noise.


But I didn't suggest such a rule, merely that if we are dealing with a
trait-like variable
that shortcuts a ::value, then we could entertain using _v.

Re: [Patch] Forward triviality in variant

2017-06-01 Thread Jonathan Wakely


On 01/06/17 19:07 +0300, Ville Voutilainen wrote:
> On 1 June 2017 at 19:03, Jonathan Wakely wrote:
>> On 01/06/17 18:43 +0300, Ville Voutilainen wrote:
>>> On 1 June 2017 at 18:29, Jonathan Wakely wrote:
>>>>> They all seem to be shortcuts for something::value, so it seems to me
>>>>> logical to have
>>>>> them all be _v.
>>>>
>>>> The _v suffixes in the standard are there to distinguish std::foo from
>>>> std::foo_v, but we don't have that problem.
>>>
>>> Wouldn't necessarily hurt to follow the same naming convention idea as
>>> the standard, but sure, we
>>> don't have that problem, agreed.
>>
>> It's not consistent in the standard:
>> - numeric_limits::is_specialized
>> - std::chrono::system_clock::is_steady
>> - std::atomic::is_always_lock_free
>>
>> And that's OK, because it would be a silly rule that said all boolean
>> constants should end in _v, it would just be noise.
>
> But I didn't suggest such a rule, merely that if we are dealing with a
> trait-like variable
> that shortcuts a ::value, then we could entertain using _v.


The trait describes properties of the variant. The fact those
properties are determined by something::value is an implementation
detail, not an important feature that needs to be in the name.

The implementation details should not leak into the public API of the
trait.

C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)

2017-06-01 Thread Marek Polacek

A motivating example for this warning can be found e.g. in

  PRE10-C. Wrap multistatement macros in a do-while loop
  https://www.securecoding.cert.org/confluence/x/jgL7

i.e., 

#define SWAP(x, y) \
  tmp = x; \
  x = y; \
  y = tmp

used like this [1]

int x, y, z, tmp;
if (z == 0)
  SWAP(x, y);

expands to the following [2], which is certainly not what the programmer 
intended:

int x, y, z, tmp;
if (z == 0)
  tmp = x;
x = y;
y = tmp;

This has also happened in our codebase, see PR80063.

I tried to summarize the way I approached this problem in the commentary in
warn_for_multiline_expansion, but I'll try to explain the crux of the matter
here, too.

For code like [1], in the FEs we'll see [2], of course.  When parsing the
then-branch we see that the body of the if isn't wrapped in { } so we create a
compound statement with just the first statement "tmp = x;", and the other two
will be executed unconditionally.

My idea was to look at the location info of the following token after the body
of the if has been parsed and determine if they come from the same macro 
expansion,
and if they do (and the if itself doesn't), warn (taking into account various
corner cases, as usual).
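
As a rough illustration of that check (not the code from the patch), the
decision can be sketched in C as follows.  The tok_loc struct and its
expansion_id field are hypothetical stand-ins for what the front ends
would have to derive from the line maps; 0 is assumed to mean "not from
a macro expansion".

```c
#include <stdbool.h>

/* Hypothetical stand-in for a token location: expansion_id is 0 when
   the token was written directly in the source, and otherwise names
   the macro expansion that produced it.  */
struct tok_loc
{
  unsigned expansion_id;
};

/* Return true if the token following an unbraced if/else body looks
   like it was meant to belong to that body.  */
bool
should_warn (struct tok_loc body, struct tok_loc next, struct tok_loc if_loc)
{
  /* Only macros are interesting.  */
  if (body.expansion_id == 0 || next.expansion_id == 0)
    return false;

  /* The body and the following token must come from the same
     expansion...  */
  if (body.expansion_id != next.expansion_id)
    return false;

  /* ...and the if itself must not, otherwise the whole construct sits
     inside the macro and nothing surprising is going on.  */
  return if_loc.expansion_id != body.expansion_id;
}
```

The real warning additionally has to deal with system headers and other
corner cases, which this sketch leaves out on purpose.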

For this I had to dive into line_maps, macro maps, etc., so CCing David to check
if my understanding of that is reasonable (hadn't worked with them before).

I've included this warning in -Wall, because there should be no false positives
(fingers crossed) and for most cases the warning should be pretty cheap.

I probably should've added a fix-it hint for good measure, too ("you better wrap
the damn macro in do {} while (0)"), but that can be done as a follow-up.
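
For reference, the do {} while (0) wrapping mentioned above would look
roughly like this for the SWAP example.  This is only a sketch;
swap_if_zero is a made-up helper to show the macro used as an unbraced
if body, and, as in the original example, the caller has to provide tmp.

```c
/* SWAP wrapped so that it expands to exactly one statement.  As in the
   original example, the caller must provide a variable named tmp.  */
#define SWAP(x, y) \
  do               \
    {              \
      tmp = (x);   \
      (x) = (y);   \
      (y) = tmp;   \
    }              \
  while (0)

/* Swap *a and *b only when cond is zero; otherwise leave both alone.
   With the unwrapped macro, the last two assignments would run
   unconditionally.  */
void
swap_if_zero (int cond, int *a, int *b)
{
  int tmp;
  if (cond == 0)
    SWAP (*a, *b);
}
```

Because the expansion is a single statement that still needs its
trailing semicolon, it also behaves correctly when an else follows.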

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-06-01  Marek Polacek  

PR c/80116
* c-common.h (warn_for_multiline_expansion): Declare.
* c-warn.c (warn_for_multiline_expansion): New function.
* c.opt (Wmultiline-expansion): New option.

* c-parser.c (c_parser_if_body): Set the location of the
body of the conditional after parsing all the labels.  Call
warn_for_multiline_expansion.
(c_parser_else_body): Likewise.

* parser.c (cp_parser_statement): Add a default argument.  Save the
location of the expression-statement after labels have been parsed.
(cp_parser_implicitly_scoped_statement): Set the location of the
body of the conditional after parsing all the labels.  Call
warn_for_multiline_expansion.

* doc/invoke.texi: Document -Wmultiline-expansion.

* c-c++-common/Wmultiline-expansion-1.c: New test.
* c-c++-common/Wmultiline-expansion-2.c: New test.
* c-c++-common/Wmultiline-expansion-3.c: New test.
* c-c++-common/Wmultiline-expansion-4.c: New test.
* c-c++-common/Wmultiline-expansion-5.c: New test.
* c-c++-common/Wmultiline-expansion-6.c: New test.

diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 79072e6..6efbebc 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -1539,6 +1539,7 @@ extern bool maybe_warn_shift_overflow (location_t, tree, tree);
 extern void warn_duplicated_cond_add_or_warn (location_t, tree, vec **);
 extern bool diagnose_mismatched_attributes (tree, tree);
 extern tree do_warn_duplicated_branches_r (tree *, int *, void *);
+extern void warn_for_multiline_expansion (location_t, location_t, location_t);
 
 /* In c-attribs.c.  */
 extern bool attribute_takes_identifier_p (const_tree);
diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
index 012675b..16c6fc3 100644
--- gcc/c-family/c-warn.c
+++ gcc/c-family/c-warn.c
@@ -2392,3 +2392,105 @@ do_warn_duplicated_branches_r (tree *tp, int *, void *)
     do_warn_duplicated_branches (*tp);
   return NULL_TREE;
 }
+
+/* Implementation of -Wmultiline-expansion.  This warning warns about
+   cases when a macro expands to multiple statements not wrapped in
+   do {} while (0) or ({ }) and is used as a then branch or as an else
+   branch.  For example,
+
+   #define DOIT x++; y++
+
+   if (c)
+     DOIT;
+
+   will increment y unconditionally.
+
+   BODY_LOC is the location of the if/else body, NEXT_LOC is the location
+   of the next token after the if/else body has been parsed, and IF_LOC
+   is the location of the if condition or of the "else" keyword.  */
+
+void
+warn_for_multiline_expansion (location_t body_loc, location_t next_loc,
+                              location_t if_loc)
+{
+  if (!warn_multiline_expansion)
+    return;
+
+  /* Ain't got time to waste.  We only care about macros here.  */
+  if (!from_macro_expansion_at (body_loc)
+      || !from_macro_expansion_at (next_loc))
+    return;
+
+  /* Let's skip macros defined in system headers.  */
+  if (in_system_header_at (body_loc)
+      || in_system_header_at (next_loc))
+    return;
+
+  /* Find the actual tokens in the macro definition.  BODY_LOC and
+     NEXT_LOC have to come from the same spelling location, but they
+     will resolve

Re: [PATCH v2, rs6000] Fold vector absolutes in GIMPLE

2017-06-01 Thread Segher Boessenkool

On Wed, May 31, 2017 at 02:38:15PM -0500, Will Schmidt wrote:
> Add support for early expansion of vector absolute built-ins.
> 
> [V2] Per reviews and feedback, skip the early folding for
> integral types based on a check against TYPE_OVERFLOW_WRAPS(arg0).
> 
> Added test variants to exercise the -fwrapv option during
> this folding.
> 
> OK for trunk?  (bootstraps running, pending review).

> +/* flavors of vec_abs. */

Dot space space.

> + if ( INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE(arg0)))
> + && ! TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE(arg0))))
> +   return false;

No space after ( or !; space before ( in function calls (and macros, etc.)

Please fix those (and consider using contrib/check_GNU_style.py); and
then please commit.  Thanks,


Segher

Re: [PATCH, rs6000] fold vector min/max in GIMPLE

2017-06-01 Thread Segher Boessenkool

On Wed, May 31, 2017 at 03:00:15PM -0500, Will Schmidt wrote:
> OK for trunk?

Looks good, please commit.


Segher

Re: [PATCH] Fix PR80721

2017-06-01 Thread Jonathan Wakely


On 12/05/17 12:10 +0200, Richard Biener wrote:


> It was pointed out by Markus that the EH emergency pool is not
> kept sorted and fully merged properly in three cases: when freeing
> an entry before the first free entry, when merging with the
> immediate successor is possible, and when merging with both
> successor and predecessor is possible.
>
> The following patch attempts to fix this.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Ok for trunk?  (given low / close to no testing coverage
> extra close eyes wanted!)
>
> Reporter says maybe it can't happen in real-life as it requires
> EH deallocation order not be the reverse of allocation order.
> I don't know enough here for a quick guess but "in C++ everything
> is possible" ;)


I think it's possible with this testcase:

#include <exception>

int main()
{
  std::exception_ptr p[3];
  for (auto& e : p)
    try {
      throw 1;
    } catch (...) {
      e = std::current_exception();
    }

  p[1] = nullptr;
  p[0] = nullptr;
  p[2] = nullptr;
}

But to test it I had to hack the __cxa_allocate_exception function to
never use malloc and always use the pool, so it's not suitable for the
testsuite.

With current trunk we create three exception objects with addresses,
X, Y, and Z, with the first_free_entry pointing immediately after Z.
The p[1] = nullptr statement frees Y, but after that we still have
first_free_entry pointing after Z, and then its next pointer points to
Y. (So if we threw again at that point, rather than using Y we would
chop off the start of the free block, fragmenting the pool).

When we free p[0] we add X to the list in between the first block and
Y, but don't merge X and Y.

The patch fixes the case above so we end up with a single block again
after all the deallocations.

OK for trunk, although please change s/Slit/Split/ in the comment on
line 165.
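
To illustrate the kind of merging being discussed, here is a minimal
sketch in C of freeing into an address-sorted free list.  It is not the
libstdc++ code: free_entry and pool_free are hypothetical names, and the
sketch assumes every freed block lies in one contiguous arena and is
large enough and suitably aligned to hold a free_entry header.

```c
#include <stddef.h>

/* Hypothetical free-list node, stored at the start of each free block.
   size is the block's total size in bytes, including this header.  */
struct free_entry
{
  size_t size;
  struct free_entry *next;
};

/* Insert the block [p, p + size) into the address-sorted list *head,
   merging with the predecessor and/or successor when adjacent.  */
void
pool_free (struct free_entry **head, void *p, size_t size)
{
  struct free_entry *e = p;
  struct free_entry *prev = NULL;
  struct free_entry *cur = *head;

  /* Find the first free entry at a higher address than the block.  */
  while (cur && (char *) cur < (char *) e)
    {
      prev = cur;
      cur = cur->next;
    }

  e->size = size;
  e->next = cur;

  /* Merge with the successor if the block ends where it starts.  */
  if (cur && (char *) e + e->size == (char *) cur)
    {
      e->size += cur->size;
      e->next = cur->next;
    }

  /* Merge with the predecessor if it ends where the block starts,
     otherwise link the block in after it (or at the head).  */
  if (prev && (char *) prev + prev->size == (char *) e)
    {
      prev->size += e->size;
      prev->next = e->next;
    }
  else if (prev)
    prev->next = e;
  else
    *head = e;
}
```

Freeing Y, then X, then Z in the scenario above exercises all three
cases: insertion before the first free entry, merging with the
successor, and merging with both neighbours.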

RE: [PING][PATCH][Aarch64] Add support for overflow add and sub operations

2017-06-01 Thread Michael Collison

Ping. Testsuite issue resolved. Okay for trunk?

-Original Message-
From: Christophe Lyon [mailto:christophe.l...@linaro.org] 
Sent: Friday, May 19, 2017 3:59 AM
To: Michael Collison 
Cc: gcc-patches@gcc.gnu.org; nd 
Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations

Hi Michael,


On 19 May 2017 at 07:12, Michael Collison  wrote:
> Hi,
>
> This patch improves code generation for builtin arithmetic overflow
> operations for the aarch64 backend. As an example, for a simple test case
> such as:
>
> int
> f (int x, int y, int *ovf)
> {
>   int res;
>   *ovf = __builtin_sadd_overflow (x, y, &res);
>   return res;
> }
>
> Current trunk at -O2 generates
>
> f:
>         mov     w3, w0
>         mov     w4, 0
>         add     w0, w0, w1
>         tbnz    w1, #31, .L4
>         cmp     w0, w3
>         blt     .L3
> .L2:
>         str     w4, [x2]
>         ret
>         .p2align 3
> .L4:
>         cmp     w0, w3
>         ble     .L2
> .L3:
>         mov     w4, 1
>         b       .L2
>
>
> With the patch this now generates:
>
> f:
>         adds    w0, w0, w1
>         cset    w1, vs
>         str     w1, [x2]
>         ret
>
>
> Original patch from Richard Henderson:
>
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>
>
> Okay for trunk?
>
> 2017-05-17  Michael Collison  
> Richard Henderson 
>
> * config/aarch64/aarch64-modes.def (CC_V): New.
> * config/aarch64/aarch64-protos.h
> (aarch64_add_128bit_scratch_regs): Declare
> (aarch64_add_128bit_scratch_regs): Declare.
> (aarch64_expand_subvti): Declare.
> (aarch64_gen_unlikely_cbranch): Declare
> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
> for signed overflow using CC_Vmode.
> (aarch64_get_condition_code_1): Handle CC_Vmode.
> (aarch64_gen_unlikely_cbranch): New function.
> (aarch64_add_128bit_scratch_regs): New function.
> (aarch64_subv_128bit_scratch_regs): New function.
> (aarch64_expand_subvti): New function.
> * config/aarch64/aarch64.md (addv4, uaddv4): New.
> (addti3): Create simpler code if low part is already known to be 0.
> (addvti4, uaddvti4): New.
> (*add3_compareC_cconly_imm): New.
> (*add3_compareC_cconly): New.
> (*add3_compareC_imm): New.
> (*add3_compareC): Rename from add3_compare1; do not
> handle constants within this pattern.
> (*add3_compareV_cconly_imm): New.
> (*add3_compareV_cconly): New.
> (*add3_compareV_imm): New.
> (add3_compareV): New.
> (add3_carryinC, add3_carryinV): New.
> (*add3_carryinC_zero, *add3_carryinV_zero): New.
> (*add3_carryinC, *add3_carryinV): New.
> (subv4, usubv4): New.
> (subti): Handle op1 zero.
> (subvti4, usub4ti4): New.
> (*sub3_compare1_imm): New.
> (sub3_carryinCV): New.
> (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
> (*sub3_carryinCV_z2, *sub3_carryinCV): New.
> * testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_uaddll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_ssubll.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usub_128.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubl.c: New testcase.
> * testsuite/gcc.target/arm/builtin_usubll.c: New testcase.

I've tried your patch, and 2 of the new tests FAIL:
gcc.target/aarch64/builtin_sadd_128.c scan-assembler addcs
gcc.target/aarch64/builtin_uadd_128.c scan-assembler addcs

Am I missing something?

Thanks,

Christophe
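
For readers who want to try the builtin from the quoted test case
outside the testsuite, a small C sketch follows.  It assumes a compiler
that provides __builtin_sadd_overflow (GCC 5 or later, or Clang);
checked_add and add_saturating are illustrative names, not part of the
patch.

```c
#include <limits.h>

/* Add x and y with wraparound; set *ovf to nonzero if the
   mathematically correct sum does not fit in an int.  */
int
checked_add (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, &res);
  return res;
}

/* Example use: clamp to INT_MAX or INT_MIN instead of wrapping.  A
   signed addition can only overflow upwards when y is positive and
   downwards when y is negative.  */
int
add_saturating (int x, int y)
{
  int ovf;
  int res = checked_add (x, y, &ovf);
  if (ovf)
    return y > 0 ? INT_MAX : INT_MIN;
  return res;
}
```

The single flag-setting add plus a conditional set shown in the patched
output above is what a function like checked_add would ideally compile
to.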

Re: [PING**3] [PATCH] Force use of absolute path names for gcov

2017-06-01 Thread Nathan Sidwell


On 06/01/2017 11:59 AM, Bernd Edlinger wrote:

> Ping...


What are you asking to be reviewed by who?

nathan

--
Nathan Sidwell

Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)

2017-06-01 Thread David Malcolm

On Thu, 2017-06-01 at 18:45 +0200, Marek Polacek wrote:
> A motivating example for this warning can be found e.g. in
> 
>   PRE10-C. Wrap multistatement macros in a do-while loop
>   https://www.securecoding.cert.org/confluence/x/jgL7
> 
> i.e., 
> 
> #define SWAP(x, y) \
>   tmp = x; \
>   x = y; \
>   y = tmp
> 
> used like this [1]
> 
> int x, y, z, tmp;
> if (z == 0)
>   SWAP(x, y);
> 
> expands to the following [2], which is certainly not what the
> programmer intended:
> 
> int x, y, z, tmp;
> if (z == 0)
>   tmp = x;
> x = y;
> y = tmp;
> 
> This has also happened in our codebase, see PR80063.

The warning looks like a good idea.

This reminds me a lot of -Wmisleading-indentation.  Does that fire for
any of the cases?

The patch appears to only consider "if" and "else" clauses.  Shouldn't
it also cover "for", "while" and "do/while"?

> I tried to summarize the way I approached this problem in the
> commentary in
> warn_for_multiline_expansion, but I'll try to explain the crux of the
> matter
> here, too.
> 
> For code like [1], in the FEs we'll see [2], of course.  When parsing
> the
> then-branch we see that the body of the if isn't wrapped in { } so we
> create a
> compound statement with just the first statement "tmp = x;", and the
> other two
> will be executed unconditionally.
> 
> My idea was to look at the location info of the following token after
> the body
> of the if has been parsed and determine if they come from the same
> macro expansion,
> and if they do (and the if itself doesn't), warn (taking into account
> various
> corner cases, as usual).
> 
> For this I had to dive into line_maps, macro maps, etc., so CCing
> David to check
> if my understanding of that is reasonable (hadn't worked with them
> before).

(am looking)

> I've included this warning in -Wall, because there should be no false
> positives
> (fingers crossed) and for most cases the warning should be pretty
> cheap.
> 
> I probably should've added a fix-it hint for good measure, too ("you
> better wrap
> the damn macro in do {} while (0)"), but that can be done as a follow
> -up.

That would be excellent, but might be fiddly.  The fix-it hint
machinery currently "avoids" macros.

See rich_location::reject_impossible_fixit, where we currently reject
source_location (aka location_t) values that are within macro maps,
putting the rich_location into a "something awkward is going on" mode
where it doesn't display fix-it hints.  It ought to work if you're sure
to use locations for the fixit that are within the line_map_ordinary
for the *definition* of the macro - so some care is required.


> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2017-06-01  Marek Polacek  
> 
>   PR c/80116
>   * c-common.h (warn_for_multiline_expansion): Declare.
>   * c-warn.c (warn_for_multiline_expansion): New function.
>   * c.opt (Wmultiline-expansion): New option.
> 
>   * c-parser.c (c_parser_if_body): Set the location of the
>   body of the conditional after parsing all the labels.  Call
>   warn_for_multiline_expansion.
>   (c_parser_else_body): Likewise.
> 
>   * parser.c (cp_parser_statement): Add a default argument.  Save
> the
>   location of the expression-statement after labels have been
> parsed.
>   (cp_parser_implicitly_scoped_statement): Set the location of
> the
>   body of the conditional after parsing all the labels.  Call
>   warn_for_multiline_expansion.
> 
>   * doc/invoke.texi: Document -Wmultiline-expansion.
> 
>   * c-c++-common/Wmultiline-expansion-1.c: New test.
>   * c-c++-common/Wmultiline-expansion-2.c: New test.
>   * c-c++-common/Wmultiline-expansion-3.c: New test.
>   * c-c++-common/Wmultiline-expansion-4.c: New test.
>   * c-c++-common/Wmultiline-expansion-5.c: New test.
>   * c-c++-common/Wmultiline-expansion-6.c: New test.
> 
> diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
> index 79072e6..6efbebc 100644
> --- gcc/c-family/c-common.h
> +++ gcc/c-family/c-common.h
> @@ -1539,6 +1539,7 @@ extern bool maybe_warn_shift_overflow
> (location_t, tree, tree);
>  extern void warn_duplicated_cond_add_or_warn (location_t, tree,
> vec **);
>  extern bool diagnose_mismatched_attributes (tree, tree);
>  extern tree do_warn_duplicated_branches_r (tree *, int *, void *);
> +extern void warn_for_multiline_expansion (location_t, location_t,
> location_t);
>  
>  /* In c-attribs.c.  */
>  extern bool attribute_takes_identifier_p (const_tree);
> diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
> index 012675b..16c6fc3 100644
> --- gcc/c-family/c-warn.c
> +++ gcc/c-family/c-warn.c
> @@ -2392,3 +2392,105 @@ do_warn_duplicated_branches_r (tree *tp, int
> *, void *)
>      do_warn_duplicated_branches (*tp);
>    return NULL_TREE;
>  }
> +
> +/* Implementation of -Wmultiline-expansion.  This warning warns
> about

Is the name of the warning correct?  Shouldn't the warning be about
multiple *statement* macros, rather than m

Re: [PATCH] [i386] Recompute the frame layout less often

2017-06-01 Thread Uros Bizjak

On Tue, May 23, 2017 at 4:31 PM, Bernd Edlinger
 wrote:
> Hi,
>
> this is the latest version of my patch.
>
> As already said, it attempts to compute
> the frame layout only when relevant data have
> changed.
>
> Apologies for doing more clean-up on Daniel's
> patch than absolutely necessary, but ...
>
> Bootstrap and reg-tested successfully on
> x86_64-pc-linux-gnu with unix\{,-m32\}.
> Is it OK for trunk?

It is hard to review a patch that mixes cleanup and functionality changes...

LGTM, so OK for trunk.

Thanks,
Uros.

Re: [PATCH, rs6000] Fold vector logicals (eqv) in GIMPLE

2017-06-01 Thread Segher Boessenkool

On Wed, May 31, 2017 at 03:00:54PM -0500, Will Schmidt wrote:
> Add support for early expansion of vector eqv built-ins.

> OK for trunk?

Yup, looks fine.  Thanks,


Segher


> 2017-05-26  Will Schmidt  
> 
>   * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
>   for early expansion of vec_eqv.
> 
> [gcc/testsuite]
> 
> 2017-05-26  Will Schmidt  
> 
>   * testsuite/gcc.target/powerpc/fold-vec-logical-eqv-char.c: New.
>   * testsuite/gcc.target/powerpc/fold-vec-logical-eqv-float.c: New.
>   * testsuite/gcc.target/powerpc/fold-vec-logical-eqv-floatdouble.c: New.
>   * testsuite/gcc.target/powerpc/fold-vec-logical-eqv-int.c: New.
>   * testsuite/gcc.target/powerpc/fold-vec-logical-eqv-longlong.c: New.
>   * testsuite/gcc.target/powerpc/fold-vec-logical-eqv-short.c: New.

[PATCH] testsuite: ensure GCC_COLORS is unset

2017-06-01 Thread David Malcolm

Dominique noted on IRC that the new test show-template-tree-color.C
(r248698) fails when GCC_COLORS is set in the environment.

The following patch unsets GCC_COLORS within gcc-dg.exp,
fixing this issue.

Successfully regtested on x86_64-pc-linux-gnu; I also verified
the fix of the failing test by hand with and without GCC_COLORS set.

OK for trunk?

gcc/testsuite/ChangeLog:
* lib/gcc-dg.exp: Ensure GCC_COLORS is unset.
---
 gcc/testsuite/lib/gcc-dg.exp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index b6865b4..e74 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -43,6 +43,12 @@ if { [ishost "*-*-cygwin*"] } {
   setenv LANG C.ASCII
 }
 
+# Ensure GCC_COLORS is unset, for the rare testcases that verify
+# how output is colorized.
+if [info exists ::env(GCC_COLORS) ] {
+unsetenv GCC_COLORS
+}
+
 global GCC_UNDER_TEST
 if ![info exists GCC_UNDER_TEST] {
 set GCC_UNDER_TEST "[find_gcc]"
-- 
1.8.5.3

Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)

2017-06-01 Thread Trevor Saunders

On Thu, Jun 01, 2017 at 06:45:17PM +0200, Marek Polacek wrote:
> A motivating example for this warning can be found e.g. in
> 
>   PRE10-C. Wrap multistatement macros in a do-while loop
>   https://www.securecoding.cert.org/confluence/x/jgL7
> 
> i.e., 
> 
> #define SWAP(x, y) \
>   tmp = x; \
>   x = y; \
>   y = tmp
> 
> used like this [1]
> 
> int x, y, z, tmp;
> if (z == 0)
>   SWAP(x, y);
> 
> expands to the following [2], which is certainly not what the programmer 
> intended:
> 
> int x, y, z, tmp;
> if (z == 0)
>   tmp = x;
> x = y;
> y = tmp;
> 
> This has also happened in our codebase, see PR80063.
> 
> I tried to summarize the way I approached this problem in the commentary in
> warn_for_multiline_expansion, but I'll try to explain the crux of the matter
> here, too.
> 
> For code like [1], in the FEs we'll see [2], of course.  When parsing the
> then-branch we see that the body of the if isn't wrapped in { } so we create a
> compound statement with just the first statement "tmp = x;", and the other two
> will be executed unconditionally.
> 
> My idea was to look at the location info of the following token after the body
> of the if has been parsed and determine if they come from the same macro 
> expansion,
> and if they do (and the if itself doesn't), warn (taking into account various
> corner cases, as usual).

Especially given that it's not easy to warn until a questionable macro
is used dangerously, it seems to me that it would be good to allow
people to be defensive and get warnings for any unbraced blocks.  It
clearly can't be enabled for gcc itself, but as the CERT link demonstrates,
several common style guides require all blocks to be braced.

 Trev

Re: [PATCH,DWARF,v2] AIX dwarf2out label fix

2017-06-01 Thread Jason Merrill


On 05/18/2017 06:00 AM, David Edelsohn wrote:

> This version adds a macro DWARF_INITIAL_LENGTH_SIZE_STR based on
> DWARF_OFFSET_SIZE to define the string expression to append to the
> label to correct the offset.
>
> Because AIX Assembler inserts the section length, the section label
> generated by GCC points to the wrong location and must be adjusted
> when referenced in DW_AT_stmt_list.
>
> +  char dl_section_label[MAX_ARTIFICIAL_LABEL_BYTES];


It seems inaccurate to call this variable "label" when it's a label name 
minus offset.  Maybe dl_section_ref?



> if (debug_info_level >= DINFO_LEVEL_TERSE)
>   add_AT_lineptr (ctnode->root_die, DW_AT_stmt_list,
>   (!dwarf_split_debug_info
> - ? debug_line_section_label
> + ? dl_section_label
>    : debug_skeleton_line_section_label));


Doesn't debug_skeleton_line_section_label need the same offset?

Jason

Re: [PING**3] [PATCH] Force use of absolute path names for gcov

2017-06-01 Thread Bernd Edlinger

On 06/01/17 19:52, Nathan Sidwell wrote:
> On 06/01/2017 11:59 AM, Bernd Edlinger wrote:
>> Ping...
> 
> What are you asking to be reviewed by who?
> 
> nathan
> 

Aehm, sorry.

This is a gcc option that converts relative
path names to absolute ones, so that gcov can
properly merge the line numbers in projects
where different relative path names may refer
to the same source file.
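
The conversion itself is simple.  A standalone C sketch of the idea,
assuming a POSIX system where '/' is the only directory separator, could
look like this; make_absolute is a hypothetical name and is not the
function added by the patch.

```c
#include <string.h>
#include <unistd.h>

/* Write an absolute form of FILENAME into OUT (capacity OUT_SIZE).
   Return OUT on success, or FILENAME unchanged if it is empty, already
   absolute, or the result would not fit.  */
const char *
make_absolute (const char *filename, char *out, size_t out_size)
{
  if (filename == NULL || filename[0] == '\0' || filename[0] == '/')
    return filename;

  /* Start from the current working directory.  */
  if (getcwd (out, out_size) == NULL)
    return filename;

  size_t cwd_len = strlen (out);
  int need_sep = cwd_len > 0 && out[cwd_len - 1] != '/';

  /* Make sure directory, separator, name and terminating NUL fit.  */
  if (cwd_len + need_sep + strlen (filename) + 1 > out_size)
    return filename;

  if (need_sep)
    out[cwd_len++] = '/';
  strcpy (out + cwd_len, filename);
  return out;
}
```

Falling back to the unchanged name when something goes wrong keeps the
behaviour no worse than without the option.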


I would like a review from one of the gcov maintainers.

I attached the patch again for your convenience.


Thanks
Bernd.
gcc:
2017-04-21  Bernd Edlinger  

* doc/invoke.texi: Document the -fprofile-abs-path option.
* common.opt (fprofile-abs-path): New option.
* gcov-io.h (gcov_write_filename): Declare.
* gcov-io.c (gcov_write_filename): New function.
* coverage.c (coverage_begin_function): Use gcov_write_filename.
* profile.c (output_location): Likewise.

gcc/testsuite:
2017-04-21  Bernd Edlinger  

* gcc.misc-tests/gcov-1a.c: New test.
Index: gcc/common.opt
===
--- gcc/common.opt	(revision 246571)
+++ gcc/common.opt	(working copy)
@@ -1965,6 +1965,10 @@ fprofile
 Common Report Var(profile_flag)
 Enable basic program profiling code.
 
+fprofile-abs-path
+Common Report Var(profile_abs_path_flag)
+Generate absolute source path names for gcov.
+
 fprofile-arcs
 Common Report Var(profile_arc_flag)
 Insert arc-based program profiling code.
Index: gcc/coverage.c
===
--- gcc/coverage.c	(revision 246571)
+++ gcc/coverage.c	(working copy)
@@ -663,7 +663,7 @@ coverage_begin_function (unsigned lineno_checksum,
   gcov_write_unsigned (cfg_checksum);
   gcov_write_string (IDENTIFIER_POINTER
 		 (DECL_ASSEMBLER_NAME (current_function_decl)));
-  gcov_write_string (xloc.file);
+  gcov_write_filename (xloc.file);
   gcov_write_unsigned (xloc.line);
   gcov_write_length (offset);
 
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 246571)
+++ gcc/doc/invoke.texi	(working copy)
@@ -441,6 +441,7 @@ Objective-C and Objective-C++ Dialects}.
 @item Program Instrumentation Options
 @xref{Instrumentation Options,,Program Instrumentation Options}.
 @gccoptlist{-p  -pg  -fprofile-arcs  --coverage  -ftest-coverage @gol
+-fprofile-abs-path @gol
 -fprofile-dir=@var{path}  -fprofile-generate  -fprofile-generate=@var{path} @gol
 -fsanitize=@var{style}  -fsanitize-recover  -fsanitize-recover=@var{style} @gol
 -fasan-shadow-offset=@var{number}  -fsanitize-sections=@var{s1},@var{s2},... @gol
@@ -10639,6 +10640,12 @@ additional @option{-ftest-coverage} option.  You d
 every source file in a program.
 
 @item
+Compile the source files additionally with @option{-fprofile-abs-path}
+to create absolute path names in the @file{.gcno} files.  This allows
+@command{gcov} to find the correct sources in projects with multiple
+directories.
+
+@item
 Link your object files with @option{-lgcov} or @option{-fprofile-arcs}
 (the latter implies the former).
 
@@ -10696,6 +10713,12 @@
 generate test coverage data.  Coverage data matches the source files
 more closely if you do not optimize.
 
+@item -fprofile-abs-path
+@opindex fprofile-abs-path
+Automatically convert relative source file names to absolute path names
+in the @file{.gcno} files.  This allows @command{gcov} to find the correct
+sources in projects with multiple directories.
+
 @item -fprofile-dir=@var{path}
 @opindex fprofile-dir
 
Index: gcc/gcov-io.c
===
--- gcc/gcov-io.c	(revision 246571)
+++ gcc/gcov-io.c	(working copy)
@@ -353,6 +353,37 @@ gcov_write_string (const char *string)
 #endif
 
 #if !IN_LIBGCOV
+/* Write FILENAME to coverage file.  Sets error flag on file
+   error, overflow flag on overflow */
+
+GCOV_LINKAGE void
+gcov_write_filename (const char *filename)
+{
+  char buf[1024];
+  size_t len;
+
+  if (profile_abs_path_flag && filename && filename[0]
+  && !(IS_DIR_SEPARATOR (filename[0])
+#if HAVE_DOS_BASED_FILE_SYSTEM
+	   || filename[1] == ':'
+#endif
+	  )
+  && (len = strlen (filename)) < sizeof (buf) - 1)
+{
+  if (getcwd (buf, sizeof (buf) - len - 1) != NULL)
+	{
+	  if (buf[0] && !IS_DIR_SEPARATOR (buf[strlen (buf) - 1]))
+	strcat (buf, "/");
+	  strcat (buf, filename);
+	  filename = buf;
+	}
+}
+
+  return gcov_write_string (filename);
+}
+#endif
+
+#if !IN_LIBGCOV
 /* Write a tag TAG and reserve space for the record length. Return a
value to be used for gcov_write_length.  */
 
Index: gcc/gcov-io.h
===
--- gcc/gcov-io.h	(revision 246571)
+++ gcc/gcov-io.h	(working copy)
@@ -388,6 +388,7 @@ GCOV_LINKAGE void gcov_write_unsigned (gcov_unsign
 /* Available only in compiler */
 GCOV_LINKAGE unsigned gcov_histo_index (gcov_type value);
 GCOV_LINKAGE void gcov_write_string (const

Re: [PATCH,DWARF,v2] AIX dwarf2out label fix

2017-06-01 Thread David Edelsohn

On Thu, Jun 1, 2017 at 3:03 PM, Jason Merrill  wrote:
> On 05/18/2017 06:00 AM, David Edelsohn wrote:
>>
>> This version adds a macro DWARF_INITIAL_LENGTH_SIZE_STR based on
>> DWARF_OFFSET_SIZE to define the string expression to append to the
>> label to correct the offset.
>>
>> Because AIX Assembler inserts the section length, the section label
>> generated by GCC points to the wrong location and must be adjusted
>> when referenced in DW_AT_stmt_list.
>>
>> +  char dl_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
>
> It seems inaccurate to call this variable "label" when it's a label name
> minus offset.  Maybe dl_section_ref?

Hi, Jason

Thanks for taking a look at this!

Any naming suggestions are appreciated -- I was trying to choose a
short variable name.  dl_section_ref is fine with me.

>
>> if (debug_info_level >= DINFO_LEVEL_TERSE)
>>   add_AT_lineptr (ctnode->root_die, DW_AT_stmt_list,
>>   (!dwarf_split_debug_info
>> - ? debug_line_section_label
>> + ? dl_section_label
>>: debug_skeleton_line_section_label));
>
>
> Doesn't debug_skeleton_line_section_label need the same offset?

AIX doesn't support DWARF split debug info, so it did not seem
worthwhile to clutter the code.  I am trying to make the minimal
changes for AIX's peculiar DWARF implementation.

Thanks, David

Re: [PATCH] testsuite: ensure GCC_COLORS is unset

2017-06-01 Thread Mike Stump

On Jun 1, 2017, at 11:59 AM, David Malcolm  wrote:
> 
> The following patch unsets GCC_COLORS within gcc-dg.exp,
> fixing this issue.

> OK for trunk?

Ok.

[PATCH] Backport x86 intrin parameter/var uglification patch to 6.x

2017-06-01 Thread Jakub Jelinek

Hi!

Jon mentioned on IRC somebody complained about namespace pollution of the
STL headers (that include x86 intrin headers).  This patch backports
r239617 to 6.x, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to gcc-6-branch.  The patch applies cleanly to gcc-5-branch if
the pkuintrin.h hunk is removed, I'll bootstrap/regtest it next and commit
there if it succeeds.
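
To show the kind of pollution problem this addresses, here is a small C
sketch.  The macro names and sketch_gather are made up for illustration
and are not taken from the intrinsic headers; the function only imitates
what a system header does, since ordinary programs should not declare
names beginning with a double underscore themselves.

```c
/* A user program is free to define macros with ordinary names before
   including a system header.  */
#define base 100
#define scale 2

/* Header-style inline function written with "uglified" names.  Names
   beginning with a double underscore are reserved for the
   implementation, so a conforming program cannot have defined them as
   macros.  Had the parameters been called base and scale, the macros
   above would have turned this declaration into a syntax error.  */
static inline int
sketch_gather (const int *__base, int __index, int __scale)
{
  return __base[__index * __scale];
}
```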

2017-06-01  Jakub Jelinek  

Backported from mainline
2016-08-19  Jakub Jelinek  

* config/i386/rdseedintrin.h (_rdseed16_step, _rdseed32_step,
_rdseed64_step): Uglify argument names and/or local variable names
in inline functions.
* config/i386/rtmintrin.h (_xabort): Likewise.
* config/i386/avx512vlintrin.h (_mm256_ternarylogic_epi64,
_mm256_mask_ternarylogic_epi64, _mm256_maskz_ternarylogic_epi64,
_mm256_ternarylogic_epi32, _mm256_mask_ternarylogic_epi32,
_mm256_maskz_ternarylogic_epi32, _mm_ternarylogic_epi64,
_mm_mask_ternarylogic_epi64, _mm_maskz_ternarylogic_epi64,
_mm_ternarylogic_epi32, _mm_mask_ternarylogic_epi32,
_mm_maskz_ternarylogic_epi32): Likewise.
* config/i386/lwpintrin.h (__llwpcb, __lwpval32, __lwpval64,
__lwpins32, __lwpins64): Likewise.
* config/i386/avx2intrin.h (_mm_i32gather_pd, _mm_mask_i32gather_pd,
_mm256_i32gather_pd, _mm256_mask_i32gather_pd, _mm_i64gather_pd,
_mm_mask_i64gather_pd, _mm256_i64gather_pd, _mm256_mask_i64gather_pd,
_mm_i32gather_ps, _mm_mask_i32gather_ps, _mm256_i32gather_ps,
_mm256_mask_i32gather_ps, _mm_i64gather_ps, _mm_mask_i64gather_ps,
_mm256_i64gather_ps, _mm256_mask_i64gather_ps, _mm_i32gather_epi64,
_mm_mask_i32gather_epi64, _mm256_i32gather_epi64,
_mm256_mask_i32gather_epi64, _mm_i64gather_epi64,
_mm_mask_i64gather_epi64, _mm256_i64gather_epi64,
_mm256_mask_i64gather_epi64, _mm_i32gather_epi32,
_mm_mask_i32gather_epi32, _mm256_i32gather_epi32,
_mm256_mask_i32gather_epi32, _mm_i64gather_epi32,
_mm_mask_i64gather_epi32, _mm256_i64gather_epi32,
_mm256_mask_i64gather_epi32): Likewise.
* config/i386/pmm_malloc.h (_mm_malloc, _mm_free): Likewise.
* config/i386/ia32intrin.h (__writeeflags): Likewise.
* config/i386/pkuintrin.h (_wrpkru): Likewise.
* config/i386/avx512pfintrin.h (_mm512_mask_prefetch_i32gather_pd,
_mm512_mask_prefetch_i32gather_ps, _mm512_mask_prefetch_i64gather_pd,
_mm512_mask_prefetch_i64gather_ps, _mm512_prefetch_i32scatter_pd,
_mm512_prefetch_i32scatter_ps, _mm512_mask_prefetch_i32scatter_pd,
_mm512_mask_prefetch_i32scatter_ps, _mm512_prefetch_i64scatter_pd,
_mm512_prefetch_i64scatter_ps, _mm512_mask_prefetch_i64scatter_pd,
_mm512_mask_prefetch_i64scatter_ps): Likewise.
* config/i386/gmm_malloc.h (_mm_malloc, _mm_free): Likewise.
* config/i386/avx512fintrin.h (_mm512_ternarylogic_epi64,
_mm512_mask_ternarylogic_epi64, _mm512_maskz_ternarylogic_epi64,
_mm512_ternarylogic_epi32, _mm512_mask_ternarylogic_epi32,
_mm512_maskz_ternarylogic_epi32, _mm512_i32gather_ps,
_mm512_mask_i32gather_ps, _mm512_i32gather_pd, _mm512_i64gather_ps,
_mm512_i64gather_pd, _mm512_i32gather_epi32, _mm512_i32gather_epi64,
_mm512_i64gather_epi32, _mm512_i64gather_epi64): Likewise.

--- gcc/config/i386/avx2intrin.h.jj 2016-04-15 16:55:17.0 +0200
+++ gcc/config/i386/avx2intrin.h	2017-06-01 18:56:27.127297025 +0200
@@ -1246,422 +1246,426 @@ _mm_srlv_epi64 (__m128i __X, __m128i __Y
 #ifdef __OPTIMIZE__
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_i32gather_pd (double const *base, __m128i index, const int scale)
+_mm_i32gather_pd (double const *__base, __m128i __index, const int __scale)
 {
-  __v2df zero = _mm_setzero_pd ();
-  __v2df mask = _mm_cmpeq_pd (zero, zero);
+  __v2df __zero = _mm_setzero_pd ();
+  __v2df __mask = _mm_cmpeq_pd (__zero, __zero);
 
   return (__m128d) __builtin_ia32_gathersiv2df (_mm_undefined_pd (),
-   base,
-   (__v4si)index,
-   mask,
-   scale);
+   __base,
+   (__v4si)__index,
+   __mask,
+   __scale);
 }
 
 extern __inline __m128d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_mask_i32gather_pd (__m128d src, double const *base, __m128i index,
-  __m128d mask, const int scale)
+_mm_mask_i32gather_pd (__m128d __src, double const *__base, __m128i __index,
+  __m128d __mask, const int __scale)
 {
-  return (__m1

Re: [PATCH,DWARF,v2] AIX dwarf2out label fix

2017-06-01 Thread Jason Merrill

On Thu, Jun 1, 2017 at 3:27 PM, David Edelsohn  wrote:
> On Thu, Jun 1, 2017 at 3:03 PM, Jason Merrill  wrote:
>> On 05/18/2017 06:00 AM, David Edelsohn wrote:
>>>
>>> This version adds a macro DWARF_INITIAL_LENGTH_SIZE_STR based on
>>> DWARF_OFFSET_SIZE to define the string expression to append to the
>>> label to correct the offset.
>>>
>>> Because AIX Assembler inserts the section length, the section label
>>> generated by GCC points to the wrong location and must be adjusted
>>> when referenced in DW_AT_stmt_list.
>>>
>>> +  char dl_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
>>
>> It seems inaccurate to call this variable "label" when it's a label name
>> minus offset.  Maybe dl_section_ref?
>
> Hi, Jason
>
> Thanks for taking a look at this!
>
> Any naming suggestions are appreciated -- I was trying to choose a
> short variable name.  dl_section_ref is fine with me.

Let's go with that, then.  OK with that change.

Jason

Re: [PATCH] Add attribute((target_clone(...))) to PowerPC

2017-06-01 Thread Segher Boessenkool

Hi Mike,

On Wed, May 31, 2017 at 06:33:37PM -0400, Michael Meissner wrote:
> +/* On PowerPC, we have a limited number of target clones that we care about
> +   which means we can use an array to hold the options, rather than having more
> +   elaborate data structures to identify each possible variation.  Order the
> +   clones from the default to the highest ISA.  */
> +const int CLONE_DEFAULT  = 0;	/* default clone.  */
> +const int CLONE_ISA_2_05 = 1;	/* ISA 2.05 (power6).  */
> +const int CLONE_ISA_2_06 = 2;	/* ISA 2.06 (power7).  */
> +const int CLONE_ISA_2_07 = 3;	/* ISA 2.07 (power8).  */
> +const int CLONE_ISA_3_00 = 4;	/* ISA 3.00 (power9).  */
> +const int CLONE_MAX      = 5;

With "you don't have to give the enum a name" I meant write it as

enum {
  CLONE_DEFAULT = 0,
  CLONE_ISA_2_05,
[...]
  CLONE_MAX
};

If you do "const int", I think it should be "static const int"?
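
A minimal sketch of the anonymous-enum form suggested above, written as plain C. The enumerator names come from the quoted patch; the clone_names table and the clone_name helper are hypothetical and only illustrate that the last enumerator can size an array, which a "const int" cannot do portably in C.

```c
#include <assert.h>
#include <stddef.h>

/* Anonymous enum: each enumerator is an integer constant expression,
   and the enum itself needs no name.  Values count up from 0.  */
enum {
  CLONE_DEFAULT = 0,	/* default clone.  */
  CLONE_ISA_2_05,	/* ISA 2.05 (power6).  */
  CLONE_ISA_2_06,	/* ISA 2.06 (power7).  */
  CLONE_ISA_2_07,	/* ISA 2.07 (power8).  */
  CLONE_ISA_3_00,	/* ISA 3.00 (power9).  */
  CLONE_MAX		/* number of clones, usable as an array bound.  */
};

/* Hypothetical table indexed by clone number (not from the patch).  */
static const char *const clone_names[CLONE_MAX] = {
  "default", "power6", "power7", "power8", "power9"
};

/* Hypothetical accessor: return the name for a clone index.  */
const char *
clone_name (int clone)
{
  assert (clone >= 0 && clone < CLONE_MAX);
  return clone_names[clone];
}
```

Because CLONE_MAX is the last enumerator, adding a new clone before it keeps the array bound in sync without editing a separate constant.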

> +/* Helper function for printing the function name when debugging.  */
> +
> +static const char *
> +get_decl_name (tree fn)
> +{
> +  tree name;
> +
> +  if (!fn)
> +return "";
> +
> +  name = DECL_NAME (fn);
> +  if (!name)
> +return "";
> +
> +  return IDENTIFIER_POINTER (name);
> +}

Perhaps this would be useful to have in generic code?

> +rs6000_clone_priority (tree fndecl)
> +{
> +  tree fn_opts = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
> +  HOST_WIDE_INT isa_masks;
> +  int ret = (int) CLONE_DEFAULT;

You don't need this cast afaics.

> +  tree attrs = lookup_attribute ("target", DECL_ATTRIBUTES (fndecl));
> +  const char *attrs_str = NULL;
> +
> +  gcc_assert (attrs != NULL);
> +  attrs = TREE_VALUE (TREE_VALUE (attrs));
> +
> +  gcc_assert (TREE_CODE (attrs) == STRING_CST);
> +  attrs_str = TREE_STRING_POINTER (attrs);

And these asserts neither.  There are more of these: if the code
immediately following an assert will obviously fail (in an obvious way)
if the assert is false, then the assert is just noise; it makes reading
the code harder instead of easier.

> +/* This compares the priority of target features in function DECL1 and DECL2.
> +   It returns positive value if DECL1 is higher priority, negative value if
> +   DECL2 is higher priority and 0 if they are the same.  Note, priorities are
> +   ordered from lowest (currently CLONE_ISA_3_0) to highest
> +   (CLONE_DEFAULT).  */

This comment needs updating?  Swap CLONE_ISA_3_0 with CLONE_DEFAULT?

> +#if defined (ASM_OUTPUT_TYPE_DIRECTIVE)
> +  if (targetm.has_ifunc_p ())

Hrm, I still don't see what you need the #ifdef for.  What in the
following code won't compile without it?  Or does targetm.has_ifunc_p
return the wrong answer?

> +{
> +  struct cgraph_function_version_info *it_v = NULL;
> +  struct cgraph_node *dispatcher_node = NULL;
> +  struct cgraph_function_version_info *dispatcher_version_info = NULL;
> +
> +  /* Right now, the dispatching is done via ifunc.  */
> +  dispatch_decl = make_dispatcher_decl (default_node->decl);
> +
> +  dispatcher_node = cgraph_node::get_create (dispatch_decl);
> +  gcc_assert (dispatcher_node != NULL);
> +  dispatcher_node->dispatcher_function = 1;
> +  dispatcher_version_info
> + = dispatcher_node->insert_new_function_version ();
> +  dispatcher_version_info->next = default_version_info;
> +  dispatcher_node->definition = 1;
> +
> +  /* Set the dispatcher for all the versions.  */
> +  it_v = default_version_info;
> +  while (it_v != NULL)
> + {
> +   it_v->dispatcher_resolver = dispatch_decl;
> +   it_v = it_v->next;
> + }
> +}
> +  else
> +#endif

> +  /* On the PowerPC, we do not need to call __builtin_cpu_init, which is a NOP
> + on the PowerPC (on the x86_64, it is not a NOP).  The builtin function
> + __builtin_cpu_support ensures that the TOC fields are setup by requiring a
> + recent glibc.  If we ever need to call __builtin_cpu_init, we would need
> + to insert the code here to do the call.  */

Ah cool, thanks :-)


Segher

Re: Default std::vector default and move constructor

2017-06-01 Thread François Dumont


On 01/06/2017 15:34, Jonathan Wakely wrote:

On 31/05/17 22:28 +0200, François Dumont wrote:
Unless I made a mistake it revealed that restoring explicit call to 
_Bit_alloc_type() in default constructor was not enough. G++ doesn't 
transform it into a value-init if needed. I don't know if it is a 
compiler bug but I had to do just like presented in the Standard to 
achieve the expected behavior.


That really shouldn't be necessary (see below).

This value-init is specific to post-C++11, right? Maybe I could 
remove the useless explicit call to _Bit_alloc_type() in pre-C++11 
mode?


No, because C++03 also requires the allocator to be value-initialized.


Ok so I'll try to make the test C++03 compatible.




Now I wonder if I really introduced a regression in rb_tree...


Yes, I think you did. Could you try to verify that using the new
default_init_allocator?


I did, and for the moment I am seeing the same result with rb_tree as 
with std::vector, which is strange.


I plan to add this test to all containers.





+  struct _Bvector_impl
+: public _Bit_alloc_type, public _Bvector_impl_data
+{
+public:
+#if __cplusplus >= 201103L
+  _Bvector_impl()
+noexcept( noexcept(_Bit_alloc_type())
+  && noexcept(_Bvector_impl(declval<const _Bit_alloc_type&>())) )


This second condition is not needed, because that constructor should
be noexcept (see below).


+  : _Bvector_impl(_Bit_alloc_type())


This should not be necessary...


+  { }
+#else
  _Bvector_impl()
-: _Bit_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage()
+  : _Bit_alloc_type()
  { }
+#endif


I would expect the constructor to look like this:

  _Bvector_impl()
  _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type()) )
 : _Bit_alloc_type()
 { }

What happens when you do that?


This is what I tried first and test was then failing. It surprised me too.




  _Bvector_impl(const _Bit_alloc_type& __a)
-: _Bit_alloc_type(__a), _M_start(), _M_finish(), _M_end_of_storage()

+ _GLIBCXX_NOEXCEPT_IF( noexcept(_Bit_alloc_type(__a)) )


Copying the allocator is not allowed to throw. You can use simply
_GLIBCXX_NOEXCEPT here.



+void test01()
+{
+  typedef default_init_allocator alloc_type;
+  typedef std::vector test_type;
+
+  test_type v1;
+  v1.push_back(T());
+
+  VERIFY( !v1.empty() );
+  VERIFY( !v1.get_allocator().state );


This is unlikely to ever fail, because the stack is probably full of
zeros anyway. Did you confirm whether the test fails without your
fixes to value-initialize the allocator?

Yes, the test fails as soon as I use a default constructor that just 
calls the allocator default constructor in its initialization list, or 
when I default this implementation.


One possible way to make it fail would be to construct the
vector using placement new, into a buffer filled with non-zero
values. (Valgrind or a sanitizer should also tell us, but we can't
rely on them in the testsuite).

This is what I have implemented in this new proposal, also taking your 
other remarks into account. For the moment, if the test fails there is a 
memory leak, but I prefer to keep the implementation simple.


I also started running the test on the normal std::vector implementation 
and I never managed to make the test fail, even when I default all the 
default constructor implementations!


I started rebuilding everything.

François

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 78195c1..5fb342f 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -388,10 +388,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   { return __x + __n; }
 
   inline void
-  __fill_bvector(_Bit_iterator __first, _Bit_iterator __last, bool __x)
+  __fill_bvector(_Bit_type * __v,
+		 unsigned int __first, unsigned int __last, bool __x)
   {
-for (; __first != __last; ++__first)
-  *__first = __x;
+const _Bit_type __fmask = ~0ul << __first;
+const _Bit_type __lmask = ~0ul >> (_S_word_bit - __last);
+const _Bit_type __mask = __fmask & __lmask;
+
+if (__x)
+  *__v |= __mask;
+else
+  *__v &= ~__mask;
   }
 
   inline void
@@ -399,12 +406,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   {
 if (__first._M_p != __last._M_p)
   {
-	std::fill(__first._M_p + 1, __last._M_p, __x ? ~0 : 0);
-	__fill_bvector(__first, _Bit_iterator(__first._M_p + 1, 0), __x);
-	__fill_bvector(_Bit_iterator(__last._M_p, 0), __last, __x);
+	_Bit_type *__first_p = __first._M_p;
+	if (__first._M_offset != 0)
+	  __fill_bvector(__first_p++, __first._M_offset, _S_word_bit, __x);
+
+	__builtin_memset(__first_p, __x ? ~0 : 0,
+			 (__last._M_p - __first_p) * sizeof(_Bit_type));
+
+	if (__last._M_offset != 0)
+	  __fill_bvector(__last._M_p, 0, __last._M_offset, __x);
   }
 else
-  __fill_bvector(__first, __last, __x);
+  __fill_bvector(__first._M_p, __first._M_offset, __last._M_offset, _
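
The mask-based fill in the (truncated) hunk above can be sketched in plain C as follows. This is only an illustration of the technique, not the libstdc++ code: bit_word, WORD_BITS and fill_word_bits are hypothetical names, and the sketch assumes the caller never passes an empty range, since a shift by the full word width would be undefined.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long bit_word;

/* Number of bits in one storage word.  */
#define WORD_BITS ((unsigned) (sizeof (bit_word) * CHAR_BIT))

/* Set (value != 0) or clear (value == 0) bits [first, last) of *word
   with two shifts and one mask instead of a bit-by-bit loop.
   Requires first < last <= WORD_BITS.  */
void
fill_word_bits (bit_word *word, unsigned first, unsigned last, int value)
{
  assert (first < last && last <= WORD_BITS);

  /* Ones from bit FIRST upwards.  */
  const bit_word fmask = ~(bit_word) 0 << first;
  /* Ones below bit LAST.  */
  const bit_word lmask = ~(bit_word) 0 >> (WORD_BITS - last);
  /* Ones exactly in [first, last).  */
  const bit_word mask = fmask & lmask;

  if (value)
    *word |= mask;
  else
    *word &= ~mask;
}
```

A caller filling a range that spans several words would use this for the partial first and last words and a memset-style fill for the whole words in between, which is the structure the hunk shows.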

[PATCH] handle bzero/bcopy in DSE and aliasing (PR 80933, 80934)

2017-06-01 Thread Martin Sebor


While testing some otherwise unrelated enhancements in these areas
I noticed that calls to bzero and bcopy are not being handled as
efficiently as equivalent calls to memset and memcpy.  Specifically,
redundant calls are not eliminated and the functions appear to be
treated as if they allowed their pointer arguments to escape.  This
turned out to be due to the missing handling of the former two
built-ins by the DSE and aliasing passes.

The attached patch adds this handling so the cases I noted in the two
PRs are now handled.

Tested on x86_64-linux.

Martin
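
For readers unfamiliar with the legacy calls, here is a minimal sketch of the equivalences the message relies on: bzero (d, n) behaves like memset (d, 0, n), and bcopy (s, d, n) behaves like memmove (d, s, n) with the source first. The wrapper names zero_bytes and copy_bytes are hypothetical stand-ins so the sketch does not depend on <strings.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same contract as bzero (dst, n): the size is the second argument
   (index 1), whereas memset takes it third (index 2).  */
void
zero_bytes (void *dst, size_t n)
{
  assert (dst != NULL);
  memset (dst, 0, n);
}

/* Same contract as bcopy (src, dst, n): source first, destination
   second, and overlapping regions are allowed, hence memmove.  */
void
copy_bytes (const void *src, void *dst, size_t n)
{
  assert (src != NULL && dst != NULL);
  memmove (dst, src, n);
}
```

The differing position of the size argument is why the alias hunk further down tracks the index in a variable rather than hard-coding 2.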
PR tree-optimization/80934 - bzero should be assumed not to escape pointer argument
PR tree-optimization/80933 - redundant bzero/bcopy calls not eliminated

gcc/ChangeLog:

	PR tree-optimization/80933
	* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Handle bzero.
	(call_may_clobber_ref_p_1): Likewise.
	(stmt_kills_ref_p): Likewise.
	* tree-ssa-dse.c (initialize_ao_ref_for_dse): Handle bcopy and bzero.
	(decrement_count): Add an argument.
	(maybe_trim_memstar_call): Handle bcopy.
	(dse_dom_walker::dse_optimize_stmt): Likewise.
	* tree-ssa-sccvn.c (vn_reference_lookup_3): Handle bzero.
	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call): Likewise.
	(find_func_clobbers): Likewise.

gcc/testsuite/ChangeLog:

	PR tree-optimization/80933
	* gcc.dg/tree-ssa/ssa-dse-30.c: New test.
	* gcc.dg/tree-ssa/alias-36.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alias-36.c b/gcc/testsuite/gcc.dg/tree-ssa/alias-36.c
new file mode 100644
index 000..61b601a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/alias-36.c
@@ -0,0 +1,28 @@
+/* PR tree-optimization/80934 - bzero should be assumed not to escape
+   pointer argument
+   { dg-do compile }
+   { dg-options "-O2 -fdump-tree-alias" } */
+
+void foobar (void);
+
+void f (void);
+
+void g (void)
+{
+  char d[32];
+  __builtin_memset (d, 0, sizeof d);
+  f ();
+  if (*d != 0)
+foobar ();
+}
+
+void h (void)
+{
+  char d[32];
+  __builtin_bzero (d, sizeof d);
+  f ();
+  if (*d != 0)
+foobar ();
+}
+
+/* { dg-final { scan-tree-dump-not "memset|foobar|bzero" "alias" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-30.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-30.c
new file mode 100644
index 000..9d2c920
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-30.c
@@ -0,0 +1,30 @@
+/* PR tree-optimization/80933 - redundant bzero/bcopy calls not eliminated
+   { dg-do compile }
+   { dg-options "-O2 -fdump-tree-dse1" } */
+
+void sink (void*);
+
+void test_bcopy (const void *s)
+{
+  char d[33];
+
+  /* The bcopy calls are expanded inline in EVRP, before DSE runs,
+ so this test doesn't actually verify that DSE does its job.  */
+  __builtin_bcopy (s, d, sizeof d);
+  __builtin_bcopy (s, d, sizeof d);
+
+  sink (d);
+}
+
+void test_bzero (void)
+{
+  char d[33];
+
+  __builtin_bzero (d, sizeof d);
+  __builtin_bzero (d, sizeof d);
+
+  sink (d);
+}
+
+/* { dg-final { scan-tree-dump-times "builtin_bzero" 1 "dse1" } } */
+/* { dg-final { scan-tree-dump-not "builtin_bcopy" "dse1" } } */
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 74ee2b0..8c4b289f 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -1783,6 +1783,7 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *ref)
 	case BUILT_IN_ALLOCA_WITH_ALIGN:
 	case BUILT_IN_STACK_SAVE:
 	case BUILT_IN_STACK_RESTORE:
+	case BUILT_IN_BZERO:
 	case BUILT_IN_MEMSET:
 	case BUILT_IN_TM_MEMSET:
 	case BUILT_IN_MEMSET_CHK:
@@ -2023,6 +2024,9 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref)
 
   callee = gimple_call_fndecl (call);
 
+  /* The number of the size argument to one of the built-in functions
+ below.  */
+  unsigned sizargno = 2;
   /* Handle those builtin functions explicitly that do not act as
  escape points.  See tree-ssa-structalias.c:find_func_aliases
  for the list of builtins we might need to handle here.  */
@@ -2030,8 +2034,13 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref)
   && gimple_call_builtin_p (call, BUILT_IN_NORMAL))
 switch (DECL_FUNCTION_CODE (callee))
   {
-	/* All the following functions clobber memory pointed to by
-	   their first argument.  */
+case BUILT_IN_BZERO:
+	  sizargno = 1;
+	  /* Fall through.  */
+
+	  /* With the exception of the bzero function above, all of
+	 the following clobber memory pointed to by their first
+	 argument.  */
 	case BUILT_IN_STRCPY:
 	case BUILT_IN_STRNCPY:
 	case BUILT_IN_MEMCPY:
@@ -2062,9 +2071,9 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref)
 	   is strlen (dest) + n + 1 instead of n, resp.
 	   n + 1 at dest + strlen (dest), but strlen (dest) isn't
 	   known.  */
-	if (gimple_call_num_args (call) == 3
+	if (gimple_call_num_args (call) > sizargno
 		&& DECL_FUNCTION_CODE (callee) != BUILT_IN_STRNCAT)
-	  size = gimple_call_arg (call, 2);
+	  size = gimple_call_arg (call, sizargno);
 	ao_ref_init_from_ptr_and_size (&dref,
 	   gimple_c

Re: [PATCH, AArch64] Add x86 intrinsic headers to GCC AArch64 target

2017-06-01 Thread Joseph Myers

On Mon, 29 May 2017, Hurugalawadi, Naveen wrote:

> Hi,
> 
> Please find attached the patch that adds first set of X86 intrinsic
> headers to AArch64 target.
> The implementation is based on similar work targeted at PPC64LE.
> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00550.html
> 
> We are using the corresponding DejaGnu tests similar to Powerpc from 
> gcc/testsuite/gcc.target/i386/ to gcc/testsuite/gcc.target/aarch64 as the
> source remains same. Only modifications are target related as appropriate.

Where intrinsics can be implemented in plain GNU C without 
architecture-specific built-in functions being involved in the 
implementation, it would seem to me to be a bad idea to duplicate the 
implementation for more and more architectures.

Rather, it would seem better to refactor the powerpc implementation into a 
part that's shared by all architectures (maybe even by x86 as well) and 
possibly architecture-specific pieces (though if multiple architectures 
have architecture-specific built-in functions for the same thing, an 
architecture-independent built-in function might make sense).  For "all 
architectures" read "all little-endian architectures" for any intrinsics 
whose definition is problematic for big endian.  Likewise, tests should 
then be shared.

(I realise that there may be issues with type sizes as well, making some 
intrinsics problematic with e.g. 16-bit int.  But implementations shared 
for all architectures with 32-bit int and 64-bit long long, for example, 
would still make sense.)

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH v3, rs6000] gcc mainline, add builtin support for vec_doublee, vec_doubleo, vec_doublel builtins

2017-06-01 Thread Carl E. Love


GCC Maintainers:

This is version 3 of the patch to add support for the various
vec_doublee, vec_doubleo, vec_doublel, vec_doubleh built-ins. I have
addressed the formatting comments from Segher on version 2.  I have run
the patch through contrib/check_GNU_style.sh to check for obvious
formatting errors.  

The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE)
with no regressions.

Is the patch OK for gcc mainline?

  Carl Love


Add vec_doublee, vec_doubleo, vec_doublel, vec_doubleh built-ins

gcc/ChangeLog:

2017-06-01  Carl Love  

   * config/rs6000/rs6000-c: Add support for built-in functions
   vector double vec_doublee (vector signed int);
   vector double vec_doublee (vector unsigned int);
   vector double vec_doublee (vector float);
   vector double vec_doubleh (vector signed int);
   vector double vec_doubleh (vector unsigned int);
   vector double vec_doubleh (vector float);
   vector double vec_doublel (vector signed int);
   vector double vec_doublel (vector unsigned int);
   vector double vec_doublel (vector float);
   vector double vec_doubleo (vector signed int);
   vector double vec_doubleo (vector unsigned int);
   vector double vec_doubleo (vector float);.
   * config/rs6000/rs6000-builtin.def: Add definitions for DOUBLEE,
   DOUBLEO, DOUBLEH, DOUBLEL, UNS_DOUBLEO, UNS_DOUBLEE, UNS_DOUBLEH,
   UNS_DOUBLEL.
   * config/rs6000/altivec.md: Add code generator for doublee2,
   unsdoubleev4si2, doubleo2, unsdoubleov4si2, doubleh2,
   unsdoublehv4si2, doublel2, unsdoublelv4si2, add mode attribute
   VS_sxwsp.
   * config/rs6000/altivec.h: Add define for vec_doublee, vec_doubleo,
   vec_doublel, vec_doubleh.
   * doc/extend.texi: Update the built-in documentation file for the
   new built-in functions.

2017-06-01  Carl Love  

gcc/testsuite/ChangeLog:

   * gcc.target/powerpc/builtins-3-runnable.c: New file of runnable tests
   for the new built-ins.

Signed-off-by: Carl Love 
---
 gcc/config/rs6000/altivec.h|   4 +
 gcc/config/rs6000/altivec.md   | 337 +
 gcc/config/rs6000/rs6000-builtin.def   |  21 ++
 gcc/config/rs6000/rs6000-c.c   |  29 ++
 gcc/doc/extend.texi|  16 +
 .../gcc.target/powerpc/builtins-3-runnable.c   |  83 +
 6 files changed, 490 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index c92bcce..20050eb 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -128,6 +128,10 @@
 #define vec_ctu __builtin_vec_ctu
 #define vec_cpsgn __builtin_vec_copysign
 #define vec_double __builtin_vec_double
+#define vec_doublee __builtin_vec_doublee
+#define vec_doubleo __builtin_vec_doubleo
+#define vec_doublel __builtin_vec_doublel
+#define vec_doubleh __builtin_vec_doubleh
 #define vec_expte __builtin_vec_expte
 #define vec_floor __builtin_vec_floor
 #define vec_loge __builtin_vec_loge
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 649f181..af1fae3 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -203,6 +203,10 @@
   (KF "FLOAT128_VECTOR_P (KFmode)")
   (TF "FLOAT128_VECTOR_P (TFmode)")])
 
+;; Map the Vector convert single precision to double precision for integer
+;; versus floating point
+(define_mode_attr VS_sxwsp [(V4SI "sxw") (V4SF "sp")])
+
 ;; Specific iterator for parity which does not have a byte/half-word form, but
 ;; does have a quad word form
 (define_mode_iterator VParity [V4SI
@@ -2739,6 +2743,339 @@
   "stvewx %1,%y0"
   [(set_attr "type" "vecstore")])
 
+;; Generate doublee
+;; signed int/float to double convert words 0 and 2
+(define_expand "doublee2"
+  [(set (match_operand:V2DF 0 "register_operand" "=v")
+   (match_operand:VSX_W 1 "register_operand" "v"))]
+  "TARGET_VSX"
+{
+   machine_mode op_mode = GET_MODE (operands[1]);
+
+   if (VECTOR_ELT_ORDER_BIG)
+ {
+   /* Big endian word numbering for words in operand is 0 1 2 3.
+  Input words 0 and 2 are where they need to be.  */
+   emit_insn (gen_vsx_xvcvdp (operands[0], operands[1]));
+ }
+   else
+ {
+   /* Little endian word numbering for operand is 3 2 1 0.
+  take (operand[1] operand[1]) and shift left one word
+  3 2 1 0  3 2 1 0  =>  2 1 0 3
+  Input words 2 and 0 are now where they need to be for the
+  conversion.  */
+   rtx rtx_tmp;
+   rtx rtx_val = GEN_INT (1);
+
+   rtx_tmp = gen_reg_rtx (op_mode);
+   emit_insn (gen_vsx_xxsldwi_ (rtx_tmp, operands[1],
+  operands[1], rtx_val));
+   emit_insn (gen_vsx_xvcvdp (operands[0], rtx_tmp));
+ }
+   DONE;
+}
+  [(set_attr "type" "veccomplex")])
+
+;; Generate unsdoublee
+;; un

Re: [PING**2] [PATCH] Implement a warning for bogus sizeof(pointer) / sizeof(pointer[0])

2017-06-01 Thread Joseph Myers

The C changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)

2017-06-01 Thread Joseph Myers

On Thu, 1 Jun 2017, David Malcolm wrote:

> The patch appears to only consider "if" and "else" clauses.  Shouldn't
> it also cover "for", "while" and "do/while"?

do/while would normally get a syntax error in the problem cases.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: C/C++ PATCH to implement -Wmultiline-expansion (PR c/80116)

2017-06-01 Thread Martin Sebor


On 06/01/2017 10:45 AM, Marek Polacek wrote:

A motivating example for this warning can be found e.g. in

  PRE10-C. Wrap multistatement macros in a do-while loop
  https://www.securecoding.cert.org/confluence/x/jgL7

i.e.,

#define SWAP(x, y) \
  tmp = x; \
  x = y; \
  y = tmp

used like this [1]

int x, y, z, tmp;
if (z == 0)
  SWAP(x, y);

expands to the following [2], which is certainly not what the programmer 
intended:

int x, y, z, tmp;
if (z == 0)
  tmp = x;
x = y;
y = tmp;

This has also happened in our codebase, see PR80063.
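
The misexpansion described above can be reproduced with a small sketch. SWAP_UNSAFE mirrors the macro from the message; SWAP_SAFE and the two helper functions are hypothetical names added for illustration, with SWAP_SAFE using the do-while (0) wrapper that PRE10-C recommends.

```c
#include <assert.h>
#include <stddef.h>

/* Multi-statement macro as in the message above: only the first
   statement is governed by an unbraced "if".  */
#define SWAP_UNSAFE(x, y) \
  tmp = x;                \
  x = y;                  \
  y = tmp

/* Hypothetical fixed variant: the do-while (0) wrapper makes the
   whole body a single statement.  */
#define SWAP_SAFE(x, y) \
  do                    \
    {                   \
      tmp = (x);        \
      (x) = (y);        \
      (y) = tmp;        \
    }                   \
  while (0)

/* When z != 0 the assignment to tmp is skipped, but the other two
   assignments still run, so *x gets *y and *y gets the stale tmp.  */
void
swap_if_zero_unsafe (int *x, int *y, int z)
{
  int tmp = 0;
  assert (x != NULL && y != NULL);
  if (z == 0)
    SWAP_UNSAFE (*x, *y);
}

/* Swaps only when z == 0, as the programmer intended.  */
void
swap_if_zero_safe (int *x, int *y, int z)
{
  int tmp = 0;
  assert (x != NULL && y != NULL);
  if (z == 0)
    SWAP_SAFE (*x, *y);
}
```

Calling the unsafe helper with z != 0 and the values 1 and 2 leaves them as 2 and 0, while the safe helper leaves them untouched. A compiler that implements a warning of this kind may diagnose the unsafe helper.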

I tried to summarize the way I approached this problem in the commentary in
warn_for_multiline_expansion, but I'll try to explain the crux of the matter
here, too.

For code like [1], in the FEs we'll see [2], of course.  When parsing the
then-branch we see that the body of the if isn't wrapped in { } so we create a
compound statement with just the first statement "tmp = x;", and the other two
will be executed unconditionally.

My idea was to look at the location info of the following token after the body
of the if has been parsed and determine if they come from the same macro 
expansion,
and if they do (and the if itself doesn't), warn (taking into account various
corner cases, as usual).


Very nice.  I think David already suggested handling other statements
besides if (do/while), so let me just add for and switch (as in:
'switch (1) case SWAP (i, j);')

The location in the warning looks like it could be improved to extend
from just the first column to the whole macro argument but I don't
suppose that's under the direct control of your patch.

Besides the statements already mentioned above, here are a couple
of corner cases I noticed are not handled while playing with the
patch:

  #define M(x) x

  int f (int i)
  {
    if (i)
      M (--i; --i);   // can this be handled?

    return i;
  }

and

  #define M(x) x; x

  int f (int i)
  {
    if (i)
      M (--i; --i);   // seems like this should be handled

    return i;
  }

As an aside since it's outside the subset of the bigger problem
you chose to solve, there is a related issue with macros that
expand to an unparenthesized binary (and even some unary)
expression:

  #define sum(x, y) x + y

  int n = 2 * sum (3, 5);

I'm not very familiar with this area of the parser but I would
expect it to be relatively straightforward to extend your solution
to handle this problem as well.
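
The precedence problem with the unparenthesized macro can be shown with a short sketch. SUM_BAD corresponds to the sum macro above; SUM_OK and the helper functions are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* As in the message: the expansion is spliced into the surrounding
   expression without parentheses.  */
#define SUM_BAD(x, y) x + y

/* Conventional fix: parenthesize each argument and the whole body.  */
#define SUM_OK(x, y) ((x) + (y))

/* Expands to 2 * 3 + 5, so the multiplication binds first.  */
int
twice_sum_bad (void)
{
  return 2 * SUM_BAD (3, 5);
}

/* Expands to 2 * ((3) + (5)).  */
int
twice_sum_ok (void)
{
  return 2 * SUM_OK (3, 5);
}

/* Documents the two results side by side.  */
void
check_sum_macros (void)
{
  assert (twice_sum_bad () == 11);
  assert (twice_sum_ok () == 16);
}
```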



For this I had to dive into line_maps, macro maps, etc., so CCing David to check
if my understanding of that is reasonable (hadn't worked with them before).

I've included this warning in -Wall, because there should be no false positives
(fingers crossed) and for most cases the warning should be pretty cheap.

I probably should've added a fix-it hint for good measure, too ("you better wrap
the damn macro in do {} while (0)"), but that can be done as a follow-up.


A hint I'm sure would be helpful to a lot of users.  One caveat
to be aware of is that wrapping an expression in a 'do { } while
(0)' is not a viable solution when the value of the last statement
is used.  In those cases, using the comma expression instead (in
parentheses) is often the way to go.  I'd expect determining which
to offer to be less than trivial.

Martin
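
A minimal sketch of the caveat above about fix-it hints, assuming two hypothetical macros: BUMP_STMT is statement-like and can be wrapped in do-while (0), while BUMP_AND_GET must yield a value, so it uses a parenthesized comma expression instead.

```c
#include <assert.h>
#include <stddef.h>

/* Statement-like macro: safe after an unbraced "if", but it has no
   value, so "int r = BUMP_STMT (c);" would not compile.  */
#define BUMP_STMT(counter) \
  do                       \
    {                      \
      ++(counter);         \
    }                      \
  while (0)

/* Expression-like macro: the comma operator evaluates the increment,
   then yields VALUE, so the macro can appear inside an expression.  */
#define BUMP_AND_GET(counter, value) (++(counter), (value))

/* Uses the macro's value in a larger expression.  */
int
use_value (int *calls, int v)
{
  assert (calls != NULL);
  return 10 + BUMP_AND_GET (*calls, v);
}

/* Uses the statement-like form where no value is needed.  */
void
bump_if (int *calls, int cond)
{
  assert (calls != NULL);
  if (cond)
    BUMP_STMT (*calls);
}
```

Which form a fix-it should suggest depends on whether the expansion's value is consumed, which is the part described above as less than trivial to determine.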

Re: [PATCH v3, rs6000] gcc mainline, add builtin support for vec_doublee, vec_doubleo, vec_doublel builtins

2017-06-01 Thread Segher Boessenkool

Hi Carl,

On Thu, Jun 01, 2017 at 02:55:45PM -0700, Carl E. Love wrote:
> Add vec_doublee, vec_doubleo, vec_doublel, vec_doubleh built-ins
> 
> gcc/ChangeLog:
> 
> 2017-06-01  Carl Love  
> 
>* config/rs6000/rs6000-c: Add support for built-in functions
>vector double vec_doublee (vector signed int);
>vector double vec_doublee (vector unsigned int);
>vector double vec_doublee (vector float);
>vector double vec_doubleh (vector signed int);
>vector double vec_doubleh (vector unsigned int);
>vector double vec_doubleh (vector float);
>vector double vec_doublel (vector signed int);
>vector double vec_doublel (vector unsigned int);
>vector double vec_doublel (vector float);
>vector double vec_doubleo (vector signed int);
>vector double vec_doubleo (vector unsigned int);
>vector double vec_doubleo (vector float);.
>* config/rs6000/rs6000-builtin.def: Add definitions for DOUBLEE,
>DOUBLEO, DOUBLEH, DOUBLEL, UNS_DOUBLEO, UNS_DOUBLEE, UNS_DOUBLEH,
>UNS_DOUBLEL.
>* config/rs6000/altivec.md: Add code generator for doublee2,
>unsdoubleev4si2, doubleo2, unsdoubleov4si2, doubleh2,
>unsdoublehv4si2, doublel2, unsdoublelv4si2, add mode attribute
>VS_sxwsp.
>* config/rs6000/altivec.h: Add define for vec_doublee, vec_doubleo,
>vec_doublel, vec_doubleh.
>* doc/extend.texi: Update the built-in documentation file for the
>new built-in functions.
> 
> 2017-06-01  Carl Love  
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.target/powerpc/builtins-3-runnable.c: New file of runnable tests
>for the new built-ins.
> 
> Signed-off-by: Carl Love 

We don't do signoffs in GCC, fwiw.

> +(define_expand "doublee2"
> +  [(set (match_operand:V2DF 0 "register_operand" "=v")
> + (match_operand:VSX_W 1 "register_operand" "v"))]
> +  "TARGET_VSX"
> +{
> +   machine_mode op_mode = GET_MODE (operands[1]);

You indent with three spaces here, instead of two.

> +
> +   if (VECTOR_ELT_ORDER_BIG)
> + {

Here you do two, okay.

> + /* Big endian word numbering for words in operand is 0 1 2 3.
> +Input words 0 and 2 are where they need to be.  */
> + emit_insn (gen_vsx_xvcvdp (operands[0], operands[1]));
> + }
> +   else
> + {
> + /* Little endian word numbering for operand is 3 2 1 0.

But here you do three again.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> @@ -0,0 +1,83 @@
> +/* { dg-do run { target { powerpc*-*-linux* } } } */
> +/* { dg-require-effective-target vsx_hw } */
> +/* { dg-options "-O2 -mvsx -mcpu=power8" } */

This will then also need something like

/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */

Okay for trunk with that addition, and the final whitespace gotchas fixed.
Thanks,


Segher

[PATCH][Aarch64] Add vectorized mersenne twister

2017-06-01 Thread Michael Collison

This patch adds a vectorized implementation of the Mersenne Twister random 
number generator. This implementation is approximately 2.6 times faster than 
the non-vectorized implementation.

This implementation includes "arm_neon.h" when including the optimized 
<ext/random>.  This has the effect of polluting the global namespace with the 
Neon intrinsics, so user macros and functions could potentially clash with 
them.  Is this acceptable given this only happens when <ext/random> is 
explicitly included? Comments and input are welcome.

Sample code to use the new generator would look like this:

#include <ext/random>
#include <iostream>
#include <random>

int
main()
{
  __gnu_cxx::sfmt19937 mt(1729);

  std::uniform_int_distribution<int> dist(0,1008);

  for (int i = 0; i < 16; ++i)
    {
      std::cout << dist(mt) << " ";
    }
}



2017-06-01  Michael Collison  

Add optimized implementation of mersenne twister for aarch64
* config/cpu/aarch64/opt/ext/opt_random.h: New file.
(__arch64_recursion): new function.
(operator==): New function.
(simd_fast_mersenne_twister_engine): New template class.
* config/cpu/aarch64/opt/bits/opt_random.h: New file.
* include/ext/random (add include for arm_neon.h):
(simd_fast_mersenne_twister_engine): add _M_state private
array for ARM_NEON conditional compilation.



gnutools-4218-v10.patch
Description: gnutools-4218-v10.patch

[PATCH][Aarch64] Relational compare zero not merged into subtract

2017-06-01 Thread Michael Collison

This patch improves code generation for relational compares against zero that 
are not merged into a subtract instruction. This patch improves the >= and < 
cases.

An example of the '<' case:

int lt (int x, int y)
{
  if ((x - y) < 0)
    return 10;

  return 0;
}

Trunk generates:

lt:
        sub     w1, w0, w1
        mov     w0, 10
        cmp     w1, 0
        csel    w0, w0, wzr, lt
        ret

With the patch we can eliminate the redundant subtract and now generate:

lt:
        cmp     w0, w1
        mov     w0, 10
        csel    w0, w0, wzr, mi
        ret

Bootstrapped and tested on aarch64-linux-gnu. Okay for trunk?
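
One way to read the change of condition from lt to mi, offered as a sketch rather than as the patch's stated rationale: after "cmp w0, w1" the mi condition tests the N flag, which is the sign bit of the wrapped 32-bit difference, while lt also takes the overflow flag into account. The two helpers below are hypothetical and model the flags in portable C, assuming a 32-bit int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Models the N flag after "cmp x, y": the sign bit of x - y computed
   with 32-bit wraparound (unsigned arithmetic, so no undefined
   behavior).  This is what the "mi" condition tests.  */
int
wrapped_sub_is_negative (int x, int y)
{
  uint32_t diff = (uint32_t) x - (uint32_t) y;
  return (diff >> 31) != 0;
}

/* Mathematical comparison, which is what "lt" tests after the same
   cmp (N != V).  */
int
less_than (int x, int y)
{
  return x < y;
}
```

The two helpers agree whenever x - y does not overflow, which is the only case where the C expression (x - y) < 0 is defined. They differ for inputs such as INT_MIN and 1, where the wrapped difference is positive although INT_MIN < 1.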

2017-06-01  Michael Collison  

* config/aarch64/aarch64-simd.md(aarch64_sub_compare0):
New pattern.
* testsuite/gcc.target/aarch64/cmp-2.c: New testcase.


pr7261.patch.patch
Description: pr7261.patch.patch

Re: [PATCH] use the right conversion warning option (PR c/80892)

2017-06-01 Thread Eric Gallager

I tested this patch; it fixes the warnings that caused me to open the
bug in the first place. Thank you!

Eric

On 5/30/17, Martin Sebor  wrote:
> The conversion enhancements I committed in r248431 introduced
> an unintended change in which warning option is used to issue
> certain integer conversion warnings.  Attached is a fix.
>
> Martin
>
