Re: [PATCH 13/25] Create TARGET_DISABLE_CURRENT_VECTOR_SIZE

2018-10-01 Thread Richard Biener
On Fri, Sep 28, 2018 at 2:47 PM Andrew Stubbs  wrote:
>
> On 19/09/18 14:45, Richard Biener wrote:
> > So I guess the current_vector_size thing isn't too hard to get rid of, what
> > you'd end up with would be using that size when you decide for vector
> > types for loads (where there are no USEs with vector types, so for example
> > this would not apply to gathers).
>
> I've finally got back to looking at this ...
>
> My patch works because current_vector_size is only referenced in two
> places. One is passed to get_vectype_for_scalar_type_and_size, and that
> function simply calls targetm.vectorize.preferred_simd_mode when the
> requested size is zero. The other is passed to build_truth_vector_type,
> which only uses it to call targetm.vectorize.get_mask_mode, and the GCN
> backend ignores the size parameter because it only has one option.
> Presumably other backends would object to a zero size mask.
>
> So, as I said originally, the effect is that leaving current_vector_size
> zeroed means "always ask the backend".

Yes.
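
As a minimal sketch of the "zero means always ask the backend" behaviour agreed on above, the following toy C model may help. The names pick_vector_size, preferred_size_fn and toy_gcn_preferred are hypothetical stand-ins (the callback loosely plays the role of targetm.vectorize.preferred_simd_mode), and the 256-byte value is only illustrative; none of this is GCC code.

```c
#include <assert.h>

/* Hypothetical stand-in for a backend query; returns the preferred
   vector size in bytes.  This is not a GCC interface.  */
typedef unsigned (*preferred_size_fn) (void);

/* A requested size of zero defers to the backend; any other value is
   used as given.  */
unsigned
pick_vector_size (unsigned requested_size, preferred_size_fn backend_preferred)
{
  if (requested_size == 0)
    return backend_preferred ();
  return requested_size;
}

/* Illustrative backend with a single vector size (64 lanes of 4 bytes).  */
unsigned
toy_gcn_preferred (void)
{
  return 256;
}
```

With this model, pick_vector_size (0, toy_gcn_preferred) defers to the backend, while a non-zero request such as 16 is passed through unchanged.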

> Pretty much everything else chains off of those places using
> get_same_sized_vectype, so ignoring current_vector_size is safe on GCN,
> and might even be safe on other architectures?

Other architectures really only use it when there's a choice, like
choosing between V4SI, V8SI and V16SI on x86_64.  current_vector_size
was introduced to be able to "iterate" over supported ISAs and let the
vectorizer decide which one to use in the end (SSE vs. AVX vs. AVX512).

The value of zero is simply to give the target another chance to set
its preferred value based on the first call.  I'd call that a bit awkward (*)

For architectures that only have a single "vector size" this variable
is really spurious and whether it is zero or non-zero doesn't make a difference.
Apart from your architecture of course where non-zero doesn't work ;)

(*) so one possibility would be to forgo the special value of zero
("auto-detect") and thus not change current_vector_size in
get_vectype_for_scalar_type at all.  For targets which report multiple
vector size support, set current_vector_size to the preferred one in the
loop over vector sizes, and for targets that do not, simply keep it at zero.

> > So I'd say you want to refactor get_same_sized_vectype uses and
> > make the size argument to get_vectype_for_scalar_type_and_size
> > a hint only.
>
> I've looked through the uses of get_same_sized_vectype and I've come to
> the conclusion that many of them really mean it.
>
> For example, vectorizable_bswap tries to reinterpret a vector register
> as a byte vector so that it can permute it. This is an optimization that
> won't work on GCN (because the vector registers don't work like that),
> but seems like a valid use of the vector size characteristic of other
> architectures.

True.

> For another example, vectorizable_conversion is targeting the
> vec_pack_trunc patterns, and therefore really does want to specify the
> types. Again, this isn't something we want to do on GCN (a regular trunc
> pattern with a vector mode will work fine).
>
> However, vectorizable_operation seems to use it to try to match the
> input and output types to the same vector unit (i.e. vector size); at
> least that's my interpretation. It returns "not vectorizable" if the
> input and output vectors have different numbers of elements. For most
> operators the lhs and rhs types will be the same, so we're all good, but
> I imagine that this code will prevent TRUNC being vectorized on GCN
> because the "same size" vector doesn't exist, and it doesn't check if
> there's a vector with the same number of elements (I've not actually
> tried that, yet, and there may be extra magic elsewhere for that case,
> but YSWIM).

Yeah, we don't have a get_vector_type_for_scalar_type_and_nelems
which would probably be semantically better in many places.
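
To illustrate the difference between a "same size" and a "same number of elements" lookup, here is a rough C sketch. The struct vec_shape type and both functions are hypothetical simplifications (element size and count only), not GCC's tree types or any existing vectorizer interface.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a vector type: element size in bytes and
   element count.  Not GCC's tree representation.  */
struct vec_shape
{
  size_t elem_bytes;
  size_t nelems;
};

/* "Same size" rule: keep the total byte size and let the element count
   change.  nelems is 0 if the size does not divide evenly.  */
struct vec_shape
same_sized_shape (struct vec_shape in, size_t out_elem_bytes)
{
  size_t total = in.elem_bytes * in.nelems;
  struct vec_shape out = { out_elem_bytes, 0 };
  if (out_elem_bytes != 0 && total % out_elem_bytes == 0)
    out.nelems = total / out_elem_bytes;
  return out;
}

/* "Same number of elements" rule: keep the element count and let the
   total byte size change.  */
struct vec_shape
same_nelems_shape (struct vec_shape in, size_t out_elem_bytes)
{
  struct vec_shape out = { out_elem_bytes, in.nelems };
  return out;
}
```

For a 64-lane vector of 4-byte elements converted to 2-byte elements, the first rule asks for 128 lanes while the second keeps 64, which is the distinction a fixed-lane-count target would care about.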

> I don't think changing this case to a new "get_same_length_vectype"
> would be appropriate for many architectures, so I'm not sure what to do
> here?
>
> We could fix this with new target hooks, perhaps?
>
> TARGET_VECTORIZE_REINTERPRET_VECTOR (vectype_in, scalartype_out)
>
>    Returns a new vectype (or mode) that uses the same vector register as
>    vectype_in, but has elements of scalartype_out.
>
>    The default implementation would be get_same_sized_vectype.
>
>    GCN would just return NULL, because you can't do that kind of
>    optimization.
>
> TARGET_VECTORIZE_COMPATIBLE_VECTOR (opcode, vectype_in, scalartype_out)
>
>    Returns a new vectype (or mode) that has the right number of elements
>    for the opcode (i.e. the same number, or 2x for packed opcodes), and
>    elements of scalartype_out.  The backend might choose a different
>    vector size, but promises that hardware can do the operation (i.e.
>    it's not mixing vector units).
>
>    The default implementation would be get_same_sized_vectype, for
>    backward compatibility.
>
>    GCN would simply return V64xx according to scalartype

Re: [PATCH][IRA,LRA] Fix PR87466, all pseudos live across setjmp are spilled

2018-10-01 Thread Richard Biener
On Sat, Sep 29, 2018 at 5:12 AM Peter Bergner  wrote:
>
> Currently, both IRA and LRA spill all pseudo regs that are live across a
> setjmp call.  If the target has a sane setjmp, then the compiler should not
> have to treat the setjmp call any differently than it does any other normal
> function call.  Namely, just mark all pseudos that are live across the setjmp
> as conflicting with the volatile registers.
>
> This issue was discussed in the following gcc mailing list thread:
>
>   https://gcc.gnu.org/ml/gcc/2018-03/msg00014.html
>
> ...and some people mentioned that some systems do not have sane setjmp
> implementations and therefore need to spill all pseudos live across setjmps
> to get correct functionality.  It was decided in the thread above that we
> should create a target hook that can allow targets to tell IRA and LRA
> whether or not they have a sane setjmp implementation.  The following patch
> implements that idea along with converting the rs6000 port to use the hook.
>
> This patch passed bootstrap and regtesting on powerpc64le-linux with
> no regressions.  Ok for trunk?

LGTM.  Please leave others the opportunity to comment.

Thanks,
Richard.

> Peter
>
>
> gcc/
> PR rtl-optimization/87466
> * target.def (is_reg_clobbering_setjmp_p): New target hook.
> * doc/tm.texi.in (TARGET_IS_REG_CLOBBERING_SETJMP_P): New hook.
> * doc/tm.texi: Regenerate.
> * targhooks.c (default_is_reg_clobbering_setjmp_p): New function.
> * targhooks.h (default_is_reg_clobbering_setjmp_p): Declare.
> * ira-lives.c (process_bb_node_lives): Use the new target hook.
> * lra-lives.c (process_bb_lives): Likewise.
> * config/rs6000/rs6000.c (TARGET_IS_REG_CLOBBERING_SETJMP_P): Define.
> (rs6000_is_reg_clobbering_setjmp_p): New function.
>
> gcc/testsuite/
> PR rtl-optimization/87466
> * gcc.target/powerpc/pr87466.c: New test.
>
> Index: gcc/target.def
> ===
> --- gcc/target.def  (revision 264698)
> +++ gcc/target.def  (working copy)
> @@ -3123,6 +3123,20 @@ In order to enforce the representation o
>   int, (scalar_int_mode mode, scalar_int_mode rep_mode),
>   default_mode_rep_extended)
>
> + DEFHOOK
> +(is_reg_clobbering_setjmp_p,
> + "On some targets, it is assumed that the compiler will spill all registers\n\
> +  that are live across a call to @code{setjmp}, while other targets treat\n\
> +  @code{setjmp} calls as normal function calls.\n\
> +  \n\
> +  This hook returns true if @var{insn} is a @code{setjmp} call that must\n\
> +  have all registers that are live across it spilled.  Define this to return\n\
> +  false if the target does not need to spill all registers across calls to\n\
> +  @code{setjmp} calls.  The default implementation conservatively assumes all\n\
> +  registers must be spilled across @code{setjmp} calls.",
> +bool, (const rtx_insn *insn),
> +default_is_reg_clobbering_setjmp_p)
> +
>  /* True if MODE is valid for a pointer in __attribute__((mode("MODE"))).  */
>  DEFHOOK
>  (valid_pointer_mode,
> Index: gcc/doc/tm.texi.in
> ===
> --- gcc/doc/tm.texi.in  (revision 264698)
> +++ gcc/doc/tm.texi.in  (working copy)
> @@ -7507,6 +7507,8 @@ You need not define this macro if it wou
>
>  @hook TARGET_MODE_REP_EXTENDED
>
> +@hook TARGET_IS_REG_CLOBBERING_SETJMP_P
> +
>  @defmac STORE_FLAG_VALUE
>  A C expression describing the value returned by a comparison operator
>  with an integral mode and stored by a store-flag instruction
> Index: gcc/doc/tm.texi
> ===
> --- gcc/doc/tm.texi (revision 264698)
> +++ gcc/doc/tm.texi (working copy)
> @@ -11000,6 +11000,18 @@ In order to enforce the representation o
>  @code{mode}.
>  @end deftypefn
>
> +@deftypefn {Target Hook} bool TARGET_IS_REG_CLOBBERING_SETJMP_P (const rtx_insn *@var{insn})
> +On some targets, it is assumed that the compiler will spill all registers
> +  that are live across a call to @code{setjmp}, while other targets treat
> +  @code{setjmp} calls as normal function calls.
> +
> +  This hook returns true if @var{insn} is a @code{setjmp} call that must
> +  have all registers that are live across it spilled.  Define this to return
> +  false if the target does not need to spill all registers across calls to
> +  @code{setjmp} calls.  The default implementation conservatively assumes all
> +  registers must be spilled across @code{setjmp} calls.
> +@end deftypefn
> +
>  @defmac STORE_FLAG_VALUE
>  A C expression describing the value returned by a comparison operator
>  with an integral mode and stored by a store-flag instruction
> Index: gcc/targhooks.c
> ===
> --- gcc/targhooks.c (revision 264698)
> +++ gcc/targhooks.c (working copy)
> @@ -209,6 +209,15 @@ default_bui

Re: [patch] Fix PR tree-optimization/86659

2018-10-01 Thread Richard Biener
On Fri, Sep 28, 2018 at 7:01 PM Eric Botcazou  wrote:
>
> Hi,
>
> this is a regression introduced by the canonicalization of BIT_FIELD_REF in
> match.pd, which totally disregards the REF_REVERSE_STORAGE_ORDER flag, and
> visible as the failure of gnat.dg/sso/q[23].adb on SPARC 64-bit.  But the
> underlying issue of the missing propagation of the flag during GIMPLE folding
> has probably been latent for quite some time on all active branches.
>
> Tested on x86-64/Linux and SPARC/Solaris, OK for mainline?  And branches?

@@ -853,7 +857,10 @@ gimple_simplify (gimple *stmt, gimple_ma
op0 = do_valueize (op0, top_valueize, valueized);
res_op->set_op (code, type, op0,
TREE_OPERAND (rhs1, 1),
-   TREE_OPERAND (rhs1, 2));
+   TREE_OPERAND (rhs1, 2),
+   REF_REVERSE_STORAGE_ORDER (rhs1));
+   if (res_op->reverse)
+ return valueized;
return (gimple_resimplify3 (seq, res_op, valueize)
|| valueized);
  }

so the fix is to simply not optimize here?  Are there correctness issues
with the patterns we have for rev-storage?  But then some cases are let through
via the realpart/imagpart/v_c_e case?  I suppose we should never see
REF_REVERSE_STORAGE_ORDER on refs operating on registers (SSA_NAMEs
or even is_gimple_reg()s)?

Note that I think you need to adjust the GENERIC side as well, for example:

static tree
generic_simplify_BIT_FIELD_REF (location_t ARG_UNUSED (loc), enum
tree_code ARG_UNUSED (code), const tree ARG_UNUSED (type), tree op0,
tree op1, tree op2)
{
...
case VIEW_CONVERT_EXPR:
  {
tree o20 = TREE_OPERAND (op0, 0);
{
/* #line 4690 "/tmp/trunk2/gcc/match.pd" */
  tree captures[3] ATTRIBUTE_UNUSED = { o20, op1, op2 };
  if (__builtin_expect (dump_file && (dump_flags &
TDF_FOLDING), 0)) fprintf (dump_file, "Applying pattern %s:%d,
%s:%d\n", "match.pd", 4690, __FILE__, __LINE__);
  tree res_op0;
  res_op0 = captures[0];
  tree res_op1;
  res_op1 = captures[1];
  tree res_op2;
  res_op2 = captures[2];
  tree res;
  res = fold_build3_loc (loc, BIT_FIELD_REF, type, res_op0,
res_op1, res_op2);

where we lose the reverse-storage attribute as well.  You'd probably
have to cut out rev-storage refs somewhere in genmatch.c.

Richard.

>
> 2018-09-28  Eric Botcazou  
>
> PR tree-optimization/86659
> * gimple-match.h (struct gimple_match_op): Add reverse field.
> (gimple_match_op::set_op): New overloaded method.
> * gimple-match-head.c (maybe_build_generic_op) : Set
> the REF_REVERSE_STORAGE_ORDER flag on the value.
> (gimple_simplify) : For BIT_FIELD_REF, propagate the
> REF_REVERSE_STORAGE_ORDER flag and avoid simplifying if it is set.
>
> --
> Eric Botcazou


Re: No a*x+b*x factorization for signed vectors

2018-10-01 Thread Richard Biener
On Sat, Sep 29, 2018 at 1:06 PM Marc Glisse  wrote:
>
> Hello,
>
> this is a simple patch to remove the wrong-code part of PR 87319. I didn't
> spend much time polishing that code, since it is meant to disappear
> anyway.
>
> We could probably remove the inner == inner2 test in
> signed_or_unsigned_type_for, I hadn't noticed when copy-pasting the code.
>
> bootstrap+regtest on powerpc64le-unknown-linux-gnu.

OK.

Thanks,
Richard.

> 2018-09-30  Marc Glisse  
>
> PR middle-end/87319
> * fold-const.c (fold_plusminus_mult_expr): Handle complex and vectors.
> * tree.c (signed_or_unsigned_type_for): Handle complex.
>
> --
> Marc Glisse


Re: ((X /[ex] A) +- B) * A --> X +- A * B

2018-10-01 Thread Richard Biener
On Sat, Sep 29, 2018 at 1:35 PM Marc Glisse  wrote:
>
> Hello,
>
> I noticed quite ugly code from both testcases. This transformation does
> not fix either, but it helps a bit.

I'm curious why you chose to restrict to INTEGER_CST A and B?
Is that because of the case when (X / [ex] A) +- B evaluates to zero
but A * B overflows?  Can that ever happen?  Isn't it enough to know
that A isn't -1?  That is, can we use expr_not_equal_to or friends
to put constraints on possibly non-constant A/B?
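
For reference, the algebraic identity behind the transformation can be checked on small values with a sketch like the one below. The function names are made up for illustration, and the ranges are deliberately small so that nothing overflows; the sketch therefore says nothing about the overflow question raised above.

```c
#include <assert.h>

/* Original form: ((x /[ex] a) + b) * a, where x is an exact multiple of a.  */
long
lhs_form (long x, long a, long b)
{
  return (x / a + b) * a;
}

/* Transformed form: x + a * b.  */
long
rhs_form (long x, long a, long b)
{
  return x + a * b;
}

/* Returns 1 if both forms agree for every exact multiple x = q * a in a
   small range of q, a and b (a != 0), and 0 otherwise.  */
int
forms_agree_on_small_range (void)
{
  for (long a = -8; a <= 8; a++)
    {
      if (a == 0)
        continue;
      for (long q = -8; q <= 8; q++)
        for (long b = -8; b <= 8; b++)
          {
            long x = q * a;
            if (lhs_form (x, a, b) != rhs_form (x, a, b))
              return 0;
          }
    }
  return 1;
}
```

The same check works with the minus variant by negating b.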

Otherwise the patch is of course OK and the above would just improve
it.

Thanks,
Richard.

> bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>
> 2018-09-30  Marc Glisse  
>
> gcc/
> * match.pd (((X /[ex] A) +- B) * A): New transformation.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/muldiv-1.c: New file.
> * gcc.dg/tree-ssa/muldiv-2.c: Likewise.
>
> --
> Marc Glisse


Re: Fold more boolean expressions

2018-10-01 Thread Richard Biener
On Sun, Sep 30, 2018 at 5:11 PM MCC CS  wrote:
>
>
> Now that it has got enough reviews and there have
> been no comments for a week, I believe
> it's time for us to install it on trunk.
> The patch is the same as the previous one, but rebased
> on current trunk.
>
> Could you please push it for me? If there's anything
> I can do to help, just tell me.

I'll push it after a round of testing on my side.

Thanks and sorry for the delay,
Richard.

> 2018-09-30 MCC CS 
>
> gcc/
> PR tree-optimization/87261
> * match.pd: Add boolean optimizations,
> fix whitespace.
>
> 2018-09-30 MCC CS 
>
> gcc/testsuite/
> PR tree-optimization/87261
> * gcc.dg/pr87261.c: New test.
>
> Index: gcc/match.pd
> ===
> --- gcc/match.pd(revision 264725)
> +++ gcc/match.pd(working copy)
> @@ -92,7 +92,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>IFN_FMA IFN_FMS IFN_FNMA IFN_FNMS)
>  (define_operator_list COND_TERNARY
>IFN_COND_FMA IFN_COND_FMS IFN_COND_FNMA IFN_COND_FNMS)
> -
> +
>  /* As opposed to convert?, this still creates a single pattern, so
> it is not a suitable replacement for convert? in all cases.  */
>  (match (nop_convert @0)
> @@ -106,7 +106,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>&& tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE 
> (@0))
>  /* This one has to be last, or it shadows the others.  */
>  (match (nop_convert @0)
> - @0)
> + @0)
>
>  /* Transform likes of (char) ABS_EXPR <(int) x> into (char) ABSU_EXPR 
> ABSU_EXPR returns unsigned absolute value of the operand and the operand
> @@ -285,7 +285,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   And not for _Fract types where we can't build 1.  */
>(if (!integer_zerop (@0) && !ALL_FRACT_MODE_P (TYPE_MODE (type)))
> { build_one_cst (type); }))
> - /* X / abs (X) is X < 0 ? -1 : 1.  */
> + /* X / abs (X) is X < 0 ? -1 : 1.  */
>   (simplify
> (div:C @0 (abs @0))
> (if (INTEGRAL_TYPE_P (type)
> @@ -929,6 +929,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(bitop:c @0 (bit_not (bitop:cs @0 @1)))
>(bitop @0 (bit_not @1
>
> +/* (~x & y) | ~(x | y) -> ~x */
> +(simplify
> + (bit_ior:c (bit_and:c (bit_not@2 @0) @1) (bit_not (bit_ior:c @0 @1)))
> + @2)
> +
> +/* (x | y) ^ (x | ~y) -> ~x */
> +(simplify
> + (bit_xor:c (bit_ior:c @0 @1) (bit_ior:c @0 (bit_not @1)))
> + (bit_not @0))
> +
> +/* (x & y) | ~(x | y) -> ~(x ^ y) */
> +(simplify
> + (bit_ior:c (bit_and:cs @0 @1) (bit_not:s (bit_ior:s @0 @1)))
> + (bit_not (bit_xor @0 @1)))
> +
> +/* (~x | y) ^ (x ^ y) -> x | ~y */
> +(simplify
> + (bit_xor:c (bit_ior:cs (bit_not @0) @1) (bit_xor:s @0 @1))
> + (bit_ior @0 (bit_not @1)))
> +
> +/* (x ^ y) | ~(x | y) -> ~(x & y) */
> +(simplify
> + (bit_ior:c (bit_xor:cs @0 @1) (bit_not:s (bit_ior:s @0 @1)))
> + (bit_not (bit_and @0 @1)))
> +
>  /* (x | y) & ~x -> y & ~x */
>  /* (x & y) | ~x -> y | ~x */
>  (for bitop (bit_and bit_ior)
> @@ -1139,7 +1164,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(if (tree_nop_conversion_p (type, TREE_TYPE (@0))
> && tree_nop_conversion_p (type, TREE_TYPE (@1)))
> (mult (convert @0) (convert (negate @1)
> -
> +
>  /* -(A + B) -> (-B) - A.  */
>  (simplify
>   (negate (plus:c @0 negate_expr_p@1))
> @@ -3099,7 +3124,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (if (tree_int_cst_sgn (@1) < 0)
>   (scmp @0 @2)
>   (cmp @0 @2))
> -
> +
>  /* Simplify comparison of something with itself.  For IEEE
> floating-point, we can only do some of these simplifications.  */
>  (for cmp (eq ge le)
> @@ -3170,11 +3195,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  }
>tree newtype
>  = (TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (type1)
> -  ? TREE_TYPE (@0) : type1);
> +  ? TREE_TYPE (@0) : type1);
>  }
>  (if (TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (newtype))
>   (cmp (convert:newtype @0) (convert:newtype @1))
> -
> +
>   (simplify
>(cmp @0 REAL_CST@1)
>/* IEEE doesn't distinguish +0 and -0 in comparisons.  */
> @@ -3422,7 +3447,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (FTYPE) N == CST -> 0
> (FTYPE) N != CST -> 1.  */
> (if (cmp == EQ_EXPR || cmp == NE_EXPR)
> -{ constant_boolean_node (cmp == NE_EXPR, type); })
> +{ constant_boolean_node (cmp == NE_EXPR, type); })
> /* Otherwise replace with sensible integer constant.  */
> (with
>  {
> @@ -3666,7 +3691,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (simplify
>(cmp (bit_and@2 @0 integer_pow2p@1) @1)
>(icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))
> -
> +
>  /* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
> convert this into a shift followed by ANDing with D.  */
>  (simplify
> @@ -3886,7 +3911,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (if (cmp == LE_EXPR)
>  (ge (convert:st @0) { build_zero_cst (st); })
>  (lt (convert:st @0) 
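
The five new bitwise identities added to match.pd in the patch above can be checked exhaustively on 8-bit values with a small C sketch. The function name is hypothetical and the check is independent of GCC; it only confirms that each left-hand side equals its right-hand side bit for bit.

```c
#include <assert.h>

/* Returns 1 if all five identities hold for every pair of 8-bit values,
   and 0 as soon as one of them fails.  */
int
boolean_identities_hold (void)
{
  const unsigned m = 0xffu;	/* Compare only the low 8 bits.  */
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        /* (~x & y) | ~(x | y) -> ~x */
        if ((((~x & y) | ~(x | y)) & m) != (~x & m))
          return 0;
        /* (x | y) ^ (x | ~y) -> ~x */
        if ((((x | y) ^ (x | ~y)) & m) != (~x & m))
          return 0;
        /* (x & y) | ~(x | y) -> ~(x ^ y) */
        if ((((x & y) | ~(x | y)) & m) != (~(x ^ y) & m))
          return 0;
        /* (~x | y) ^ (x ^ y) -> x | ~y */
        if ((((~x | y) ^ (x ^ y)) & m) != ((x | ~y) & m))
          return 0;
        /* (x ^ y) | ~(x | y) -> ~(x & y) */
        if ((((x ^ y) | ~(x | y)) & m) != (~(x & y) & m))
          return 0;
      }
  return 1;
}
```

Since the operations are purely bitwise, agreement on all 8-bit pairs implies agreement at any width.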

Re: [PATCH][IRA,LRA] Fix PR87466, all pseudos live across setjmp are spilled

2018-10-01 Thread Segher Boessenkool
Hi Peter,

On Fri, Sep 28, 2018 at 10:12:02PM -0500, Peter Bergner wrote:
> Currently, both IRA and LRA spill all pseudo regs that are live across a
> setjmp call.  If the target has a sane setjmp, then the compiler should not
> have to treat the setjmp call any differently than it does any other normal
> function call.  Namely, just mark all pseudos that are live across the setjmp
> as conflicting with the volatile registers.
> 
> This issue was discussed in the following gcc mailing list thread:
> 
>   https://gcc.gnu.org/ml/gcc/2018-03/msg00014.html
> 
> ...and some people mentioned that some systems do not have sane setjmp
> implementations and therefore need to spill all pseudos live across setjmps
> to get correct functionality.  It was decided in the thread above that we
> should create a target hook that can allow targets to tell IRA and LRA
> whether or not they have a sane setjmp implementation.  The following patch
> implements that idea along with converting the rs6000 port to use the hook.

> +bool
> +default_is_reg_clobbering_setjmp_p (const rtx_insn *insn)
> +{
> +  return CALL_P (insn)
> +  && find_reg_note (insn, REG_SETJMP, NULL_RTX) != NULL_RTX;
> +}

Since all implementations of this hook will have to do the same, I think
it is better if you leave this test at the (only two) callers.  The hook
doesn't need an argument then, and maybe is better named something like
setjmp_is_normal_call?  (The original code did not test CALL_P btw).
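
A rough sketch of that shape, with the setjmp test kept in the caller and an argument-less hook, might look as follows. Everything here is a toy model: struct toy_insn, must_spill_across and the two hook implementations are hypothetical names, not GCC's rtx_insn, IRA/LRA code, or an existing target hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for an insn; not GCC's rtx_insn.  */
struct toy_insn
{
  bool is_call;
  bool has_setjmp_note;
};

/* Hypothetical argument-less hook: true if the target's setjmp can be
   treated like a normal call for register allocation purposes.  */
typedef bool (*setjmp_hook_fn) (void);

/* Conservative default: assume setjmp is not a normal call.  */
bool
default_setjmp_is_normal_call (void)
{
  return false;
}

/* A target with a well-behaved setjmp would answer true.  */
bool
sane_target_setjmp_is_normal_call (void)
{
  return true;
}

/* Caller-side logic: the "is this a setjmp call" test stays here and the
   hook only answers the target-specific question.  */
bool
must_spill_across (const struct toy_insn *insn, setjmp_hook_fn hook)
{
  bool is_setjmp = insn->is_call && insn->has_setjmp_note;
  return is_setjmp && !hook ();
}
```

The point of the split is that each target only supplies a constant answer, while the insn inspection lives once in each of the two callers.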

(Whatever you end up with, the rs6000 part is of course pre-approved).


Segher


Re: [PATCH][IRA,LRA] Fix PR87466, all pseudos live across setjmp are spilled

2018-10-01 Thread Eric Botcazou
> Since all implementations of this hook will have to do the same, I think
> it is better if you leave this test at the (only two) callers.  The hook
> doesn't need an argument then, and maybe is better named something like
> setjmp_is_normal_call?  (The original code did not test CALL_P btw).

Seconded, but I'd be even more explicit in the naming of the hook, for example 
setjmp_preserves_nonvolatile_registers or somesuch.  (And I don't think that 
setjmp can be considered a normal call in any case since it returns twice).

-- 
Eric Botcazou


[PATCH] Fix typo, fixing PR87465

2018-10-01 Thread Richard Biener


The following typo-fix happens to fix a missed peeling caused by the
--param max-peel-branches limit.  The typo is present everywhere; the
missed peeling is a regression from GCC 7.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

I'm not really considering backporting this anywhere.  Note the
testcase isn't fully optimized on the tree level because
DOM doesn't figure out the trivial CSE after SLP vectorizes the
array init (we have PRs for that issue).

Richard.

2018-10-01  Richard Biener  

PR tree-optimization/87465
* tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Fix typo
causing branch miscounts.

* gcc.dg/tree-ssa/cunroll-15.c: New testcase.

Index: gcc/tree-ssa-loop-ivcanon.c
===
--- gcc/tree-ssa-loop-ivcanon.c (revision 264734)
+++ gcc/tree-ssa-loop-ivcanon.c (working copy)
@@ -368,8 +368,8 @@ tree_estimate_loop_size (struct loop *lo
size->non_call_stmts_on_hot_path++;
  if (((gimple_code (stmt) == GIMPLE_COND
&& (!constant_after_peeling (gimple_cond_lhs (stmt), stmt, loop)
-   || constant_after_peeling (gimple_cond_rhs (stmt), stmt,
-  loop)))
+   || !constant_after_peeling (gimple_cond_rhs (stmt), stmt,
+   loop)))
   || (gimple_code (stmt) == GIMPLE_SWITCH
   && !constant_after_peeling (gimple_switch_index (
 as_a  (stmt)),
Index: gcc/testsuite/gcc.dg/tree-ssa/cunroll-15.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/cunroll-15.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/cunroll-15.c  (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-cunroll-optimized" } */
+
+int Test(void)
+{
+  int c = 0;
+  const int in[4] = {4,3,4,4};
+  for (unsigned i = 0; i < 4; i++) {
+  for (unsigned j = 0; j < i; j++) {
+ if (in[i] == in[j])
+   break;
+ else 
+   ++c;
+  }
+  }
+  return c;
+}
+
+/* { dg-final { scan-tree-dump-times "optimized:\[^\n\r\]*completely unrolled" 2 "cunroll" } } */
+/* Only RTL figures out some CSE at the moment.  */
+/* { dg-final { scan-tree-dump "return 1;" "optimized" { xfail *-*-* } } } */
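
The effect of the one-character fix in the GIMPLE_COND test can be seen from a small truth-table sketch. The two functions below are hypothetical models of the condition before and after the patch, with each argument standing for the result of constant_after_peeling on one operand; they assume, as the ChangeLog suggests, that a true result means the branch is counted.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition before the fix: !lhs_const || rhs_const.  */
bool
counted_before_fix (bool lhs_const, bool rhs_const)
{
  return !lhs_const || rhs_const;
}

/* Condition after the fix: !lhs_const || !rhs_const, i.e. the branch is
   counted unless both operands are constant after peeling.  */
bool
counted_after_fix (bool lhs_const, bool rhs_const)
{
  return !lhs_const || !rhs_const;
}
```

The two versions differ exactly when the left-hand side is constant: the old condition counted a branch whose operands were both constant and skipped one whose right-hand side was not.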


Re: [PATCH 2/2] [ARC] Avoid specific constants to end in limm field.

2018-10-01 Thread Claudiu Zissulescu
Pushed. Thank you for your review,
Claudiu

On Fri, Sep 21, 2018 at 12:12 AM Andrew Burgess wrote:
>
> * Claudiu Zissulescu  [2018-09-17 15:50:27 +0300]:
>
> > The 3-operand instructions accept an immediate as the second
> > operand. However, this immediate will end up in the long
> > immediate (limm) field. This patch keeps constants out of the limm
> > field for particular instructions when compiling for size.
> >
> > gcc/
> > -xx-xx  Claudiu Zissulescu  
> >
> >   * config/arc/arc.md (*add_n): Clean up pattern, update instruction
> >   constraints.
> >   (ashlsi3_insn): Update instruction constraints.
> >   (ashrsi3_insn): Likewise.
> >   (rotrsi3): Likewise.
> >   (add_shift): Likewise.
> >   * config/arc/constraints.md (Csz): New 32-bit constraint.  It
> >   avoids placing small constants in the limm field when they could
> >   otherwise fit into a small instruction.
> > ---
> >  gcc/config/arc/arc.md   | 51 +---
> >  gcc/config/arc/constraints.md   |  6 +++
> >  gcc/testsuite/gcc.target/arc/tph_addx.c | 53 +
> >  3 files changed, 78 insertions(+), 32 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/arc/tph_addx.c
>
> Looks good.
>
> Thanks,
> Andrew
>
>
> >
> > diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> > index 2d108ef166d..c28a87cd3b0 100644
> > --- a/gcc/config/arc/arc.md
> > +++ b/gcc/config/arc/arc.md
> > @@ -3056,30 +3056,17 @@ core_3, archs4x, archs4xd, archs4xd_slow"
> > (set (match_dup 3) (match_dup 4))])
> >
> >  (define_insn "*add_n"
> > -  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,Rcw,W,W,w,w")
> > - (plus:SI (ashift:SI (match_operand:SI 1 "register_operand" 
> > "Rcqq,c,c,c,c,c")
> > - (match_operand:SI 2 "_1_2_3_operand" ""))
> > -  (match_operand:SI 3 "nonmemory_operand" 
> > "0,0,c,?Cal,?c,??Cal")))]
> > +  [(set (match_operand:SI 0 "dest_reg_operand" "=q,r,r")
> > + (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "q,r,r")
> > +   (match_operand:SI 2 "_2_4_8_operand" ""))
> > +  (match_operand:SI 3 "nonmemory_operand" "0,r,Csz")))]
> >""
> > -  "add%c2%? %0,%3,%1%&"
> > +  "add%z2%?\\t%0,%3,%1%&"
> >[(set_attr "type" "shift")
> > -   (set_attr "length" "*,4,4,8,4,8")
> > -   (set_attr "predicable" "yes,yes,no,no,no,no")
> > -   (set_attr "cond" "canuse,canuse,nocond,nocond,nocond,nocond")
> > -   (set_attr "iscompact" "maybe,false,false,false,false,false")])
> > -
> > -(define_insn "*add_n"
> > -  [(set (match_operand:SI 0 "dest_reg_operand"  
> > "=Rcqq,Rcw,W,  W,w,w")
> > - (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "Rcqq,  c,c, 
> >  c,c,c")
> > -   (match_operand:SI 2 "_2_4_8_operand"   ""))
> > -  (match_operand:SI 3 "nonmemory_operand""0,  
> > 0,c,Cal,c,Cal")))]
> > -  ""
> > -  "add%z2%? %0,%3,%1%&"
> > -  [(set_attr "type" "shift")
> > -   (set_attr "length" "*,4,4,8,4,8")
> > -   (set_attr "predicable" "yes,yes,no,no,no,no")
> > -   (set_attr "cond" "canuse,canuse,nocond,nocond,nocond,nocond")
> > -   (set_attr "iscompact" "maybe,false,false,false,false,false")])
> > +   (set_attr "length" "*,4,8")
> > +   (set_attr "predicable" "yes,no,no")
> > +   (set_attr "cond" "canuse,nocond,nocond")
> > +   (set_attr "iscompact" "maybe,false,false")])
> >
> >  ;; N.B. sub[123] has the operands of the MINUS in the opposite order from
> >  ;; what synth_mult likes.
> > @@ -3496,7 +3483,7 @@ core_3, archs4x, archs4xd, archs4xd_slow"
> >  ; provide one alternatice for this, without condexec support.
> >  (define_insn "*ashlsi3_insn"
> >[(set (match_operand:SI 0 "dest_reg_operand"   
> > "=Rcq,Rcqq,Rcqq,Rcw, w,   w")
> > - (ashift:SI (match_operand:SI 1 "nonmemory_operand" "!0,Rcqq,   0,  0, 
> > c,cCal")
> > + (ashift:SI (match_operand:SI 1 "nonmemory_operand" "!0,Rcqq,   0,  0, 
> > c,cCsz")
> >  (match_operand:SI 2 "nonmemory_operand"  "K,  K,RcqqM, 
> > cL,cL,cCal")))]
> >"TARGET_BARREL_SHIFTER
> > && (register_operand (operands[1], SImode)
> > @@ -3509,7 +3496,7 @@ core_3, archs4x, archs4xd, archs4xd_slow"
> >
> >  (define_insn "*ashrsi3_insn"
> >[(set (match_operand:SI 0 "dest_reg_operand" 
> > "=Rcq,Rcqq,Rcqq,Rcw, w,   w")
> > - (ashiftrt:SI (match_operand:SI 1 "nonmemory_operand" "!0,Rcqq,   0,  
> > 0, c,cCal")
> > + (ashiftrt:SI (match_operand:SI 1 "nonmemory_operand" "!0,Rcqq,   0,  
> > 0, c,cCsz")
> >(match_operand:SI 2 "nonmemory_operand"  "K,  K,RcqqM, 
> > cL,cL,cCal")))]
> >"TARGET_BARREL_SHIFTER
> > && (register_operand (operands[1], SImode)
> > @@ -3536,7 +3523,7 @@ core_3, archs4x, archs4xd, archs4xd_slow"
> >
> >  (define_insn "rotrsi3"
> >[(set (match_operand:SI 0 "dest_reg_operand" "=Rcw, w,   w")
> > - (rotatert:SI (match_oper

Re: [PATCH 1/2] [ARC] Check for odd-even register when emitting double mac ops.

2018-10-01 Thread Claudiu Zissulescu
Pushed with suggested changes.

Thank you for your review,
Claudiu

On Thu, Sep 20, 2018 at 11:59 PM Andrew Burgess wrote:
>
> * Claudiu Zissulescu  [2018-09-17 15:50:26 +0300]:
>
> > Avoid generating dmac instructions when the register is not odd-even;
> > use the equivalent mac instruction instead.
> >
> > gcc/
> >   Claudiu Zissulescu  
> >
> >   * config/arc/arc.md (maddsidi4_split): Don't use dmac if the
> >   destination register is not odd-even.
> >   (umaddsidi4_split): Likewise.
> >
> > gcc/testsuite/
> >   Claudiu Zissulescu  
> >
> >   * gcc.target/arc/tmac-3.c: New file.
>
> Looks good thanks, with one minor nit below...
>
> > ---
> >  gcc/config/arc/arc.md |  4 ++--
> >  gcc/testsuite/gcc.target/arc/tmac-3.c | 17 +
> >  2 files changed, 19 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/arc/tmac-3.c
> >
> > diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> > index dbcd7098bec..2d108ef166d 100644
> > --- a/gcc/config/arc/arc.md
> > +++ b/gcc/config/arc/arc.md
> > @@ -6078,7 +6078,7 @@ core_3, archs4x, archs4xd, archs4xd_slow"
> >"{
> > rtx acc_reg = gen_rtx_REG (DImode, ACC_REG_FIRST);
> > emit_move_insn (acc_reg, operands[3]);
> > -   if (TARGET_PLUS_MACD)
> > +   if (TARGET_PLUS_MACD && even_register_operand (operands[0], DImode))
> >   emit_insn (gen_macd (operands[0], operands[1], operands[2]));
> > else
> >   {
> > @@ -6178,7 +6178,7 @@ core_3, archs4x, archs4xd, archs4xd_slow"
> >"{
> > rtx acc_reg = gen_rtx_REG (DImode, ACC_REG_FIRST);
> > emit_move_insn (acc_reg, operands[3]);
> > -   if (TARGET_PLUS_MACD)
> > +   if (TARGET_PLUS_MACD && even_register_operand (operands[0], DImode))
> >   emit_insn (gen_macdu (operands[0], operands[1], operands[2]));
> > else
> >   {
> > diff --git a/gcc/testsuite/gcc.target/arc/tmac-3.c 
> > b/gcc/testsuite/gcc.target/arc/tmac-3.c
> > new file mode 100644
> > index 000..3c8c1201f83
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/arc/tmac-3.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-skip-if "" { ! { clmcpu } } } */
> > +/* { dg-options "-mcpu=hs38 -Os" } */
> > +
> > +/* The compiler will assign r1r2 as a DI register, but it doesn't fit
> > +   the macd operation, hence we need to fall back on the mac
> > +   instruction.  */
> > +typedef long long myint64_t;
> > +
> > +extern int d (int, myint64_t);
> > +int b (int c)
> > +{
> > +  int x = (int) d;
> > +  d(c, (myint64_t)x * 2 + 1);
>
> Could you apply GNU coding standard whitespace on this line please.
>
> Thanks,
> Andrew
>
> > +}
> > +
> > +/* { dg-final { scan-assembler "mac\\\s+r1" } } */
> > --
> > 2.17.1
> >


[c-family] Small fix for -fdump-ada-spec

2018-10-01 Thread Eric Botcazou
The translation of a pointer to constant value was lacking the package prefix.

Tested on x86_64-suse-linux, applied on the mainline.


2018-10-01  Eric Botcazou  

* c-ada-spec.c (get_underlying_decl): Get to the main type variant.
(dump_ada_node): Add const keyword.

-- 
Eric Botcazou

Index: c-ada-spec.c
===
--- c-ada-spec.c	(revision 264732)
+++ c-ada-spec.c	(working copy)
@@ -1020,13 +1020,18 @@ get_underlying_decl (tree type)
   if (DECL_P (type))
 return type;
 
-  /* type is a typedef.  */
-  if (TYPE_P (type) && TYPE_NAME (type) && DECL_P (TYPE_NAME (type)))
-return TYPE_NAME (type);
-
-  /* TYPE_STUB_DECL has been set for type.  */
-  if (TYPE_P (type) && TYPE_STUB_DECL (type))
-return TYPE_STUB_DECL (type);
+  if (TYPE_P (type))
+{
+  type = TYPE_MAIN_VARIANT (type);
+
+  /* type is a typedef.  */
+  if (TYPE_NAME (type) && DECL_P (TYPE_NAME (type)))
+	return TYPE_NAME (type);
+
+  /* TYPE_STUB_DECL has been set for type.  */
+  if (TYPE_STUB_DECL (type))
+	return TYPE_STUB_DECL (type);
+}
 
   return NULL_TREE;
 }
@@ -2143,8 +2148,8 @@ dump_ada_node (pretty_printer *buffer, t
 	}
   else
 	{
+	  const unsigned int quals = TYPE_QUALS (TREE_TYPE (node));
 	  bool is_access = false;
-	  unsigned int quals = TYPE_QUALS (TREE_TYPE (node));
 
 	  if (VOID_TYPE_P (TREE_TYPE (node)))
 	{


Re: RFC: variant and ODR based type merging during LTO streaming

2018-10-01 Thread Richard Biener
On Fri, 28 Sep 2018, Jan Hubicka wrote:

> Hi,
> this is a proof-of-concept patch for type merging during LTO streaming. It
> does two things
> 1) replace type variant by first compatible one in TYPE_NEXT_VARIANT list
>This is useful at compilation time because frontends produce more variants
>than needed. The effect of this is about 2% of decl stream
> 2) replace ODR types by their prevailing variant if ODR violation was not
>detected.  This is useful at WPA time and saves about 50% of global decl
>stream. For Firefox it reduces the ltrans.o files size from 2.6GB to 2.0GB.
> 
> Here are stats of number of nodes hitting global stream during WPA->ltrans
> streaming of firefox:
> 
> Before  after
>73547280 namespace_decl
>   46593   14084 union_type
>   18496   15405 translation_unit_decl
>   22816   21548 integer_type
>  107047   31681 enumeral_type
>   67072   46390 array_type
>  541220   99572 pointer_plus_expr
>  542360  100712 addr_expr
>  167657  154019 var_decl
>  864769  182691 tree_binfo
>  240200  206783 reference_type
>  410403  316877 function_type
> 1862985  522524 type_decl
> 1954864  652664 record_type
> 1070333  880209 method_type
> 1582055 1014270 pointer_type
> 1495367 1406670 tree_list
> 7926385 1483545 field_decl
> 1715133 1725612 function_decl
> 3384810 3191406 identifier_node
> 
> So largest savings are in field_decls (to 18% and they are not even merged
> fully by the patch), pointer types (to 64%), record types (to 30%) and type
> decls (to 28%), binfos addr_expr/pointer_plus_expr (the latter are used to
> refer to vtables to 21%). Patch bootstraps, regtests and seems to
> work reliably.
> 
> The merging is currently done during streaming by simply looking up prevailing
> type and populating the cache (so we do not get too many repeated lookups).
> Similar tricks can be added to checksum calculation.
> 
> I am not sure this is best possible place, especially for 2) since it provides
> really quite major reduction in the IL size.  I am quite sure we do not want
> to do that at stream in time in parallel to SCC merging because it affects the
> SCC components and also we do not want to merge this way types containing ODR
> violations as QOI issue. (merging two completely different types would lead
> to ICEs but merging two TBAA incompatible types is probably equally bad).
> 
> So we need to do the following:
>  a) read in the types & do tree merging
>  b) populate odr type hash, do ODR violation checks and decide on what types
> are consistent
>  c) do the ODR based merging sometime before the trees hit ltrans stream.
> 
> I was thinking that perhaps c) can also be added to mentions_vars_p machinery
> which would make it somewhat more costly but we could free those duplicates and
> save some more memory during WPA.  I am also not sure if longer term we
> would not want to have a mode that bypasses WPA->ltrans stream completely and
> just compiles out of the WPA process in threads or forks.
> Drawback is that building vector recording every single type pointer is going
> to also consume quite some memory.
> 
> Patch is also not complete because it does not merge field decls. Analogous
> tricks can be added to streamer, but I just thought I would like to discuss
> alternative implementations first.
> 
> WPA compile time improves by about 11%, so merging benefits overcome the extra
> complexity in streamer.
> 
> I hope that with this patch I will be able to increase default number of partitions
> since the global stream is getting less percentage of the ltrans files. 33%
> is still quite a lot, but far better than what we had previously. Hopefully
> we could still dramatically cut this down in followup.
> 
> Also obvious drawback is that this helps to C++ only. I looked into stats from
> C programs and also Toon's stats on Fortran and I tend to believe that C++
> is really much worse than those two.
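
As a rough illustration of point 1) in the quoted proposal (replacing a type variant by the first compatible one on its variant chain, with a cache to avoid repeated lookups), the idea might be sketched in C++ as below. Everything here is hypothetical: `type_node`, its fields and `first_compatible_variant` are made-up stand-ins rather than GCC's `tree` API, and comparing a single `quals` field is only a placeholder for the real compatibility rules, which the thread does not spell out.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical, much simplified type node.  QUALS stands in for whatever
// properties make two variants interchangeable; NEXT_VARIANT chains all
// variants, starting from the main variant.
struct type_node
{
  unsigned quals;
  type_node *main_variant;
  type_node *next_variant;
};

// Return the first variant on T's chain that is compatible with T, caching
// the answer so repeated lookups for the same node are cheap.
type_node *
first_compatible_variant (type_node *t,
                          std::unordered_map<type_node *, type_node *> &cache)
{
  assert (t != nullptr && t->main_variant != nullptr);

  auto it = cache.find (t);
  if (it != cache.end ())
    return it->second;

  // T is on its own chain, so in the worst case the walk finds T itself.
  type_node *result = t;
  for (type_node *v = t->main_variant; v != nullptr; v = v->next_variant)
    if (v->quals == t->quals)
      {
        result = v;
        break;
      }

  cache.emplace (t, result);
  return result;
}
```

With a chain main -> a -> b where a and b carry the same qualifiers, a lookup for b yields a, so only one of the two would need to be streamed.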

The ODR savings really look good but as you say the implementation is
somewhat "tricky".

I'd like to see the type-variant done separately (of course) and
also differently.  This is because when enabling 
free-lang-data by default we should be able to get benefits for
non-LTO compilations as well.  Specifically I'd like us to
(in free-lang-data) "canonicalize" existing variants applying the
rules you derived for the lazier compare.  At the point you then
drop non-fld-walked variants we could weed out the resulting
duplicates, keeping only a prevailing one in the list and (ab-)using
GC to fixup references.  Like with ggc_register_fixup (void *from, void 
*to) which would when following edges at mark time, adjust them
according to this map (I'm not sure if we'd want to do it manually
and record a vector of possibly affected TREE_TYPE uses during the fld
walk).  We could also (easily?) optimize the streaming itself by
streaming only differences to the main variant of a type (but then
we could change our tree data structure to make that trivial).

I wonder how much "ODR" merging 

Re: RFC: variant and ODR based type merging during LTO streaming

2018-10-01 Thread Jan Hubicka
> 
> The ODR savings really look good but as you say the implementation is
> somewhat "tricky".
> 
> I'd like to see the type-variant done separately (of course) and
> also differently.  This is because when enabling 
> free-lang-data by default we should be able to get benefits for
> non-LTO compilations as well.  Specifically I'd like us to
> (in free-lang-data) "canonicalize" existing variants applying the
> rules you derived for the lazier compare.  At the point you then
> drop non-fld-walked variants we could weed out the resulting
> duplicates, keeping only a prevailing one in the list and (ab-)using
> GC to fixup references.  Like with ggc_register_fixup (void *from, void 
> *to) which would when following edges at mark time, adjust them
> according to this map (I'm not sure if we'd want to do it manually
> and record a vector of possibly affected TREE_TYPE uses during the fld
> walk).  We could also (easily?) optimize the streaming itself by
> streaming only differences to the main variant of a type (but then
> we could change our tree data structure to make that trivial).

I had a patch for streaming only the differences some time ago. My recollection
is that we got it into the tree and then reverted it, as there were some issues
with cycles.  I can look into this again.

I would really like to avoid using ggc for IL rewriting. One reason is
that eventually we would like to see GGC go, and thus we should not
wire it more into the essential parts of the compiler; the other is that
GGC is often not run.

Saving type variants seems to have a relatively minor effect on the IL size, so
we need to have a solution that is not too expensive to be justified.  I suppose
free lang data is sort of visiting all the relevant data structures for the
middle-end, so we could do the rewriting there.  It would also make sense to
canonicalize types more based on knowledge of whether they are used for memory
access (i.e. for non-accesses go to main variant and perhaps invent something
like main variant WRT useless conversions).

We could ggc_free the old variant and then ggc will ICE on dangling pointers.
I can give it a try with my patch to prune the variant tree.
> 
> I wonder how much "ODR" merging we'd get for free when we canonicalize
> types some more - that is, how do ODR equal but tree unequal types
> usually differ?  I guess it might be most of the time fields with
> pointer to incomplete vs. complete type?

Most of the time it is complete vs incomplete pointer type.
I had statistics of type duplicates before, but they did not look as bad
as reality. The reason is that smaller types tend to be merged while
bigger types tend to be duplicated many times.  In GCC these are typically
RTL, gimple and derived types, of which there are really many.

Honza
> 
> Thanks,
> Richard.
> 
> 
> > Honza
> > 
> > Index: ipa-devirt.c
> > ===
> > --- ipa-devirt.c(revision 264689)
> > +++ ipa-devirt.c(working copy)
> > @@ -2111,6 +2111,29 @@ get_odr_type (tree type, bool insert)
> >return val;
> >  }
> >  
> > +/* Return the main variant of the odr type.  This is used for straming out
> > +   to reduce number of type duplicates hitting the WPA->LTRANS streams.
> > +   Do not do so when ODR violation was detected since the type may be
> > +   structurally different then.  */
> > +
> > +tree
> > +prevailing_odr_type (tree t)
> > +{
> > +  t = TYPE_MAIN_VARIANT (t);
> > +  /* In need_assembler_name_p we also mangle assembler names of 
> > INTEGER_TYPE.
> > + We can not merge these because this does not honnor precision and
> > + signedness.  */
> > +  if (!type_with_linkage_p (t)
> > +  || type_in_anonymous_namespace_p (t)
> > +  || TREE_CODE (t) == INTEGER_TYPE
> > +  || !COMPLETE_TYPE_P (t))
> > +return t;
> > +  odr_type ot = get_odr_type (t, true);
> > +  if (!ot || !ot->odr_violated)
> > +return ot->type;
> > +  return t;
> > +}
> > +
> >  /* Add TYPE od ODR type hash.  */
> >  
> >  void
> > Index: ipa-utils.h
> > ===
> > --- ipa-utils.h (revision 264689)
> > +++ ipa-utils.h (working copy)
> > @@ -90,6 +90,7 @@ void warn_types_mismatch (tree t1, tree
> >   location_t loc2 = UNKNOWN_LOCATION);
> >  bool odr_or_derived_type_p (const_tree t);
> >  bool odr_types_equivalent_p (tree type1, tree type2);
> > +tree prevailing_odr_type (tree t);
> >  
> >  /* Return vector containing possible targets of polymorphic call E.
> > If COMPLETEP is non-NULL, store true if the list is complete. 
> > Index: lto/lto.c
> > ===
> > --- lto/lto.c   (revision 264689)
> > +++ lto/lto.c   (working copy)
> > @@ -485,6 +485,8 @@ gimple_register_canonical_type_1 (tree t
> >  static void
> >  gimple_register_canonical_type (tree t)
> >  {
> > +  if (flag_checking)
> > +verify_type (t);
> >if (TYPE_CANONICAL (t) ||

Re: [PATCH][IRA,LRA] Fix PR87466, all pseudos live across setjmp are spilled

2018-10-01 Thread Segher Boessenkool
On Mon, Oct 01, 2018 at 11:25:21AM +0200, Eric Botcazou wrote:
> > Since all implementations of this hook will have to do the same, I think
> > it is better if you leave this test at the (only two) callers.  The hook
> > doesn't need an argument then, and maybe is better named something like
> > setjmp_is_normal_call?  (The original code did not test CALL_P btw).
> 
> Seconded, but I'd be even more explicit in the naming of the hook, for 
> example 
> setjmp_preserves_nonvolatile_registers or somesuch.  (And I don't think that 
> setjmp can be considered a normal call in any case since it returns twice).

Right...  I meant setjmp has the normal calling convention, the normal
call ABI.  It doesn't really matter to have a longer name here, it is
only used twice (and that code can be factored out to some helper
function, even).


Segher


Re: RFC: variant and ODR based type merging during LTO streaming

2018-10-01 Thread Richard Biener
On Mon, 1 Oct 2018, Jan Hubicka wrote:

> > 
> > The ODR savings really look good but as you say the implementation is
> > somewhat "tricky".
> > 
> > I'd like to see the type-variant done separately (of course) and
> > also differently.  This is because when enabling 
> > free-lang-data by default we should be able to get benefits for
> > non-LTO compilations as well.  Specifically I'd like us to
> > (in free-lang-data) "canonicalize" existing variants applying the
> > rules you derived for the lazier compare.  At the point you then
> > drop non-fld-walked variants we could weed out the resulting
> > duplicates, keeping only a prevailing one in the list and (ab-)using
> > GC to fixup references.  Like with ggc_register_fixup (void *from, void 
> > *to) which would when following edges at mark time, adjust them
> > according to this map (I'm not sure if we'd want to do it manually
> > and record a vector of possibly affected TREE_TYPE uses during the fld
> > walk).  We could also (easily?) optimize the streaming itself by
> > streaming only differences to the main variant of a type (but then
> > we could change our tree data structure to make that trivial).
> 
> I had patch for streaming the differences only some time ago. My recollection
> is that we got it into tree and then reverted as there was some issues with
> cycles.  I can look into this again.
> 
> I would really like to avoid using ggc for IL rewriting. One reason is
> that eventually we would like to see GGC to go and thus we should not
> wire it more into the essential parts of the compiler and other is that
> GGC is often not run. 

Heh.

> Saving type variants seems to have relatively minor effect on the IL size, so
> we need to have solution that is not too expensive to be justified.  I suppose
> free lang data is sort of visiting all the relevant datastructures for
> middle-end so we could do rewriting there.  It would also make sense to
> canonicalize types more based on knowledge whether they are used for memory
> access (i.e. for non-accesses go to main variant and perhaps invent something
> like main variant WRT useless conversions).

I guess for that it would be better to put things like memory access
alignment into the memory-references rather than using TYPE_ALIGN.
Likewise for other things affecting semantics (TYPE_REF_CAN_ALIAS_ALL).

Being able to simply drop to TYPE_MAIN_VARIANT would be very appealing...
(and simplify things).  The hard thing is to figure out where we look
into those variant differences during late compilation...

Type variants was always my first "easy" middle-end type related cleanup.
And for non-LTO ODR merging probably gets us nothing (the FE should do
the merging).

> We could ggc_free the old variant and then ggc will ICE on dangling pointers.
> I can give it a try with my patch to prune the variant tree.

Yes, definitely do that.  I'd also _really_ like us to do FLD
unconditionally - maybe we can start by guarding individual pieces with
a for_lto flag we pass down.  But esp. disabling langhooks and stuff like
the variant type purging would be good to get enabled for non-LTO
so that the modes do not diverge more and more...

> > 
> > I wonder how much "ODR" merging we'd get for free when we canonicalize
> > types some more - that is, how do ODR equal but tree unequal types
> > usually differ?  I guess it might be most of the time fields with
> > pointer to incomplete vs. complete type?
> 
> Most of the time it is complete vs incomplete pointer type.
> I had statistics of type duplicates before, but they did not look as bad
> as reality. The reason is that smaller types tends to be merged while
> bigger types tends to be duplicated many times.  In GCC this is typically
> RTL, gimple and derived types which are really many.

I see.  So one possible canonicalization is to make _all_
pointer-typed FIELD_DECLs point to incomplete variants since the memory
accesses should already have the "proper" access types.  Can you
get statistics on that?  Not sure how to get an "incomplete" type
though (iff we can simply copy the type and NULL TYPE_FIELDs and
TYPE_SIZE and friends) - again I'd do that at FLD time.

Richard.

> Honza
> > 
> > Thanks,
> > Richard.
> > 
> > 
> > > Honza
> > > 
> > > Index: ipa-devirt.c
> > > ===
> > > --- ipa-devirt.c  (revision 264689)
> > > +++ ipa-devirt.c  (working copy)
> > > @@ -2111,6 +2111,29 @@ get_odr_type (tree type, bool insert)
> > >return val;
> > >  }
> > >  
> > > +/* Return the main variant of the odr type.  This is used for straming 
> > > out
> > > +   to reduce number of type duplicates hitting the WPA->LTRANS streams.
> > > +   Do not do so when ODR violation was detected since the type may be
> > > +   structurally different then.  */
> > > +
> > > +tree
> > > +prevailing_odr_type (tree t)
> > > +{
> > > +  t = TYPE_MAIN_VARIANT (t);
> > > +  /* In need_assembler_name_p we also mangle assembler names of 
>

Re: [PATCH] GCOV: introduce --json-format.

2018-10-01 Thread Martin Liška
On 9/27/18 3:55 PM, David Malcolm wrote:
> On Thu, 2018-09-27 at 09:46 +0200, Martin Liška wrote:
>> Hi.
>>
>> For some time we've been providing an intermediate format as
>> output of GCOV tool. It's documented in our code that primary
>> consumer of it is lcov. Apparently that's not true:
>> https://github.com/linux-test-project/lcov/issues/38#issuecomment-371203750
>>
>> So I decided to come up with a very similar intermediate
>> format in JSON. It's much easier for consumers to work with.
>>
>> I'm planning to leave the legacy format for GCC 9.1 and I'll document
>> that it's deprecated. We can then remove it in the next release.
>>
>> The patch contains a small change to json.{h,cc}, hope David can
>> approve that?
>> Patch is tested on x86_64-linux-gnu.
> 
> I'm not officially a reviewer for the json stuff, but I can comment, at
> least.  The changes to json.h/cc are fine by me, FWIW.

Hello.

Appreciate your feedback!

> 
> Some high-level observations:
> * how big are the JSON files?  One of the comments at my Cauldron talk
> on optimization records was a concern that the output could be very
> large.  The JSON files compress really well, so maybe this patch should
> gzip on output?  Though I don't know if it's appropriate for this case.
>   iirc, gfortran's module code has an example of gzipping a
> pretty_printer buffer.

That sounds reasonable. For tramp3d I get 5.8M of JSON, where the original
intermediate format is 1.5M. The gzipped JSON is 216K.
I'll update the patch to address that.

> 
> * json::object::print doesn't preserve the insertion order of its
> name/value pairs; they're written out in whatever order the hashing
> leads to (maybe we should fix that though).  The top-level JSON value
> in your file format is currently a JSON object containing "version"
> metadata etc.  There's thus a chance that could be at the end, after
> the data.  Perhaps the top-level JSON value should be an array instead
> (as a "tuple"), to guarantee that the versioning metadata occurs at the
> start?  Or are consumers assumed to read the whole file into memory and
> traverse it, tree-like?

Well, it's not nice to make a list out of something that is an object. It's a
micro-optimization in my opinion.

On the other hand I would preserve the order of keys as they were added into
an object. Moreover, we can have an option for that:

https://docs.python.org/3/library/json.html#basic-usage

Search for sort_keys. Similarly I would appreciate an 'indent' option, it's handy
for debugging.
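
To make the ordering point concrete, here is a minimal sketch of an object that prints its members in insertion order by pairing a vector (for order) with a hash map (for lookup). It is only an illustration under stated assumptions: `ordered_object` is a hypothetical class, not GCC's `json::object`, values are pre-encoded strings, and keys are not escaped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a JSON object whose values are already-encoded
// JSON strings.  It only illustrates how insertion order can be kept.
class ordered_object
{
public:
  // Insert or overwrite KEY.  A new key is appended, so its position in the
  // output is fixed by the first call to set for that key.
  void set (const std::string &key, const std::string &value)
  {
    auto it = m_index.find (key);
    if (it == m_index.end ())
      {
        m_index.emplace (key, m_items.size ());
        m_items.emplace_back (key, value);
      }
    else
      m_items[it->second].second = value;
  }

  // Serialize in insertion order.  Values are emitted verbatim and keys are
  // not escaped, which a real implementation would have to handle.
  std::string print () const
  {
    assert (m_index.size () == m_items.size ());
    std::string out = "{";
    for (std::size_t i = 0; i < m_items.size (); i++)
      {
        if (i != 0)
          out += ", ";
        out += "\"" + m_items[i].first + "\": " + m_items[i].second;
      }
    return out + "}";
  }

private:
  std::vector<std::pair<std::string, std::string>> m_items; // insertion order
  std::unordered_map<std::string, std::size_t> m_index;     // key -> slot
};
```

With this layout a "version" member set first is always printed first, even if its value is overwritten later.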

> 
> * Similar to the optimization records code, this patch is building a
> tree of dynamically-allocated JSON objects in memory, and then printing
> it to a pretty_printer, and flushing the pp's buffer to a FILE *.  This
> is simple and flexible, but I suspect that I may need to rewrite this
> for the optimization records case to avoid bloating the memory usage
> (e.g. to eliminate the in-memory JSON tree in favor of printing as we
> go). Would that concern apply here, or is the pattern OK?

One should not bloat memory in GCOV I guess.

> 
> * FWIW I'm working on DejaGnu test coverage for JSON output, but it's
> ugly - a nasty shim between .exp and (optional) .py scripts for parsing
> the JSON and verifying properties of it, sending the results back in a
> form that DejaGnu can digest and integrate into the .sum/.log files.  I
> hope to post it before stage 1 closes (I'd much prefer to have good
> test coverage in place before optimizing how this stuff gets written)

Would be good, then I would definitely add a test for GCOV JSON format.

> 
> [...snip...]
>> diff --git a/gcc/gcov.c b/gcc/gcov.c
>> index e255e4e3922..39d9329d6d0 100644
>> --- a/gcc/gcov.c
>> +++ b/gcc/gcov.c
> 
> [...snip...]
> 
>> @@ -1346,6 +1481,15 @@ generate_results (const char *file_name)
>>  }
>>  }
>>  
>> +  json::object *root = new json::object ();
> 
> [...snip...]
>>  
>> -  if (flag_gcov_file && flag_intermediate_format && !flag_use_stdout)
>> +  if (flag_gcov_file && flag_json_format)
>> +root->dump (out);
>> +
> 
> It looks like "root" gets leaked here (and thus all of the objects in
> the JSON tree also).   I don't know if that's a problem, but an easy
> fix would be to make "root" be an on-stack object, rather than
> allocated on the heap - this ought to lead everything to be cleaned up
> when root's dtor runs.

I've just tried valgrind and I can't see any leak in current implementation?
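
For reference, the on-stack suggestion quoted above relies on the root's destructor tearing down the whole tree. A minimal sketch of that pattern follows; `node` and `build_and_count` are hypothetical names, and whether GCC's json classes own their children in this way is an assumption here, not something the thread confirms.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical tree node, standing in for a JSON value that owns its children.
struct node
{
  static int live;   // number of nodes currently alive, for demonstration only
  std::vector<std::unique_ptr<node>> children;

  node () { ++live; }
  ~node () { --live; }
  node (const node &) = delete;
  node &operator= (const node &) = delete;

  // Append a new child and return a non-owning pointer to it.
  node *add_child ()
  {
    children.push_back (std::make_unique<node> ());
    return children.back ().get ();
  }
};

int node::live = 0;

// Build a small tree with an on-stack root and return how many nodes existed
// while it was alive.  Everything is destroyed when ROOT goes out of scope.
int build_and_count ()
{
  node root;                        // on the stack, no new/delete
  node *child = root.add_child ();
  child->add_child ();
  assert (node::live == 3);         // root, child, grandchild
  return node::live;
}
```

After `build_and_count` returns, `node::live` is back to zero without any explicit delete.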

Martin

> 
> Dave
> 



New Russian PO file for 'gcc' (version 8.2.0)

2018-10-01 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Russian team of translators.  The file is available at:

http://translationproject.org/latest/gcc/ru.po

(This file, 'gcc-8.2.0.ru.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: RFC: variant and ODR based type merging during LTO streaming

2018-10-01 Thread Richard Biener
On Mon, 1 Oct 2018, Richard Biener wrote:

> On Mon, 1 Oct 2018, Jan Hubicka wrote:
> 
> > > 
> > > The ODR savings really look good but as you say the implementation is
> > > somewhat "tricky".
> > > 
> > > I'd like to see the type-variant done separately (of course) and
> > > also differently.  This is because when enabling 
> > > free-lang-data by default we should be able to get benefits for
> > > non-LTO compilations as well.  Specifically I'd like us to
> > > (in free-lang-data) "canonicalize" existing variants applying the
> > > rules you derived for the lazier compare.  At the point you then
> > > drop non-fld-walked variants we could weed out the resulting
> > > duplicates, keeping only a prevailing one in the list and (ab-)using
> > > GC to fixup references.  Like with ggc_register_fixup (void *from, void 
> > > *to) which would when following edges at mark time, adjust them
> > > according to this map (I'm not sure if we'd want to do it manually
> > > and record a vector of possibly affected TREE_TYPE uses during the fld
> > > walk).  We could also (easily?) optimize the streaming itself by
> > > streaming only differences to the main variant of a type (but then
> > > we could change our tree data structure to make that trivial).
> > 
> > I had patch for streaming the differences only some time ago. My 
> > recollection
> > is that we got it into tree and then reverted as there was some issues with
> > cycles.  I can look into this again.
> > 
> > I would really like to avoid using ggc for IL rewriting. One reason is
> > that eventually we would like to see GGC to go and thus we should not
> > wire it more into the essential parts of the compiler and other is that
> > GGC is often not run. 
> 
> Heh.
> 
> > Saving type variants seems to have relatively minor effect on the IL size, 
> > so
> > we need to have solution that is not too expensive to be justified.  I 
> > suppose
> > free lang data is sort of visiting all the relevant datastructures for
> > middle-end so we could do rewriting there.  It would also make sense to
> > canonicalize types more based on knowledge whether they are used for memory
> > access (i.e. for non-accesses go to main variant and perhaps invent 
> > something
> > like main variant WRT useless conversions).
> 
> I guess for that it would be better to put things like memory access
> alignment into the memory-references rather than using TYPE_ALIGN.
> Likewise for other things affecting semantics (TYPE_REF_CAN_ALIAS_ALL).
> 
> Being able to simply drop to TYPE_MAIN_VARIANT would be very appealing...
> (and simplify things).  The hard thing is to figure out where we look
> into those variant differences during late compilation...
> 
> Type variants was always my first "easy" middle-end type related cleanup.
> And for non-LTO ODR merging probably gets us nothing (the FE should do
> the merging).
> 
> > We could ggc_free the old variant and then ggc will ICE on dangling 
> > pointers.
> > I can give it a try with my patch to prune the variant tree.
> 
> Yes, definitely do that.  I also _really_ like us to do FLD 
> unconditionally - maybe we can start by guarding individual pieces with
> a for_lto flag we pass down.  But esp. disabling langhooks and stuff like
> the variant type purging would be good to get enabled for non-LTO
> to not have that modes diverge more and more...
> 
> > > 
> > > I wonder how much "ODR" merging we'd get for free when we canonicalize
> > > types some more - that is, how do ODR equal but tree unequal types
> > > usually differ?  I guess it might be most of the time fields with
> > > pointer to incomplete vs. complete type?
> > 
> > Most of the time it is complete vs incomplete pointer type.
> > I had statistics of type duplicates before, but they did not look as bad
> > as reality. The reason is that smaller types tends to be merged while
> > bigger types tends to be duplicated many times.  In GCC this is typically
> > RTL, gimple and derived types which are really many.
> 
> I see.  So one possible canonicalization is to make _all_
> pointer-typed FIELD_DECLs point to incomplete variants since the memory
> accesses should already have the "proper" access types.  Can you
> get statistics on that?  Not sure how to get an "incomplete" type
> though (iff we can simply copy the type and NULL TYPE_FIELDs and
> TYPE_SIZE and friends) - again I'd do that at FLD time.

So something like

 tp = build_distinct_type_copy (t);
 TYPE_FIELDS (tp) = NULL_TREE;
 TYPE_SIZE (tp) = NULL_TREE;
 TYPE_SIZE_UNIT (tp) = NULL_TREE;
 tp = type_hash_canon (tp);

Of course we "leak" the original type in used COMPONENT_REFs
(this may also cause some verifier ICEs if the types mismatch those
of the FIELD_DECLs) and in aggregate copies, etc.  But I wonder
how many "unused" unnecessary types we have.  That is, I'd paper
over the ICEs this causes and not fix up the IL stream at first, for
example.

Richard.

> Richard.
> 
> > Honza
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > > 
> > > >

Re: vector _M_start and 0 offset

2018-10-01 Thread Jonathan Wakely

On 29/09/18 10:56 +0200, Marc Glisse wrote:

> Hello,
>
> here is a clang-friendly version of the patch (same changelog), tested
> a while ago. Is it ok or do you prefer something like the
>
> + if(this->_M_impl._M_start._M_offset != 0) __builtin_unreachable();
>
> version suggested by François?


I don't think __builtin_unreachable would improve the clarity of the code.

The patch is OK for trunk, thanks.
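
For readers unfamiliar with the `__builtin_unreachable` idiom mentioned above: it is a GCC/Clang builtin, and placing it behind a condition tells the optimizer that the condition never holds. The sketch below shows the idiom in isolation. `bit_pos`, `assume_zero_offset` and `bits_from_start` are hypothetical names chosen for illustration and are not libstdc++ code.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical iterator-like position: a word pointer plus a bit offset.
struct bit_pos
{
  const unsigned long *word;
  unsigned int offset;
};

// Tell the optimizer that P.offset is zero.  Reaching __builtin_unreachable
// is undefined behavior, so this must only be called when the claim is true.
inline void assume_zero_offset (const bit_pos &p)
{
  if (p.offset != 0)
    __builtin_unreachable ();
}

// Distance in bits from START to POS.  With the assumption in place the
// compiler is free to drop the subtraction of start.offset.
inline std::ptrdiff_t
bits_from_start (const bit_pos &start, const bit_pos &pos)
{
  assert (pos.word >= start.word);
  assume_zero_offset (start);
  const std::ptrdiff_t bits_per_word = sizeof (unsigned long) * CHAR_BIT;
  return (pos.word - start.word) * bits_per_word
         + static_cast<std::ptrdiff_t> (pos.offset)
         - static_cast<std::ptrdiff_t> (start.offset);
}
```

The trade-off is the one raised in the reply: the hint helps code generation, but it adds a line whose only purpose is to talk to the optimizer.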




Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-10-01 Thread H.J. Lu
On Sun, Sep 30, 2018 at 6:18 PM Peter Bergner  wrote:
>
> On 9/30/18 7:57 PM, H.J. Lu wrote:
> > This caused:
> >
> > FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
> > \\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
> > FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ \\t]+c@GOTOFF[(]%ebx[)]
>
> Can you check whether the new generated code is at least as good
> as the old generated code?  I'm assuming the code we generate now isn't
> wrong, just different and maybe we just need to change what we expect
> to see.

I checked gcc.target/i386/pr63527.c and it has a regression.

Before:

 :
   0:  53                     push   %ebx
   1:  e8 fc ff ff ff         call   2
   6:  81 c3 02 00 00 00      add    $0x2,%ebx
   c:  83 ec 08               sub    $0x8,%esp
   f:  e8 fc ff ff ff         call   10
  14:  e8 fc ff ff ff         call   15
  19:  83 c4 08               add    $0x8,%esp
  1c:  5b                     pop    %ebx
  1d:  c3                     ret

Disassembly of section .text.__x86.get_pc_thunk.bx:

 <__x86.get_pc_thunk.bx>:
   0:  8b 1c 24               mov    (%esp),%ebx
   3:  c3                     ret

After:

 :
   0:  56                     push   %esi
   1:  e8 fc ff ff ff         call   2
   6:  81 c6 02 00 00 00      add    $0x2,%esi
   c:  53                     push   %ebx
   d:  83 ec 04               sub    $0x4,%esp
  10:  89 f3                  mov    %esi,%ebx
  12:  e8 fc ff ff ff         call   13
  17:  e8 fc ff ff ff         call   18
  1c:  83 c4 04               add    $0x4,%esp
  1f:  5b                     pop    %ebx
  20:  5e                     pop    %esi
  21:  c3                     ret

Disassembly of section .text.__x86.get_pc_thunk.si:

 <__x86.get_pc_thunk.si>:
   0:  8b 34 24               mov    (%esp),%esi
   3:  c3                     ret

-- 
H.J.


Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-10-01 Thread H.J. Lu
On Mon, Oct 1, 2018 at 5:44 AM H.J. Lu  wrote:
>
> On Sun, Sep 30, 2018 at 6:18 PM Peter Bergner  wrote:
> >
> > On 9/30/18 7:57 PM, H.J. Lu wrote:
> > > This caused:
> > >
> > > FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
> > > \\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
> > > FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ 
> > > \\t]+c@GOTOFF[(]%ebx[)]
> >
> > Can you check whether the new generated code is at least as good
> > as the old generated code?  I'm assuming the code we generate now isn't
> > wrong, just different and maybe we just need to change what we expect
> > to see.
>
> I checked gcc.target/i386/pr63527.c and it has a regression.
>
> Before:
>
>  :
>    0:  53                     push   %ebx
>    1:  e8 fc ff ff ff         call   2
>    6:  81 c3 02 00 00 00      add    $0x2,%ebx
>    c:  83 ec 08               sub    $0x8,%esp
>    f:  e8 fc ff ff ff         call   10
>   14:  e8 fc ff ff ff         call   15
>   19:  83 c4 08               add    $0x8,%esp
>   1c:  5b                     pop    %ebx
>   1d:  c3                     ret
>
> Disassembly of section .text.__x86.get_pc_thunk.bx:
>
>  <__x86.get_pc_thunk.bx>:
>    0:  8b 1c 24               mov    (%esp),%ebx
>    3:  c3                     ret
>
> After:
>
>  :
>    0:  56                     push   %esi
>    1:  e8 fc ff ff ff         call   2
>    6:  81 c6 02 00 00 00      add    $0x2,%esi
>    c:  53                     push   %ebx
>    d:  83 ec 04               sub    $0x4,%esp
>   10:  89 f3                  mov    %esi,%ebx
>   12:  e8 fc ff ff ff         call   13
>   17:  e8 fc ff ff ff         call   18
>   1c:  83 c4 04               add    $0x4,%esp
>   1f:  5b                     pop    %ebx
>   20:  5e                     pop    %esi
>   21:  c3                     ret
>
> Disassembly of section .text.__x86.get_pc_thunk.si:
>
>  <__x86.get_pc_thunk.si>:
>    0:  8b 34 24               mov    (%esp),%esi
>    3:  c3                     ret
>

You may have undone:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059

-- 
H.J.


Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-10-01 Thread H.J. Lu
On Mon, Oct 1, 2018 at 5:45 AM H.J. Lu  wrote:
>
> On Mon, Oct 1, 2018 at 5:44 AM H.J. Lu  wrote:
> >
> > On Sun, Sep 30, 2018 at 6:18 PM Peter Bergner  wrote:
> > >
> > > On 9/30/18 7:57 PM, H.J. Lu wrote:
> > > > This caused:
> > > >
> > > > FAIL: gcc.target/i386/pr63527.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > > FAIL: gcc.target/i386/pr63534.c scan-assembler-not movl[ \t]%[^,]+, %ebx
> > > > FAIL: gcc.target/i386/pr64317.c scan-assembler addl[
> > > > \\t]+[$]_GLOBAL_OFFSET_TABLE_, %ebx
> > > > FAIL: gcc.target/i386/pr64317.c scan-assembler movl[ 
> > > > \\t]+c@GOTOFF[(]%ebx[)]
> > >
> > > Can you check whether the new generated code is at least as good
> > > as the old generated code?  I'm assuming the code we generate now isn't
> > > wrong, just different and maybe we just need to change what we expect
> > > to see.
> >
> > I checked gcc.target/i386/pr63527.c and it has a regression.
> >
> > Before:
> >
> >  :
> >    0:  53                     push   %ebx
> >    1:  e8 fc ff ff ff         call   2
> >    6:  81 c3 02 00 00 00      add    $0x2,%ebx
> >    c:  83 ec 08               sub    $0x8,%esp
> >    f:  e8 fc ff ff ff         call   10
> >   14:  e8 fc ff ff ff         call   15
> >   19:  83 c4 08               add    $0x8,%esp
> >   1c:  5b                     pop    %ebx
> >   1d:  c3                     ret
> >
> > Disassembly of section .text.__x86.get_pc_thunk.bx:
> >
> >  <__x86.get_pc_thunk.bx>:
> >    0:  8b 1c 24               mov    (%esp),%ebx
> >    3:  c3                     ret
> >
> > After:
> >
> >  :
> >    0:  56                     push   %esi
> >    1:  e8 fc ff ff ff         call   2
> >    6:  81 c6 02 00 00 00      add    $0x2,%esi
> >    c:  53                     push   %ebx
> >    d:  83 ec 04               sub    $0x4,%esp
> >   10:  89 f3                  mov    %esi,%ebx
> >   12:  e8 fc ff ff ff         call   13
> >   17:  e8 fc ff ff ff         call   18
> >   1c:  83 c4 04               add    $0x4,%esp
> >   1f:  5b                     pop    %ebx
> >   20:  5e                     pop    %esi
> >   21:  c3                     ret
> >
> > Disassembly of section .text.__x86.get_pc_thunk.si:
> >
> >  <__x86.get_pc_thunk.si>:
> >    0:  8b 34 24               mov    (%esp),%esi
> >    3:  c3                     ret
> >
>
> You may have undone:
>
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059
>

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87479

-- 
H.J.


[PATCH][C]/[C++] Remove DECL_FROM_INLINE use

2018-10-01 Thread Richard Biener


This patch removes the checks of DECL_FROM_INLINE from the respective
-Wshadow code of the C and C++ front ends.  I noticed those when looking
into DECL_ABSTRACT_ORIGIN uses.  Those checks may date from the time when
we did inlining very early?

Bootstrapped and tested on x86_64-unknown-linux-gnu with no regressions.

OK for trunk?

Thanks,
Richard.

2018-10-01  Richard Biener  

c/
* c-decl.c (warn_if_shadowing): Do not test DECL_FROM_INLINE.

cp/
* name-lookup.c (check_local_shadow): Do not test DECL_FROM_INLINE.

Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c  (revision 264757)
+++ gcc/c/c-decl.c  (working copy)
@@ -2784,9 +2784,7 @@ warn_if_shadowing (tree new_decl)
 || warn_shadow_local
 || warn_shadow_compatible_local)
   /* No shadow warnings for internally generated vars.  */
-  || DECL_IS_BUILTIN (new_decl)
-  /* No shadow warnings for vars made for inlining.  */
-  || DECL_FROM_INLINE (new_decl))
+  || DECL_IS_BUILTIN (new_decl))
 return;
 
   /* Is anything being shadowed?  Invisible decls do not count.  */
Index: gcc/cp/name-lookup.c
===
--- gcc/cp/name-lookup.c(revision 264757)
+++ gcc/cp/name-lookup.c(working copy)
@@ -2628,10 +2628,6 @@ check_local_shadow (tree decl)
   if (TREE_CODE (decl) == PARM_DECL && !DECL_CONTEXT (decl))
 return;
 
-  /* Inline decls shadow nothing.  */
-  if (DECL_FROM_INLINE (decl))
-return;
-
   /* External decls are something else.  */
   if (DECL_EXTERNAL (decl))
 return;


Re: RFC: variant and ODR based type merging during LTO streaming

2018-10-01 Thread Jan Hubicka
> > I see.  So one possible canonicalization is to make _all_
> > pointer-typed FIELD_DECLs point to incomplete variants since the memory
> > accesses should already have the "proper" access types.  Can you
> > get statistics on that?  Not sure how to get an "incomplete" type
> > though (iff we can simply copy the type and NULL TYPE_FIELDs and
> > TYPE_SIZE and friends) - again I'd do that at FLD time.
> 
> So sth like
> 
>  tp = build_distinct_type_copy (t);
>  TYPE_FIELDS (tp) = NULL_TREE;
>  TYPE_SIZE (tp) = NULL_TREE;
>  TYPE_SIZE_UNIT (tp) = NULL_TREE;
>  tp = type_hash_canon (tp);
> 
> of course we "leak" the original type in used COMPONENT_REFs
> (may also cause some verifier ICEs here if the types mismatch that
> of the FIELD_DECLs) and in aggregate copies, etc.  But I wonder
> how much "unused" unnecessary types we have.  That is, I'd paper
> over the ICEs this causes and not fixup the IL stream at first for
> example.

I had a patch to play with this as well; let me see if I can revive it.
One problem here is that we will lose info about ODR violations that happen
through pointers.

Honza


Re: RFC: variant and ODR based type merging during LTO streaming

2018-10-01 Thread Richard Biener
On Mon, 1 Oct 2018, Jan Hubicka wrote:

> > > I see.  So one possible canonicalization is to make _all_
> > > pointer-typed FIELD_DECLs point to incomplete variants since the memory
> > > accesses should already have the "proper" access types.  Can you
> > > get statistics on that?  Not sure how to get an "incomplete" type
> > > though (iff we can simply copy the type and NULL TYPE_FIELDs and
> > > TYPE_SIZE and friends) - again I'd do that at FLD time.
> > 
> > So sth like
> > 
> >  tp = build_distinct_type_copy (t);
> >  TYPE_FIELDS (tp) = NULL_TREE;
> >  TYPE_SIZE (tp) = NULL_TREE;
> >  TYPE_SIZE_UNIT (tp) = NULL_TREE;
> >  tp = type_hash_canon (tp);
> > 
> > of course we "leak" the original type in used COMPONENT_REFs
> > (may also cause some verifier ICEs here if the types mismatch that
> > of the FIELD_DECLs) and in aggregate copies, etc.  But I wonder
> > how much "unused" unnecessary types we have.  That is, I'd paper
> > over the ICEs this causes and not fixup the IL stream at first for
> > example.
> 
> I had patch to play with this as well, let me see if I can revive it.
> One problem here is that we will lose info about ODR violations that happens
> through pointers.

How so, if we keep the mangled name of the pointed-to types?

Richard.

> Honza
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: RFC: variant and ODR based type merging during LTO streaming

2018-10-01 Thread Jan Hubicka
> On Mon, 1 Oct 2018, Jan Hubicka wrote:
> 
> > > > I see.  So one possible canonicalization is to make _all_
> > > > pointer-typed FIELD_DECLs point to incomplete variants since the memory
> > > > accesses should already have the "proper" access types.  Can you
> > > > get statistics on that?  Not sure how to get an "incomplete" type
> > > > though (iff we can simply copy the type and NULL TYPE_FIELDs and
> > > > TYPE_SIZE and friends) - again I'd do that at FLD time.
> > > 
> > > So sth like
> > > 
> > >  tp = build_distinct_type_copy (t);
> > >  TYPE_FIELDS (tp) = NULL_TREE;
> > >  TYPE_SIZE (tp) = NULL_TREE;
> > >  TYPE_SIZE_UNIT (tp) = NULL_TREE;
> > >  tp = type_hash_canon (tp);
> > > 
> > > of course we "leak" the original type in used COMPONENT_REFs
> > > (may also cause some verifier ICEs here if the types mismatch that
> > > of the FIELD_DECLs) and in aggregate copies, etc.  But I wonder
> > > how much "unused" unnecessary types we have.  That is, I'd paper
> > > over the ICEs this causes and not fixup the IL stream at first for
> > > example.
> > 
> > I had patch to play with this as well, let me see if I can revive it.
> > One problem here is that we will lose info about ODR violations that happens
> > through pointers.
> 
> How so, if we keep the mangled name of the pointed-to types?

If you have "struct a" which violates the ODR rule between units foo and
bar, then you need to distinguish between the struct a *ptr defined in
foo and the one in bar. If tree merging merges them, we lose this info.
It is true that we can then just declare any structure containing "struct a*"
as ODR-violating, which is probably good enough.

honza
> 
> Richard.
> 
> > Honza
> > 
> > 
> 
> -- 
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)


Re: [PATCH, AArch64 10/11] aarch64: Implement TImode compare-and-swap

2018-10-01 Thread Matthew Malcomson

Hi Richard,

On 26/09/18 06:03, rth7...@gmail.com wrote:

> From: Richard Henderson
>
> This pattern will only be used with the __sync functions, because
> we do not yet have a bare TImode atomic load.


Does this mean that the libatomic `defined(atomic_compare_exchange_n)`
checks would return false for 16-byte sizes?

(the acinclude.m4 file checks for __atomic_compare_exchange_n)

You would know better than I, but if that's the case it seems that the 
atomic_{load,store}_16 implementations in libatomic would still use the 
locking ABI, and e.g. atomic_load_16 could be interrupted by using the 
CASP instruction to produce an incorrect value.


Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-10-01 Thread Peter Bergner
On 10/1/18 7:50 AM, H.J. Lu wrote:
> On Mon, Oct 1, 2018 at 5:45 AM H.J. Lu  wrote:
>> You may have undone:
>>
>> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059
>>
> 
> I opened:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87479


Thanks for checking.  I'll have a look.

Peter





Re: [PATCH] libgo: Don't assume sys.GoarchAmd64 == 64-bit pointer

2018-10-01 Thread Ian Lance Taylor
"H.J. Lu"  writes:

> On Sat, Sep 29, 2018 at 9:01 PM Ian Lance Taylor  wrote:
>>
>> "H.J. Lu"  writes:
>>
>> > On x86-64, sys.GoarchAmd64 == 1 for -mx32.  But -mx32 has 32-bit
>> > pointer, not 64-bit.  There is
>> >
>> > // _64bit = 1 on 64-bit systems, 0 on 32-bit systems
>> > _64bit = 1 << (^uintptr(0) >> 63) / 2
>> >
>> > We should check both _64bit and sys.GoarchAmd64.
>>
>> Thanks, but I think the correct fix is to set GOARCH to amd64p32 when
>> using x32.  I'm trying that to see if it will work out.
>>
>
> My understanding is amd64p32 == NaCl.  But x32 != NaCl.

For Go, GOARCH=amd64p32 GOOS=nacl is NaCl.  But logically amd64p32 means
amd64 with 32-bit pointers, so it may be possible to repurpose
GOARCH=amd64p32 GOOS=linux for x32.
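
The distinction being discussed (instruction set versus pointer width) also shows up on the C side.  What follows is only a minimal sketch, not libgo code: the function names are made up for illustration, and it relies on the assumption that GCC defines __x86_64__ under -mx32 as well.  The second function is a C analogue of the Go _64bit expression quoted above, with the value widened first because shifting a 32-bit value by 63 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* 1 when compiling for the amd64 instruction set.  Assumption: GCC also
   defines __x86_64__ under -mx32, so this says nothing about pointer
   width (the analogue of sys.GoarchAmd64 == 1 on x32).  */
int arch_is_amd64(void)
{
#if defined(__x86_64__)
  return 1;
#else
  return 0;
#endif
}

/* C analogue of the Go constant
     _64bit = 1 << (^uintptr(0) >> 63) / 2
   yielding 1 with 64-bit pointers and 0 with 32-bit pointers.  */
int pointers_are_64bit(void)
{
  /* Widen to 64 bits before shifting so the shift is always defined.  */
  unsigned shift = (unsigned) ((uint64_t) UINTPTR_MAX >> 63);
  return (1 << shift) / 2;
}

/* The combined check suggested in the original patch: amd64 *and*
   64-bit pointers.  */
int can_assume_64bit_pointers(void)
{
  assert(pointers_are_64bit() == (sizeof(void *) == 8));
  return arch_is_amd64() && pointers_are_64bit();
}
```

On an x32 build one would expect arch_is_amd64() to return 1 while can_assume_64bit_pointers() returns 0, which is the case the patch is worried about.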

Ian


Re: [PATCH] dumpfile.c: use prefixes other than 'note: ' for MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}

2018-10-01 Thread David Malcolm
On Sun, 2018-09-30 at 00:12 +0200, Andreas Schwab wrote:
> That produces extra output that breaks a few tests.
> 
> g++.dg/vect/pr33426-ivdep-2.cc  -std=c++11 (test for excess errors)
> g++.dg/vect/pr33426-ivdep-2.cc  -std=c++14 (test for excess errors)
> g++.dg/vect/pr33426-ivdep-2.cc  -std=c++98 (test for excess errors)
> g++.dg/vect/pr33426-ivdep-3.cc   (test for excess errors)
> g++.dg/vect/pr33426-ivdep-4.cc   (test for excess errors)
> g++.dg/vect/pr33426-ivdep.cc  -std=c++11 (test for excess errors)
> g++.dg/vect/pr33426-ivdep.cc  -std=c++14 (test for excess errors)
> g++.dg/vect/pr33426-ivdep.cc  -std=c++98 (test for excess errors)
> gcc.dg/vect/nodump-vect-opt-info-1.c (test for excess errors)
> gcc.dg/vect/vect-ivdep-1.c (test for excess errors)
> gcc.dg/vect/vect-ivdep-1.c -flto -ffat-lto-objects (test for excess
> errors)
> gcc.dg/vect/vect-ivdep-2.c (test for excess errors)
> gcc.dg/vect/vect-ivdep-2.c -flto -ffat-lto-objects (test for excess
> errors)
> 
> FAIL: gcc.dg/vect/vect-ivdep-1.c (test for excess errors)
> Excess errors:
> /usr/local/gcc/gcc-20180929/gcc/testsuite/gcc.dg/vect/vect-ivdep-
> 1.c:11:3: optimized:  loop versioned for vectorization to enhance
> alignment

Thanks for the report; sorry about the breakage.

What target is this for?  I'm not seeing these issues on
x86_64-pc-linux-gnu.

I think that what's happening is that my patch changed various existing
dump messages from -fopt-info from being "note: " to being
"optimized: " or "missed: ".

gcc/testsuite/lib/prune.exp has:

# Ignore informational notes.
regsub -all "(^|\n)\[^\n\]*: note: \[^\n\]*" $text "" text

which strips out all notes after dg-message directives have been
checked.  Presumably these pre-existing "note: " dump messages were
being ignored, and are no longer matching that pattern.

I can see two approaches to fixing this:

(a) extend those lines in prune.exp to also ignore "optimized: " and
"missed: "

(b) figure out the criteria for when these messages appear, and add new
dg-optimized and dg-missed directives to the tests in question, with
suitable filters.
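
To make the effect of approach (a) concrete, here is a rough sketch in C rather than Tcl.  It is not the prune.exp code; prune_info_lines and the prefix table are hypothetical names, and unlike the regsub above it drops the trailing newline of a pruned line rather than the leading one.  It only illustrates that lines carrying ": optimized: " or ": missed: " would be discarded the same way ": note: " lines are today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Diagnostic kinds to discard; the last two are the proposed additions.  */
static const char *const pruned_kinds[] = {
  ": note: ", ": optimized: ", ": missed: "
};

/* Return 1 if NEEDLE occurs within the LEN bytes starting at LINE.  */
static int line_has(const char *line, size_t len, const char *needle)
{
  size_t nlen = strlen(needle);
  for (size_t i = 0; i + nlen <= len; i++)
    if (memcmp(line + i, needle, nlen) == 0)
      return 1;
  return 0;
}

/* Copy TEXT to OUT, skipping every line that contains one of the
   pruned kinds.  OUT must hold at least strlen(TEXT) + 1 bytes.
   Returns the number of lines dropped.  */
size_t prune_info_lines(const char *text, char *out)
{
  size_t dropped = 0;
  while (*text)
    {
      const char *nl = strchr(text, '\n');
      size_t len = nl ? (size_t) (nl - text) : strlen(text);
      size_t span = nl ? len + 1 : len;  /* line plus its newline */
      int drop = 0;
      for (size_t k = 0; k < sizeof pruned_kinds / sizeof pruned_kinds[0]; k++)
        if (line_has(text, len, pruned_kinds[k]))
          drop = 1;
      if (drop)
        dropped++;
      else
        {
          memcpy(out, text, span);
          out += span;
        }
      text += span;
    }
  *out = '\0';
  return dropped;
}
```

With this, a log containing one "optimized:" line, one "error:" line and one "note:" line would come out with only the error line left, so the excess-errors check would see the same thing it saw before the prefix change.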

Is there a link to the .log files somewhere so I can see the precise
messages in question?  (e.g. are they all "loop versioned for
vectorization to enhance alignment"?).

Thanks, and sorry again about the breakage.
Dave


Re: [PATCH] dumpfile.c: use prefixes other than 'note: ' for MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}

2018-10-01 Thread Andreas Schwab
On Oct 01 2018, David Malcolm  wrote:

> Is there a link to the .log files somewhere so I can see the precise
> messages in question?  (e.g. are they all "loop versioned for
> vectorization to enhance alignment"?).

Yes, they are all the same message.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [libstdc++,doc] adjust link to www.oracle.com

2018-10-01 Thread Jonathan Wakely

On 30/09/18 16:06 +0200, Gerald Pfeifer wrote:
> I applied the patch below.  Given a number of fixes in libstdc++/doc
> I've applied recently, can one of you please regenerate the HTML pages
> in the next days?


Done and committed as r264760.

   Regenerate libstdc++ HTML pages
   
   * doc/html/*: Regenerate.





RFC: x86 reduc_plus_scal_* AVX (and AVX512?) expanders

2018-10-01 Thread Richard Biener

I notice that for Zen we create

  0.00 │   vhaddp %ymm3,%ymm3,%ymm3
  1.41 │   vperm2 $0x1,%ymm3,%ymm3,%ymm1
  1.45 │   vaddpd %ymm1,%ymm2,%ymm2

from reduc_plus_scal_v4df, which uses a cross-lane permute (vperm2f128)
even though the upper half of the result is unused in the end
(we only use element zero of the double-precision vector).  Much better
would be to use vextractf128, which is well-pipelined and has good
throughput (though using vhaddpd in itself is quite bad for Zen; I didn't
try benchmarking it against open-coding that yet, aka disabling the
expander).  I can generate

vhaddpd %ymm3, %ymm3, %ymm3
vextractf128    $0x1, %ymm3, %xmm1
vaddpd  %xmm1, %xmm3, %xmm3

with

Index: gcc/config/i386/sse.md
===
--- gcc/config/i386/sse.md  (revision 264758)
+++ gcc/config/i386/sse.md  (working copy)
@@ -2474,12 +2474,12 @@ (define_expand "reduc_plus_scal_v4df"
   "TARGET_AVX"
 {
   rtx tmp = gen_reg_rtx (V4DFmode);
-  rtx tmp2 = gen_reg_rtx (V4DFmode);
-  rtx vec_res = gen_reg_rtx (V4DFmode);
+  rtx tmp2 = gen_reg_rtx (V2DFmode);
+  rtx vec_res = gen_reg_rtx (V2DFmode);
   emit_insn (gen_avx_haddv4df3 (tmp, operands[1], operands[1]));
-  emit_insn (gen_avx_vperm2f128v4df3 (tmp2, tmp, tmp, GEN_INT (1)));
-  emit_insn (gen_addv4df3 (vec_res, tmp, tmp2));
-  emit_insn (gen_vec_extractv4dfdf (operands[0], vec_res, const0_rtx));
+  emit_insn (gen_vec_extract_hi_v4df (tmp2, tmp));
+  emit_insn (gen_addv2df3 (vec_res, gen_lowpart (V2DFmode, tmp), tmp2));
+  emit_insn (gen_vec_extractv2dfdf (operands[0], vec_res, const0_rtx));
   DONE;
 })
 

easily, though even using a scalar operation for the add would be possible.

reduc_plus_scal_v8df uses ix86_expand_reduc, which seems to use
full-width instructions throughout.  I recently changed the vectorizer
to open-code tem%ymm = lowpart(%zmm) + highpart(%zmm);
tem%xmm = lowpart(tem%ymm) + highpart(tem%ymm); reduce(tem%xmm), which
is better for all cores, so I wonder if ix86_expand_reduc should follow
that scheme.  As said in a related PR, the backend is in full control
of the final reduction sequence used when defining reduc_plus_scal_,
which IMHO is good, and we likely should have tuning-specific patterns
in case there isn't a one-size-fits-all one.
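
For readers less familiar with the lowpart/highpart scheme, here is a portable scalar model of it in C.  This is only an illustration of the halving idea, not what the vectorizer or ix86_expand_reduc emits; the function name is hypothetical, and the model keeps halving down to a single element, whereas the real sequence stops at the %xmm width and hands the rest to the target's reduction pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the N elements of V by repeatedly adding the high half onto the
   low half, i.e. tem = lowpart(v) + highpart(v) at each step.
   N must be a power of two; V is overwritten with partial sums.  */
double reduce_plus_halving(double *v, size_t n)
{
  assert(n != 0 && (n & (n - 1)) == 0);
  while (n > 1)
    {
      size_t half = n / 2;
      /* One "narrower vector add": width halves on every iteration.  */
      for (size_t i = 0; i < half; i++)
        v[i] += v[half + i];
      n = half;
    }
  return v[0];
}
```

For eight doubles this takes three adds of width 4, 2 and 1, which mirrors the zmm to ymm to xmm narrowing: every step after the first operates on a narrower register, which is the property that makes the scheme cheap on cores that split wide operations.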

Thanks,
Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: ((X /[ex] A) +- B) * A --> X +- A * B

2018-10-01 Thread Marc Glisse

On Mon, 1 Oct 2018, Richard Biener wrote:

> On Sat, Sep 29, 2018 at 1:35 PM Marc Glisse  wrote:
>
>> Hello,
>>
>> I noticed quite ugly code from both testcases. This transformation does
>> not fix either, but it helps a bit.
>
> I'm curious why you chose to restrict to INTEGER_CST A and B?
> Is that because of the case when (X / [ex] A) +- B evaluates to zero
> but A * B overflows?  Can that ever happen?  Isn't it enough to know
> that A isn't -1?  That is, can we use expr_not_equal_to or friends
> to put constraints on possibly non-constant A/B?


For A, I don't remember seeing a divexact with non-constant denominator in 
gcc. For B, constants are what I was seeing in testcases, I was even 
tempted to implement it only for B==1 which is simpler.


At first, I only had the version without casts, and for that I needed to 
be able to check for overflow, so constants. Now that I have a version 
that casts to unsigned, it would be valid with arbitrary A and B indeed. 
There are also a lot of casts making my head hurt (we may need one more 
for the last @1 if it isn't a constant anymore).


I was also thinking of things like ((X/[ex]3)+1)*6, but that doesn't occur 
as often.
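
As a sanity check of the identity behind the cast-to-unsigned version, here is a small C sketch.  It is not the match.pd pattern; the two function names are made up, and it assumes 32-bit two's-complement integers.  The point is that when A divides X exactly, (X / A) * A is X again, so doing the add and multiply in unsigned (wrapping) arithmetic gives the same bits on both sides without any overflow check.

```c
#include <assert.h>
#include <stdint.h>

/* ((X /[ex] A) + B) * A, with the add and multiply done in unsigned
   arithmetic so that wrap-around is well defined.  */
uint32_t orig_form(int32_t x, int32_t a, uint32_t b)
{
  /* The division must be exact and must not itself overflow.  */
  assert(a != 0 && !(x == INT32_MIN && a == -1) && x % a == 0);
  return ((uint32_t) (x / a) + b) * (uint32_t) a;
}

/* X + A * B, again in unsigned arithmetic.  */
uint32_t folded_form(int32_t x, int32_t a, uint32_t b)
{
  return (uint32_t) x + (uint32_t) a * b;
}
```

For example, with X = 12, A = 4 and B = 5 both forms give 32, and with X = 0x7ffffffc, A = 4 and B = 1 both give 0x80000000, a case where the signed computation of X + A * B would have overflowed.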



> Otherwise the patch is of course OK and the above would just improve
> it.


I'll commit it and see if I can find some time...

--
Marc Glisse


Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)

2018-10-01 Thread Qing Zhao
Hi, Martin,

I have studied a little more on

https://github.com/marxin/kgraft-analysis-tool/blob/master/README.md 


In the section “Usages”, from the example, we can see:

the tool will report a list of affected functions for a function that will be
patched.
In this list, it includes all callers of the patched function, and the
functions cloned from the patched function due to IPA constant propagation
or IPA SRA.

My question:

What is the current approach for handling the functions cloned from the
patched function due to IPA constant propagation, IPA SRA, etc.?

Since those cloned functions do NOT exist at the source-code level, how are
the patches for the cloned functions generated? How can we guarantee that,
after the patched function is changed, the same IPA constant propagation or
IPA SRA will still happen?

I am a little confused here.

Thanks.

Qing
> On Sep 27, 2018, at 7:19 AM, Martin Jambor  wrote:
> 
> Hi,
> 
> (this message is a part of the thread originating with
> https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01018.html)
> 
> On Thu, Sep 27 2018, Jan Hubicka wrote:
>>>> If you make this to be INTERPOSABLE (which means it can be replaced by
>>>> different implementation by linker and that is probably what we want for
>>>> live patching) then also inliner, ipa-sra and other optimization will
>>>> give up on these.
>>> 
>>> do you suggest that to set the global function as AVAIL_INTERPOSABLE when 
>>> -finline-only-static 
>>> is present? then we should avoid all issues?
>> 
>> It seems to be reasonable direction I think, because it is what really 
>> happens
>> (well AVAIL_INTERPOSABLE still does not assume that the interposition will
>> happen at runtime, but it is an approximation and we may introduce something 
>> like
>> AVAIL_RUNTIME_INTERPOSABLE if there is need for better difference).
>> I wonder if -finline-only-static is good name for the flag though, because it
>> does a lot more than that.  Maybe something like -flive-patching?
>> How much is this all tied to one particular implementation of the feature?
> 
> We have just had a quick discussion with two upstream maintainers of
> Linux kernel live-patching about this and the key points were:
> 
> 1. SUSE live-patch creators (and I assume all that use the upstream
>   live-patching method) use Martin Liska's (somewhat under-documented)
>   -fdump-ipa-clones option and a utility he wrote
>   (https://github.com/marxin/kgraft-analysis-tool) to deal with all
>   kinds of inlining, IPA-CP and generally all IPA optimizations that
>   internally create a clone.  The tool tells them what happened and
>   also lists all callers that need to be live-patched.
> 
> 2. However, there is growing concern about other IPA analyses that do
>   not create a clone but still affect code generation in other
>   functions.  Kernel developers have identified and disabled IPA-RA but
>   there is more of them such as IPA-modref analysis, stack alignment
>   propagation and possibly quite a few others which extract information
>   from one function and use it a caller or perhaps even some
>   almost-unrelated functions (such as detection of read-only and
>   write-only static global variables).
> 
>   The kernel live-patching community would welcome if GCC had an option
>   that could disable all such optimizations/analyses for which it
>   cannot provide a list of all affected functions (i.e. which ones need
>   to be live-patched if a particular function is).
> 
>   I assume this is orthogonal to the proposed -finline-only-static
>   option, but the above approach seems superior in all respects.
> 
> 3. The community would also like to be involved in these discussions,
>   and therefore I am adding live-patch...@vger.kernel.org to CC.  On a
>   related note, they will also have a live-patching mini-summit at the
>   Linux Plumbers conference in Vancouver in November where they plan to
>   discuss what they would like GCC to provide.
> 
> Thanks,
> 
> Martin
> 



Re: [PATCH][C]/[C++] Remove DECL_FROM_INLINE use

2018-10-01 Thread Joseph Myers
On Mon, 1 Oct 2018, Richard Biener wrote:

> This patch removes checks of DECL_FROM_INLINE from the respective
> -Wshadow code of the C/C++ FE.  I noticed those when looking after
> DECL_ABSTRACT_ORIGIN uses.  Those checks may be from times where
> we did inlining very early?
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu with no regression.
> 
> OK for trunk?

The C change is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[libiberty] Use pipe inside pex_run

2018-10-01 Thread Nathan Sidwell

Ian,
this patch implements the pipe error channel you suggested a while back. 
 Before the (v)fork we create a pipe and set it up for CLOEXEC.  If 
exec failure happens in the child, we write errno & the fnname to the 
pipe.  In the parent we attempt to read the pipe.  We'll get EOF if the 
child was successful.  Otherwise we get the error and fn, which we pass 
back to our caller.  As both child and parent are in the same address 
space, it's perfectly fine to pass string literal addresses down the pipe.
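
For anyone who wants the idea without reading the whole diff, here is a stripped-down sketch of the mechanism.  It is not the pex-unix.c code: it uses plain fork and execvp, the function name spawn_with_error_pipe is made up, and error handling is reduced to the minimum.  It does rely on the same observation that the string literal's address is valid in both processes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Tuple sent from child to parent when exec fails.  */
struct fn_err
{
  const char *fn;
  int err;
};

/* Start ARGV[0].  Returns the child's pid on success.  On failure
   returns -1 and stores the failing function's name in *FN and its
   errno in *ERR.  */
pid_t spawn_with_error_pipe(char *const argv[], const char **fn, int *err)
{
  int pipes[2]; /* [0]: reader, [1]: writer.  */
  struct fn_err failure;
  ssize_t n;
  pid_t pid;

  *fn = NULL;
  *err = 0;
  if (pipe(pipes) != 0)
    {
      *fn = "pipe";
      *err = errno;
      return -1;
    }
  /* A successful exec closes the writer, so the parent sees EOF.  */
  if (fcntl(pipes[1], F_SETFD, FD_CLOEXEC) == -1)
    {
      *fn = "fcntl";
      *err = errno;
      close(pipes[0]);
      close(pipes[1]);
      return -1;
    }

  pid = fork();
  if (pid < 0)
    {
      *fn = "fork";
      *err = errno;
      close(pipes[0]);
      close(pipes[1]);
      return -1;
    }
  if (pid == 0)
    {
      close(pipes[0]);
      execvp(argv[0], argv);
      /* Only reached when exec failed: report why, then bail out.  */
      failure.fn = "execvp";
      failure.err = errno;
      if (write(pipes[1], &failure, sizeof failure) < 0)
        _exit(126);
      _exit(127);
    }

  close(pipes[1]);
  do
    n = read(pipes[0], &failure, sizeof failure);
  while (n < 0 && errno == EINTR);
  close(pipes[0]);

  if (n == (ssize_t) sizeof failure)
    {
      waitpid(pid, NULL, 0); /* Reap the failed child.  */
      *fn = failure.fn;
      *err = failure.err;
      return -1;
    }
  return pid; /* EOF on the pipe: the exec went through.  */
}
```

Calling this with a program that does not exist should give -1 with *fn pointing at "execvp" and *err set to ENOENT, which is the information the driver uses for the "cannot execute" message shown next.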


An example of the difference is the behaviour of:
 ./xg++ -Bbogus ...

before the patch we get:
xg++: error trying to exec 'cc1plus': execvp: No such file or directory

with the patch we get:
xg++: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
compilation terminated.

Martin has been kind enough to test this patch in his profiled-bootstrap
build, and I've tested on AIX (where vfork is fork) as well as x86_64-linux.


nathan

--
Nathan Sidwell
2018-10-01  Nathan Sidwell  

	* configure.ac (checkfuncs): Add pipe2.
	* config.in, configure: Rebuilt.
	* pex-unix.c (pex_unix_exec_child): Communicate errors from child
	to parent with a pipe, when possible.

Index: config.in
===
--- config.in	(revision 264744)
+++ config.in	(working copy)
@@ -195,6 +195,9 @@
 /* Define to 1 if you have the `on_exit' function. */
 #undef HAVE_ON_EXIT
 
+/* Define to 1 if you have the `pipe2' function. */
+#undef HAVE_PIPE2
+
 /* Define to 1 if you have the  header file. */
 #undef HAVE_PROCESS_H
 
Index: configure
===
--- configure	(revision 264744)
+++ configure	(working copy)
@@ -5727,7 +5727,7 @@ funcs="$funcs setproctitle"
 vars="sys_errlist sys_nerr sys_siglist"
 
 checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \
- getsysinfo gettimeofday on_exit psignal pstat_getdynamic pstat_getstatic \
+ getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic pstat_getstatic \
  realpath setrlimit sbrk spawnve spawnvpe strerror strsignal sysconf sysctl \
  sysmp table times wait3 wait4"
 
@@ -5743,7 +5743,7 @@ if test "x" = "y"; then
 index insque \
 memchr memcmp memcpy memmem memmove memset mkstemps \
 on_exit \
-psignal pstat_getdynamic pstat_getstatic putenv \
+pipe2 psignal pstat_getdynamic pstat_getstatic putenv \
 random realpath rename rindex \
 sbrk setenv setproctitle setrlimit sigsetmask snprintf spawnve spawnvpe \
  stpcpy stpncpy strcasecmp strchr strdup \
Index: configure.ac
===
--- configure.ac	(revision 264744)
+++ configure.ac	(working copy)
@@ -391,7 +391,7 @@ funcs="$funcs setproctitle"
 vars="sys_errlist sys_nerr sys_siglist"
 
 checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \
- getsysinfo gettimeofday on_exit psignal pstat_getdynamic pstat_getstatic \
+ getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic pstat_getstatic \
  realpath setrlimit sbrk spawnve spawnvpe strerror strsignal sysconf sysctl \
  sysmp table times wait3 wait4"
 
@@ -407,7 +407,7 @@ if test "x" = "y"; then
 index insque \
 memchr memcmp memcpy memmem memmove memset mkstemps \
 on_exit \
-psignal pstat_getdynamic pstat_getstatic putenv \
+pipe2 psignal pstat_getdynamic pstat_getstatic putenv \
 random realpath rename rindex \
 sbrk setenv setproctitle setrlimit sigsetmask snprintf spawnve spawnvpe \
  stpcpy stpncpy strcasecmp strchr strdup \
Index: pex-unix.c
===
--- pex-unix.c	(revision 264744)
+++ pex-unix.c	(working copy)
@@ -569,6 +569,38 @@ pex_unix_exec_child (struct pex_obj *obj
 		 int toclose, const char **errmsg, int *err)
 {
   pid_t pid = -1;
+  /* Tuple to communicate error from child to parent.  We can safely
+ transfer string literal pointers as both run with identical
+ address mappings.  */
+  struct fn_err 
+  {
+const char *fn;
+int err;
+  };
+  volatile int do_pipe = 0;
+  volatile int pipes[2]; /* [0]:reader,[1]:writer.  */
+#ifdef O_CLOEXEC
+  do_pipe = 1;
+#endif
+  if (do_pipe)
+{
+#ifdef HAVE_PIPE2
+  if (pipe2 ((int *)pipes, O_CLOEXEC))
+	do_pipe = 0;
+#else
+  if (pipe ((int *)pipes))
+	do_pipe = 0;
+  else
+	{
+	  if (fcntl (pipes[1], F_SETFD, FD_CLOEXEC) == -1)
+	{
+	  close (pipes[0]);
+	  close (pipes[1]);
+	  do_pipe = 0;
+	}
+	}
+#endif
+}
 
   /* We declare these to be volatile to avoid warnings from gcc about
  them being clobbered by vfork.  */
@@ -579,8 +611,9 @@ pex_unix_exec_child (struct pex_obj *obj
  This clobbers the parent's environ so we need to restore it.
  It would be nice to use one of the exec* functions that takes an
  environment as a parameter, but that may have portability
- issues.   */
-  char **sa

Re: [libiberty] Use pipe inside pex_run

2018-10-01 Thread Ian Lance Taylor via gcc-patches
On Mon, Oct 1, 2018 at 10:53 AM, Nathan Sidwell  wrote:
> Ian,
> this patch implements the pipe error channel you suggested a while back.
> Before the (v)fork we create a pipe and set it up for CLOEXEC.  If exec
> failure happens in the child, we write errno & the fnname to the pipe.  In
> the parent we attempt to read the pipe.  We'll get EOF if the child was
> successful.  Otherwise we get the error and fn, which we pass back to our
> caller.  As both child and parent are in the same address space, it's
> perfectly fine to pass string literal addresses down the pipe.
>
> An example of the difference is the behaviour of:
>  ./xg++ -Bbogus ...
>
> before the patch we get:
> xg++: error trying to exec 'cc1plus': execvp: No such file or directory
>
> with the patch we get:
> xg++: fatal error: cannot execute ‘cc1plus’: execvp: No such file or
> directory
> compilation terminated.
>
> Martin has been kind enough to test this patch in his profiled-bootstrap
> build, and I've tested on AIX (where vfork is fork) as well asx86_64-linux.

Thanks for doing this.

This is OK.

Ian


[libstdc++,doc] adjust a link in doc/xml/manual/allocator.xml

2018-10-01 Thread Gerald Pfeifer
This just moved from http to https, and they put a redirect in place,
so pretty straightforward.  Applied.

2018-10-01  Gerald Pfeifer  

* doc/xml/manual/allocator.xml: Adjust link to "Reconsidering
Custom Memory Allocation".

Index: doc/xml/manual/allocator.xml
===
--- doc/xml/manual/allocator.xml(revision 264760)
+++ doc/xml/manual/allocator.xml(revision 264761)
@@ -531,7 +531,7 @@
   
   
http://www.w3.org/1999/xlink";
- 
xlink:href="http://people.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf";>
+ 
xlink:href="https://people.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf";>
   Reconsidering Custom Memory Allocation

   



Re: [PATCH][C]/[C++] Remove DECL_FROM_INLINE use

2018-10-01 Thread Jason Merrill
OK.
On Mon, Oct 1, 2018 at 1:14 PM Joseph Myers  wrote:
>
> On Mon, 1 Oct 2018, Richard Biener wrote:
>
> > This patch removes checks of DECL_FROM_INLINE from the respective
> > -Wshadow code of the C/C++ FE.  I noticed those when looking after
> > DECL_ABSTRACT_ORIGIN uses.  Those checks may be from times where
> > we did inlining very early?
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu with no regression.
> >
> > OK for trunk?
>
> The C change is OK.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: Fix pretty printers in _GLIBCXX_DEBUG mode

2018-10-01 Thread François Dumont

On 09/28/2018 02:01 PM, Jonathan Wakely wrote:
> On 25/09/18 22:11 +0200, François Dumont wrote:
>> I guess it must have something to do with the [] but as I escaped
>> both I don't understand what's wrong.
>
> You might need to escape them twice, or more ... so that TCL doesn't
> try to handle them as special characters, and then the regex engine
> doesn't treat them as special characters either.

OK, with some additional escape characters it now succeeds. So now the
question is: OK to commit?

It makes the tests a little more complicated, so I would understand
if you rejected it. At the same time, I regularly see those tests
reported as failing, so I guess we want them to succeed, and this is the
only way I found to fix them. Another option would be to try to hide
'__debug::' from a whatis call, but I don't want to hide it from our
users just to make our tests pass.


    * python/libstdcxx/v6/printers.py (add_one_template_type_printer):
    Add type printer for container types in std::__debug namespace.
    * testsuite/lib/gdb-test.exp (whatis-regexp-test): New.
    (gdb-tests): Use distinct parameters for the type of test and use
    regex.
    (gdb-test): Check for regex test even if 'whatis' test.
    * testsuite/libstdc++-prettyprinters/80276.cc: Adapt for _GLIBCXX_DEBUG
    mode.
    * testsuite/libstdc++-prettyprinters/cxx11.cc: Likewise.
    * testsuite/libstdc++-prettyprinters/cxx17.cc: Likewise.
    * testsuite/libstdc++-prettyprinters/libfundts.cc: Likewise.
    * testsuite/libstdc++-prettyprinters/simple11.cc: Likewise.
    * testsuite/libstdc++-prettyprinters/whatis.cc: Likewise.
    * testsuite/libstdc++-prettyprinters/whatis2.cc: Likewise.

Tested under Linux x86_64 debug and normal modes.

François

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index afe1b325d87..b471cb04941 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1479,6 +1479,11 @@ def add_one_template_type_printer(obj, name, defargs):
 """
 printer = TemplateTypePrinter('std::'+name, defargs)
 gdb.types.register_type_printer(obj, printer)
+
+# Add type printer for same type in debug namespace:
+printer = TemplateTypePrinter('std::__debug::'+name, defargs)
+gdb.types.register_type_printer(obj, printer)
+
 if _versioned_namespace:
 # Add second type printer for same type in versioned namespace:
 ns = 'std::' + _versioned_namespace
diff --git a/libstdc++-v3/testsuite/lib/gdb-test.exp b/libstdc++-v3/testsuite/lib/gdb-test.exp
index 5ee693a8ee7..d70b6a4ac78 100644
--- a/libstdc++-v3/testsuite/lib/gdb-test.exp
+++ b/libstdc++-v3/testsuite/lib/gdb-test.exp
@@ -55,7 +55,7 @@ proc get_line_number {filename marker} {
 proc note-test {var result} {
 global gdb_tests
 
-lappend gdb_tests $var $result 0
+lappend gdb_tests $var $result print 0
 }
 
 # A test that uses a regular expression.  This is like note-test, but
@@ -64,14 +64,22 @@ proc note-test {var result} {
 proc regexp-test {var result} {
 global gdb_tests
 
-lappend gdb_tests $var $result 1
+lappend gdb_tests $var $result print 1
 }
 
 # A test of 'whatis'.  This tests a type rather than a variable.
 proc whatis-test {var result} {
 global gdb_tests
 
-lappend gdb_tests $var $result whatis
+lappend gdb_tests $var $result whatis 0
+}
+
+# A test of 'whatis' that uses a regular expression. This tests a type rather
+# than a variable.
+proc whatis-regexp-test {var result} {
+global gdb_tests
+
+lappend gdb_tests $var $result whatis 1
 }
 
 # Utility for testing variable values using gdb, invoked via dg-final.
@@ -136,13 +144,14 @@ proc gdb-test { marker {selector {}} {load_xmethods 0} } {
 puts $fd "info share"
 
 set count 0
-foreach {var result kind} $gdb_tests {
+foreach {var result kind rexp} $gdb_tests {
 	incr count
 	set gdb_var($count) $var
 	set gdb_expected($count) $result
 	if {$kind == "whatis"} {
 	if {$do_whatis_tests} {
 		set gdb_is_type($count) 1
+		set gdb_is_regexp($count) $rexp
 		set gdb_command($count) "whatis $var"
 	} else {
 	unsupported "$testname"
@@ -151,7 +160,7 @@ proc gdb-test { marker {selector {}} {load_xmethods 0} } {
 	}
 	} else {
 	set gdb_is_type($count) 0
-	set gdb_is_regexp($count) $kind
+	set gdb_is_regexp($count) $rexp
 	set gdb_command($count) "print $var"
 	}
 	puts $fd $gdb_command($count)
@@ -179,9 +188,9 @@ proc gdb-test { marker {selector {}} {load_xmethods 0} } {
 		if {$expect_out(1,string) != "type"} {
 		error "gdb failure"
 		}
-		set match [expr {![string compare $first \
- $gdb_expected($test_counter)]}]
-	} elseif {$gdb_is_regexp($test_counter)} {
+	}
+
+	if {$gdb_is_regexp($test_counter)} {
 		set match [regexp -- $gdb_expected($test_counter) $first]
 	} else {
 		set match [expr {![string compare $first \
diff --git a/libstdc++-v3/testsuite/

Re: libgo patch committed: Update to 1.11 release

2018-10-01 Thread Ian Lance Taylor
On Wed, Sep 26, 2018 at 3:54 AM, Andreas Schwab  wrote:
> All execution tests are now failing with "fatal error: impossible call
> to aeshashbody".

Thanks.  Fixed by this patch, which adds AES hash code for arm64 using
intrinsics.  Bootstrapped and tested on x86_64-pc-linux-gnu and
aarch64-unknown-linux-gnu.  Some other aarch64 tests failed; I'm not
sure if they failed before or not.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 264690)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-f4a224ec481957ca4f14d0e8cc4fe59cc95b3a49
+013a9e68c9a31f888733d46182d19f9e5d956f27
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/runtime/aeshash.c
===
--- libgo/runtime/aeshash.c (revision 264648)
+++ libgo/runtime/aeshash.c (working copy)
@@ -573,13 +573,412 @@ uintptr aeshashbody(void* p, uintptr see
 
 #endif // !defined(__x86_64__)
 
-#else // !defined(__i386__) && !defined(__x86_64__) || !defined(HAVE_AS_X86_AES)
+#elif defined(__aarch64__)
+
+// Undefine some identifiers that we pick up from the Go runtime package that
+// are used in arm_neon.h.
+
+#undef t1
+#undef tx
+#undef t2
+#undef t3
+#undef t4
+#undef t5
+
+#include <arm_neon.h>
+
+// Force appropriate CPU level.  We won't call here unless the CPU
+// supports it.
+
+#pragma GCC target("+crypto")
+
+// The arm64 version of aeshashbody.
+
+uintptr aeshashbody(void* p, uintptr seed, uintptr size, Slice aeskeysched) {
+   uint8x16_t *pseed;
+   uint32x4_t vinit32;
+   uint8x16_t vinit;
+   uint8x16_t vseed, vseed2, vseed3, vseed4;
+   uint8x16_t vseed5, vseed6, vseed7, vseed8;
+   uint8x16_t vval, vval2, vval3, vval4;
+   uint8x16_t vval5, vval6, vval7, vval8;
+   uint8x16_t vvalLoop, vvalLoop2, vvalLoop3, vvalLoop4;
+   uint8x16_t vvalLoop5, vvalLoop6, vvalLoop7, vvalLoop8;
+   uint8x16x2_t avval2;
+   uint8x16x3_t avseed3;
+
+   pseed = (uint8x16_t*)(aeskeysched.__values);
+
+   // Combined hash seed and length.
+   vinit32 = vdupq_n_u32(0);
+   vinit32[0] = (uint32)seed;
+   vinit32[1] = (uint32)size;
+   vinit = vreinterpretq_u8_u32(vinit32);
+
+   // Mix in per-process seed.
+   vseed = vaeseq_u8(*pseed, vinit);
+   ++pseed;
+   // Scramble seed.
+   vseed = vaesmcq_u8(vseed);
+
+   if (size <= 16) {
+   if (size == 0) {
+   // Return 64 bits of scrambled input seed.
+   return vreinterpretq_u64_u8(vseed)[0];
+   } else if (size < 16) {
+   vval = vreinterpretq_u8_u32(vdupq_n_u32(0));
+   if ((size & 8) != 0) {
+   vval = vreinterpretq_u8_u64(vld1q_lane_u64((uint64_t*)(p), vreinterpretq_u64_u8(vval), 0));
+   p = (void*)((uint64_t*)(p) + 1);
+   }
+   if ((size & 4) != 0) {
+   vval = vreinterpretq_u8_u32(vld1q_lane_u32((uint32_t*)(p), vreinterpretq_u32_u8(vval), 2));
+   p = (void*)((uint32_t*)(p) + 1);
+   }
+   if ((size & 2) != 0) {
+   vval = vreinterpretq_u8_u16(vld1q_lane_u16((uint16_t*)(p), vreinterpretq_u16_u8(vval), 6));
+   p = (void*)((uint16_t*)(p) + 1);
+   }
+   if ((size & 1) != 0) {
+   vval = vld1q_lane_u8((uint8*)(p), vval, 14);
+   }
+   } else {
+   vval = *(uint8x16_t*)(p);
+   }
+   vval = vaeseq_u8(vval, vseed);
+   vval = vaesmcq_u8(vval);
+   vval = vaeseq_u8(vval, vseed);
+   vval = vaesmcq_u8(vval);
+   vval = vaeseq_u8(vval, vseed);
+   return vreinterpretq_u64_u8(vval)[0];
+   } else if (size <= 32) {
+   // Make a second seed.
+   vseed2 = vaeseq_u8(*pseed, vinit);
+   vseed2 = vaesmcq_u8(vseed2);
+   vval = *(uint8x16_t*)(p);
+   vval2 = *(uint8x16_t*)((char*)(p) + (size - 16));
+
+   vval = vaeseq_u8(vval, vseed);
+   vval = vaesmcq_u8(vval);
+   vval2 = vaeseq_u8(vval2, vseed2);
+   vval2 = vaesmcq_u8(vval2);
+
+   vval = vaeseq_u8(vval, vseed);
+   vval = vaesmcq_u8(vval);
+   vval2 = vaeseq_u8(vval2, vseed2);
+   vval2 = vaesmcq_u8(vval2);
+
+   vval = vaeseq_u8(vval, vseed);
+   vval2 = vaeseq_u8(vval2, vseed2);
+
+   vval ^= vval2;
+
+   return vreinterpretq_u64_u8(vval)[0];
+   } else if (size

Re: libgo patch committed: Update to 1.11 release

2018-10-01 Thread Ian Lance Taylor
On Wed, Sep 26, 2018 at 7:50 AM, H.J. Lu  wrote:
> On Mon, Sep 24, 2018 at 2:46 PM, Ian Lance Taylor  wrote:
>> I've committed a patch to update libgo to the 1.11 release.  As usual
>> for these updates, the patch is too large to attach to this e-mail
>> message.  I've attached some of the more relevant directories.  This
>> update required some minor patches to the gotools directory and the Go
>> testsuite, also included here.  Bootstrapped and ran Go testsuite on
>> x86_64-pc-linux-gnu.  Committed to mainline.
>>
>> Ian
>>
>> 2018-09-24  Ian Lance Taylor  
>>
>> * Makefile.am (mostlyclean-local): Run chmod on check-go-dir to
>> make sure it is writable.
>> (check-go-tools): Likewise.
>> (check-vet): Copy internal/objabi to check-vet-dir.
>> * Makefile.in: Rebuild.
>
> When building with -mx32, I got
>
> /export/gnu/import/git/sources/gcc/libgo/go/runtime/malloc.go:309:44:
> error: integer constant overflow
> 309 |  arenaBaseOffset uintptr = sys.GoarchAmd64 * (1 << 47)
> |^


Thanks.  I fixed this problem by switching to using amd64p32 on x32.
Bootstrapped and ran testsuite on x86_64-pc-linux-gnu using
--with-multilib-list=m64,m32,mx32.  However, I ran this on a kernel
without x32 support, so while building succeeds, I couldn't actually
run any tests.  Let me know how they do.
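
For readers following the failure quoted above: with -mx32 a uintptr is 32 bits wide, so a constant such as 1 << 47 cannot be represented in it. A minimal C sketch of that range check follows; the helper name fits_in_bits is hypothetical and not part of libgo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when VALUE is representable in an unsigned
   integer that is BITS bits wide.  */
bool fits_in_bits(uint64_t value, unsigned bits)
{
    assert(bits > 0 && bits <= 64);
    if (bits == 64)
        return true;
    /* Shifting by fewer than 64 bits is well defined for uint64_t.  */
    return value < ((uint64_t)1 << bits);
}
```

Calling fits_in_bits((uint64_t)1 << 47, 32) returns false, while the same value fits when the width is 64.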

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 264771)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-013a9e68c9a31f888733d46182d19f9e5d956f27
+2f56d51c6b3104242613c74b02fa6c63a2fe16c5
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/configure.ac
===
--- libgo/configure.ac  (revision 264648)
+++ libgo/configure.ac  (working copy)
@@ -252,8 +252,13 @@ changequote([,])dnl
 #ifdef __x86_64__
 #error 64-bit
 #endif],
-[GOARCH=386],
-[GOARCH=amd64])
+   [GOARCH=386],
+   AC_COMPILE_IFELSE([
+#ifdef __ILP32__
+#error x32
+#endif],
+   [GOARCH=amd64],
+   [GOARCH=amd64p32]))
 ;;
   ia64-*-*)
 GOARCH=ia64
Index: libgo/go/hash/crc32/crc32_amd64p32.go
===
--- libgo/go/hash/crc32/crc32_amd64p32.go   (revision 264648)
+++ libgo/go/hash/crc32/crc32_amd64p32.go   (working copy)
@@ -2,6 +2,8 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
+// +build ignore
+
 package crc32
 
 import "internal/cpu"
Index: libgo/go/internal/syscall/unix/getrandom_linux_amd64p32.go
===
--- libgo/go/internal/syscall/unix/getrandom_linux_amd64p32.go  (nonexistent)
+++ libgo/go/internal/syscall/unix/getrandom_linux_amd64p32.go  (working copy)
@@ -0,0 +1,9 @@
+// Copyright 2018 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package unix
+
+// Linux getrandom system call number.
+// See GetRandom in getrandom_linux.go.
+const randomTrap uintptr = 0x4000 + 318
Index: libgo/go/runtime/lfstack_32bit.go
===
--- libgo/go/runtime/lfstack_32bit.go   (revision 264648)
+++ libgo/go/runtime/lfstack_32bit.go   (working copy)
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build 386 arm nacl armbe m68k mips mipsle mips64p32 mips64p32le nios2 ppc s390 sh shbe sparc
+// +build 386 amd64p32 arm nacl armbe m68k mips mipsle mips64p32 mips64p32le nios2 ppc s390 sh shbe sparc
 
 package runtime
 
Index: libgo/goarch.sh
===
--- libgo/goarch.sh (revision 264648)
+++ libgo/goarch.sh (working copy)
@@ -57,10 +57,15 @@ case $goarch in
defaultphyspagesize=8192
pcquantum=4
;;
-amd64 | amd64p32)
+amd64)
family=AMD64
hugepagesize="1 << 21"
;;
+amd64p32)
+   family=AMD64
+   hugepagesize="1 << 21"
+   ptrsize=4
+   ;;
 arm | armbe)
family=ARM
cachelinesize=32


Re: libgo patch committed: Update to 1.11 release

2018-10-01 Thread H.J. Lu
On Mon, Oct 1, 2018 at 1:18 PM, Ian Lance Taylor  wrote:
> On Wed, Sep 26, 2018 at 7:50 AM, H.J. Lu  wrote:
>> On Mon, Sep 24, 2018 at 2:46 PM, Ian Lance Taylor  wrote:
>>> I've committed a patch to update libgo to the 1.11 release.  As usual
>>> for these updates, the patch is too large to attach to this e-mail
>>> message.  I've attached some of the more relevant directories.  This
>>> update required some minor patches to the gotools directory and the Go
>>> testsuite, also included here.  Bootstrapped and ran Go testsuite on
>>> x86_64-pc-linux-gnu.  Committed to mainline.
>>>
>>> Ian
>>>
>>> 2018-09-24  Ian Lance Taylor  
>>>
>>> * Makefile.am (mostlyclean-local): Run chmod on check-go-dir to
>>> make sure it is writable.
>>> (check-go-tools): Likewise.
>>> (check-vet): Copy internal/objabi to check-vet-dir.
>>> * Makefile.in: Rebuild.
>>
>> When building with -mx32, I got
>>
>> /export/gnu/import/git/sources/gcc/libgo/go/runtime/malloc.go:309:44:
>> error: integer constant overflow
>> 309 |  arenaBaseOffset uintptr = sys.GoarchAmd64 * (1 << 47)
>> |^
>
>
> Thanks.  I fixed this problem by switching to using amd64p32 on x32.
> Bootstrapped and ran testsuite on x86_64-pc-linux-gnu using
> --with-multilib-list=m64,m32,mx32.  However, I ran this on a kernel
> without x32 support, so while building succeeds, I couldn't actually
> run any tests.  Let me know how they do.
>
> Ian

I am giving it a try.

Thanks.

-- 
H.J.


[PATCH] detect attribute mismatches in alias declarations (PR 81824)

2018-10-01 Thread Martin Sebor

PR 81824 is a request to detect and diagnose alias declarations
with less restrictive attributes than those of their targets.
I promised I'd implement this for GCC 9 so with the end of
stage 1 approaching I figured it was about time to post my
attempt at this enhancement.  I expect it to need tweaking
to make it easier to adopt.

The solution reuses for this purpose the -Wmissing-attributes
warning introduced in GCC 8 for C++.  It goes beyond the C++
warning and also beyond what Joseph asked for and detects both
less and more restrictive attributes. (The latter triggers
-Wattributes but with the growing number of distinct checks
in -Wattributes it might be worth thinking about splitting
some out into new options.)

Testing the patch with Glibc triggers thousands of warnings of
both kinds.  After reviewing a small subset it became apparent
that dealing with the inconsistencies on such a scale calls for
a convenient mechanism to (at a minimum) automatically copy
attributes between declarations, similar to how __typeof__ makes
it possible to use the type of an existing declaration as that
of a new one.  The patch helps with this by introducing a new
attribute called copy.  The attribute copies attributes from
one declaration (or type) to another.  The attribute doesn't
resolve all the warnings but it helps.
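
As an illustration of the intended use, here is a minimal sketch. It assumes a compiler that implements the copy attribute described above and falls back to an empty macro elsewhere; COPY_ATTRS, count_chars and count_chars_alias are hypothetical names and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Use the copy attribute only where the compiler reports support for it;
   elsewhere the macro expands to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(copy)
#  define COPY_ATTRS(from) __attribute__((copy(from)))
# endif
#endif
#ifndef COPY_ATTRS
# define COPY_ATTRS(from)
#endif

/* Hypothetical target function carrying attributes worth propagating.  */
__attribute__((nothrow, warn_unused_result))
size_t count_chars(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* The alias takes its type from the target via __typeof__ and, where
   supported, its attributes via copy.  */
extern __typeof__(count_chars) count_chars_alias
    __attribute__((alias("count_chars"))) COPY_ATTRS(count_chars);
```

This mirrors the way __typeof__ is already used in the Glibc alias macros: the alias declaration names its target once and picks up both the type and the attributes from it.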

The class of warnings I noticed that can't be so easily handled
are due to inconsistencies between ifuncs and their resolvers.
One way to solve it might be to have resolvers automatically
"inherit" all the attributes of their targets (and enhance
GCC to warn for violations).  Another way might be to expect
resolvers to be explicitly declared with attribute copy to copy
the attributes of all the targets (and also warn for violations).

In the patch I have hardcoded a few attributes that don't get
copied; I call those linkage and visibility attributes.  I'm not
too happy about hardcoding things like this but the only other
alternative I could think of was parameterizing the copy
attribute on the set of other attributes not to copy, but since
those would almost always be the same as the hardcoded ones, it
didn't seem worthwhile.

Martin

PS With the attached GCC and Glibc patches I get the following
breakdown of warnings in Glibc (the numbers are the total count,
the number of unique instances, and the number of files they are
in):

  Diagnostic            Count  Unique  Files
  -Wattributes           1743     724    599
  -Wmissing-attributes     90      24     22

The -Wattributes are of the sort:

version.c:54:37: warning: ‘gnu_get_libc_release’ specifies more restrictive attribute than its target ‘__gnu_get_libc_release’: ‘nothrow’ [-Wattributes]


(The absence of the nothrow attribute accounts for the majority
of the warnings.)

An example of a -Wmissing-attributes instance is:

./../include/libc-symbols.h:534:26: warning: ‘__EI___redirect_strcat’ specifies less restrictive attributes than its target ‘strcat’: ‘leaf’, ‘nonnull’ [-Wmissing-attributes]


This is the ifunc resolver mismatch I mention above.
diff --git a/include/libc-symbols.h b/include/libc-symbols.h
index 8b9273c..b0fb728 100644
--- a/include/libc-symbols.h
+++ b/include/libc-symbols.h
@@ -143,7 +143,7 @@
If weak aliases are not available, this defines a strong alias.  */
 # define weak_alias(name, aliasname) _weak_alias (name, aliasname)
 # define _weak_alias(name, aliasname) \
-  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name)));
+  extern __typeof (name) aliasname __attribute__ ((weak, alias (#name), copy (name)));
 
 /* Same as WEAK_ALIAS, but mark symbol as hidden.  */
 # define weak_hidden_alias(name, aliasname) \
@@ -532,7 +532,7 @@ for linking")
 #  define __hidden_ver1(local, internal, name) \
   extern __typeof (name) __EI_##name __asm__(__hidden_asmname (#internal)); \
   extern __typeof (name) __EI_##name \
-	__attribute__((alias (__hidden_asmname (#local
+__attribute__((alias (__hidden_asmname (#local)), copy (name)))
 #  define hidden_ver(local, name)	__hidden_ver1(local, __GI_##name, name);
 #  define hidden_data_ver(local, name)	hidden_ver(local, name)
 #  define hidden_def(name)		__hidden_ver1(__GI_##name, name, name);
@@ -545,7 +545,8 @@ for linking")
 #  define __hidden_nolink1(local, internal, name, version) \
   __hidden_nolink2 (local, internal, name, version)
 #  define __hidden_nolink2(local, internal, name, version) \
-  extern __typeof (name) internal __attribute__ ((alias (#local))); \
+  extern __typeof (name) internal __attribute__ ((alias (#local), \
+		copy (name)));	\
   __hidden_nolink3 (local, internal, #name "@" #version)
 #  define __hidden_nolink3(local, internal, vername) \
   __asm__ (".symver " #internal ", " vername);
PR middle-end/81824 - Warn for missing attributes with function aliases

gcc/c-family/ChangeLog:

	PR middle-end/81824
	* c-attribs.c (handle_copy_attribute_impl): New function.
	* c-attribs

[PATCH] libstdc++: Remove unused define

2018-10-01 Thread Bernhard Reutner-Fischer
__NO_STRING_INLINES was removed from uClibc around 2004 so has no
effect.

Ok for trunk?

libstdc++-v3/ChangeLog:

2018-10-01  Bernhard Reutner-Fischer  

* config/os/uclibc/os_defines.h (__NO_STRING_INLINES): Delete.
---
 libstdc++-v3/config/os/uclibc/os_defines.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/libstdc++-v3/config/os/uclibc/os_defines.h b/libstdc++-v3/config/os/uclibc/os_defines.h
index 03a7273d5dc..bcc47d4d589 100644
--- a/libstdc++-v3/config/os/uclibc/os_defines.h
+++ b/libstdc++-v3/config/os/uclibc/os_defines.h
@@ -38,7 +38,4 @@
 
 #include 
 
-// We must not see the optimized string functions GNU libc defines.
-#define __NO_STRING_INLINES
-
 #endif
-- 
2.19.0



[PING] [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-10-01 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

On 09/21/2018 11:13 AM, Martin Sebor wrote:

On 09/17/2018 07:30 PM, Jeff Law wrote:

On 8/28/18 6:12 PM, Martin Sebor wrote:

Sadly, dstbase is the PARM_DECL for d.  That's where things are going
"wrong".  Not sure why you're getting the PARM_DECL in that case.  I'd
debug get_addr_base_and_unit_offset to understand what's going on.
Essentially you're getting different results of
get_addr_base_and_unit_offset in a case where they arguably should be
the same.


Probably get_attr_nonstring_decl has the same "mistake" and returns
the PARM_DECL instead of the SSA name pointer.  So we're comparing
apples and oranges here.


Returning the SSA_NAME_VAR from get_attr_nonstring_decl() is
intentional but the function need not (perhaps should not)
also set *REF to it.



Yeah:

/* If EXPR refers to a character array or pointer declared attribute
   nonstring return a decl for that array or pointer and set *REF to
   the referenced enclosing object or pointer.  Otherwise returns
   null.  */

tree
get_attr_nonstring_decl (tree expr, tree *ref)
{
  tree decl = expr;
  if (TREE_CODE (decl) == SSA_NAME)
{
  gimple *def = SSA_NAME_DEF_STMT (decl);

  if (is_gimple_assign (def))
{
  tree_code code = gimple_assign_rhs_code (def);
  if (code == ADDR_EXPR
  || code == COMPONENT_REF
  || code == VAR_DECL)
decl = gimple_assign_rhs1 (def);
}
  else if (tree var = SSA_NAME_VAR (decl))
decl = var;
}

  if (TREE_CODE (decl) == ADDR_EXPR)
decl = TREE_OPERAND (decl, 0);

  if (ref)
*ref = decl;

I see a lot of "magic" here again in the attempt to "propagate"
a nonstring attribute.


That's the function's purpose: to look for the attribute.  Is
there a better way to do this?


Note

foo (char *p __attribute__(("nonstring")))
{
  p = "bar";
  strlen (p); // or whatever is necessary to call get_attr_nonstring_decl
}

is perfectly valid and p as passed to strlen is _not_ nonstring(?).


I don't know if you're saying that it should get a warning or
shouldn't.  Right now it doesn't because the strlen() call is
folded before we check for nonstring.

I could see an argument for diagnosing it but I suspect you
wouldn't like it because it would mean more warnings from
the folder.  I could also see an argument against it because,
as you said, it's safe.

If you take the assignment to p away then a warning is issued,
and that's because p is declared with attribute nonstring.
That's also why get_attr_nonstring_decl looks at SSA_NAME_VAR.
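
For context, a minimal sketch of how attribute nonstring is typically used on a fixed-width field. The struct and function names are hypothetical, and compilers that do not know the attribute may simply ignore it with a warning.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record with a fixed-width tag that is deliberately not
   required to be NUL-terminated.  */
struct record {
    char tag[4] __attribute__((nonstring));
};

/* Fill the tag; truncation without a terminating NUL is intended here.  */
void set_tag(struct record *r, const char *s)
{
    assert(r != NULL && s != NULL);
    strncpy(r->tag, s, sizeof r->tag);
}

/* Compare with an explicit bound, never with strlen or strcmp.  */
int tag_equals(const struct record *r, const char *s)
{
    return strncmp(r->tag, s, sizeof r->tag) == 0;
}
```

Because the field is declared nonstring, passing it to an unbounded function such as strlen is the kind of use the diagnostics discussed here are meant to catch.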


I think in your code comparing bases you want to look at the _original_
argument to the string function rather than what get_attr_nonstring_decl
returned as ref.


I've adjusted get_attr_nonstring_decl() to avoid setting *REF
to SSA_NAME_VAR.  That let me remove the GIMPLE_NOP code from
the patch.  I've also updated the comment above SSA_NAME_VAR
to clarify its purpose per Jeff's comments.

Attached is an updated revision with these changes.

Martin

gcc-87028.diff

PR tree-optimization/87028 - false positive -Wstringop-truncation
strncpy with global variable source string
gcc/ChangeLog:

PR tree-optimization/87028
* calls.c (get_attr_nonstring_decl): Avoid setting *REF to
SSA_NAME_VAR.
* gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding
when statement doesn't belong to a basic block.
* tree.h (SSA_NAME_VAR): Update comment.
* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Simplify.

gcc/testsuite/ChangeLog:

PR tree-optimization/87028
* c-c++-common/Wstringop-truncation.c: Remove xfails.
* gcc.dg/Wstringop-truncation-5.c: New test.




Index: gcc/calls.c
===
--- gcc/calls.c(revision 263928)
+++ gcc/calls.c(working copy)
@@ -1503,6 +1503,7 @@ tree
 get_attr_nonstring_decl (tree expr, tree *ref)
 {
   tree decl = expr;
+  tree var = NULL_TREE;
   if (TREE_CODE (decl) == SSA_NAME)
 {
   gimple *def = SSA_NAME_DEF_STMT (decl);
@@ -1515,17 +1516,25 @@ get_attr_nonstring_decl (tree expr, tree *ref)
   || code == VAR_DECL)
 decl = gimple_assign_rhs1 (def);
 }
-  else if (tree var = SSA_NAME_VAR (decl))
-decl = var;
+  else
+var = SSA_NAME_VAR (decl);
 }

   if (TREE_CODE (decl) == ADDR_EXPR)
 decl = TREE_OPERAND (decl, 0);

+  /* To simplify calling code, store the referenced DECL regardless of
+ the attribute determined below, but avoid storing the SSA_NAME_VAR
+ obtained above (it's not useful for dataflow purposes).  */
   if (ref)
 *ref = decl;

-  if (TREE_CODE (decl) == ARRAY_REF)
+  /* Use the SSA_NAME_VAR that was determined above to see if it's
+ declared nonstring.  Otherwise drill down into the referenced
+ DECL.  */
+  if (var)
+decl = var;
+  else if (TREE_CODE (decl) == ARRAY_REF)
 decl = TREE_OPERAND (decl, 0);
   else if (TREE_CODE (decl) =

[PING] [PATCH] look harder for MEM_REF operand equality to avoid -Wstringop-truncation (PR 84561)

2018-10-01 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html

We have discussed a number of different approaches to moving
the warning somewhere else but none is feasible in the limited
amount of time remaining in stage 1 of GCC 9.  I'd like to
avoid the false positive in GCC 9 by using the originally
submitted, simple approach and look into the suggested design
changes for GCC 10.
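
The idiom at issue is a strncpy call followed by an explicit store of the terminating NUL, which -Wstringop-truncation is meant to accept. A minimal sketch, with names chosen to resemble the IL quoted in the discussion rather than taken from the PR's test case:

```c
#include <assert.h>
#include <string.h>

struct S {
    char a[8];
};

/* Copy at most sizeof a - 1 bytes, then terminate explicitly.  */
void copy_name(struct S *pb, const char *s)
{
    assert(pb != NULL && s != NULL);
    size_t n = sizeof pb->a - 1;
    strncpy(pb->a, s, n);
    pb->a[n] = '\0';
}
```

The warning is avoided only if the store after the call can be recognized as writing into the same object as the strncpy destination, which is the operand equality the patch looks harder for.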

On 09/21/2018 08:36 AM, Martin Sebor wrote:

On 09/20/2018 03:06 AM, Richard Biener wrote:

On Wed, Sep 19, 2018 at 4:19 PM Martin Sebor  wrote:


On 09/18/2018 10:23 PM, Jeff Law wrote:

On 9/18/18 1:46 PM, Martin Sebor wrote:

On 09/18/2018 12:58 PM, Jeff Law wrote:

On 9/18/18 11:12 AM, Martin Sebor wrote:


My bad.  Sigh. CCP doesn't track copies, just constants, so there's not
going to be any data structure you can exploit.  And I don't think
there's a value number you can use to determine the two objects are the
same.

Hmm, let's back up a bit, what does the relevant part of the IL look
like before CCP?  Is the real problem here that we have unpropagated
copies lying around in the IL?  Hmm, more likely the IL looks like:

   _8 = &pb_3(D)->a;
   _9 = _8;
   _1 = _9;
   strncpy (MEM_REF (&pb_3(D)->a), ...);
   MEM[(struct S *)_1].a[n_7] = 0;


Yes, that is what the folder sees while the strncpy call is
being transformed/folded by ccp.  The MEM_REF is folded just
after the strncpy call and that's when it's transformed into

  MEM[(struct S *)_8].a[n_7] = 0;

(The assignments to _1 and _9 don't get removed until after
the dom walk finishes).



If we were to propagate the copies out we'd at best have:

   _8 = &pb_3(D)->a;
   strncpy (MEM_REF (&pb_3(D)->a), ...);
   MEM[(struct S *)_8].a[n_7] = 0;


Is that in a form you can handle?  Or would we also need to forward
propagate the address computation into the use of _8?


The above works as long as we look at the def_stmt of _8 in
the MEM_REF (we currently don't).  That's also what the last
iteration of the loop does.  In this case (with _8) it would
be discovered in the first iteration, so the loop could be
replaced by a simple if statement.

But I'm not sure I understand the concern with the loop.  Is
it that we are looping at all, i.e., the cost?  Or that ccp
is doing something wrong or suboptimal? (Should have
propagated the value of _8 earlier?)

I suspect it's more a concern that things like copies are typically
propagated away.   So their existence in the IL (and consequently your
need to handle them) raises the question "has something else failed to
do its job earlier".

During which of the CCP passes is this happening?  Can we pull the
warning out of the folder (even if that means having a distinct warning
pass over the IL?)


It happens during the third run of the pass.

The only way to do what you suggest that I could think of is
to defer the strncpy to memcpy transformation until after
the warning pass.  That was also my earlier suggestion: defer
both it and the warning until the tree-ssa-strlen pass (where
the warning is implemented to begin with -- the folder calls
into it).

If it's happening that late (CCP3) in general, then ISTM we ought to be
able to get the warning out of the folder.  We just have to pick the
right spot.

warn_restrict runs before fold_all_builtins, but after dom/vrp so we
should have the IL in pretty good shape.  That seems like about the
right time.

I wonder if we could generalize warn_restrict to be a more generic
warning pass over the IL and place it right before fold_builtins.


The restrict pass doesn't know about string lengths so it can't
handle all the warnings about string built-ins (the strlen pass
now calls into it to issue some).  The strlen pass does so it
could handle most if not all of them (the folder also calls
into it to issue some warnings).  It would work even better if
it were also integrated with the object size pass.

We're already working on merging strlen with sprintf.  It seems
to me that the strlen pass would benefit not only from that but
also from integrating with object size and warn-restrict.  With
that, -Wstringop-overflow could be moved from builtins.c into
it as well (and also benefit not only from accurate string
lengths but also from the more accurate object size info).

What do you think about that?


I think integrating the various "passes" (objectsize is also
as much a facility as a pass) generally makes sense given
it might end up improving all of them and reduce code duplication.


Okay.  If Jeff agrees I'll see if I can make it happen for GCC
10.  Until then, does either of you have any suggestions for
a different approach in this patch or is it acceptable with
the loop as is?

Martin



Richard.



Martin

PS I don't think I could do more than merge strlen and sprintf
before stage 1 ends (if even that much) so this would be a longer
term goal.






Re: [committed] Use structure to bubble up information about unterminated strings from c_strlen

2018-10-01 Thread Christophe Lyon
On Sat, 29 Sep 2018 at 18:06, Jeff Law  wrote:
>
>
> This patch changes the NONSTR argument to c_strlen to instead be a
> little data structure c_strlen can populate with nuggets of information
> about the string.
>
> There's clearly a need for the decl related to the non-string argument.
> I see an immediate need for the length of a non-terminated string
> (c_strlen returns NULL for non-terminated strings).  I also see a need
> for the offset within the non-terminated string as well.
>
> We only populate the structure when c_strlen encounters a non-terminated
> string.  One could argue we should always fill in the members.  Right
> now I think filling it in for unterminated cases makes the most sense,
> but I could be convinced otherwise.
>
> I won't be surprised if subsequent warnings from Martin need additional
> information about the string.  The idea here is we can add more elements
> to the structure without continually adding arguments to c_strlen.
>
> Bootstrapped in isolation as well as with Martin's patches for strnlen
> and sprintf checking.  Installing on the trunk.
>

Hi Jeff,

+ /* If TYPE is asking for a maximum, then use any
+length (including the length of an unterminated
+string) for VAL.  */
+ if (type == 2)
+   val = data.len;

It seems this part is dead-code, since the case type==2 is handled in
the "then" part of the "if" (this code is in the "else" part).

Since you added a comment, I suspect you explicitly tested it, though?

Christophe

> Jeff


Re: [PATCH 6/6] detect unterminated const arrays in strnlen calls (PR 86552)

2018-10-01 Thread Jeff Law
On 8/13/18 3:29 PM, Martin Sebor wrote:
> The attached changes implement the detection of past-the-end reads
> by strnlen due to unterminated arguments and excessive bounds.
> 
> 
> gcc-86552-6.diff
> 
> PR tree-optimization/86552 - missing warning for reading past the end of 
> non-string arrays
> 
> gcc/ChangeLog:
>   * builtins.c (expand_builtin_strnlen): Detect, avoid expanding,
>   and diagnose unterminated arrays.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.dg/warn-strnlen-no-nul.c: New.
So the changes to c_strlen's API allow us to simplify the changes you
made to unterminated_array.  Essentially we get to drop the code which
tears apart EXP before handing things off to c_strlen -- that's all
handled inside c_strlen/string_constant now.


c_strlen returns NULL for an unterminated array or anything it can't
handle.  So we check for NULL return value and a non-NULL data.decl to
see if we had an unterminated array.  We can get the length of the
unterminated string and the offset via the c_strlen_data we pass to
c_strlen in that case.

If the offset is a pure constant, then it will already be accounted for
in data->len.  So we no longer need to adjust it.  If the offset is
SSA_NAME + INTEGER_CST, we adjust the length by INTEGER_CST and bubble
up exact = false.
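
For context, a sketch of the kind of source this diagnostic targets. The array and function names are hypothetical. The call below is safe because the bound equals the array size; a bound larger than sizeof digits would read past the end, which is the case the new warning is meant to flag.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen under strict ISO C modes */
#include <assert.h>
#include <string.h>

/* Exactly four characters and no room for a terminating NUL.  */
static const char digits[4] = { '1', '2', '3', '4' };

/* Safe: the bound equals the array size, so no read goes past the end.  */
size_t bounded_length(void)
{
    size_t n = strnlen(digits, sizeof digits);
    assert(n <= sizeof digits);
    return n;
}
```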


I think that summarizes the relatively minor changes I ended up making.

Bootstrapped and regression tested on x86_64.  Installing on the trunk.

Jeff
commit ab9a04daf8adffdb00fd085e6f217efeb42875ce
Author: Jeff Law 
Date:   Thu Aug 30 19:24:34 2018 -0400

* builtins.c (unterminated_array): Add new arguments.
If argument is not terminated, bubble up size and exact
state to callers.
(expand_builtin_strnlen): Detect, avoid expanding
and diagnose unterminated arrays.
(c_strlen): Fill in offset of start of unterminated strings.
* builtins.h (unterminated_array): Update prototype.

* gcc.dg/warn-strnlen-no-nul.c: New.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b43cc388fa8..05c6f558246 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,14 @@
+2018-10-01  Martin Sebor  
+   Jeff Law  
+
+   * builtins.c (unterminated_array): Add new arguments.
+   If argument is not terminated, bubble up size and exact
+   state to callers.
+   (expand_builtin_strnlen): Detect, avoid expanding
+   and diagnose unterminated arrays.
+   (c_strlen): Fill in offset of start of unterminated strings.
+   * builtins.h (unterminated_array): Update prototype.
+
 2018-10-01  Carl Love  
 
PR 69431
diff --git a/gcc/builtins.c b/gcc/builtins.c
index fe411efd9a9..2cb1996dad3 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -565,15 +565,50 @@ warn_string_no_nul (location_t loc, const char *fn, tree arg, tree decl)
 
 /* If EXP refers to an unterminated constant character array return
the declaration of the object of which the array is a member or
-   element.  Otherwise return null.  */
+   element and if SIZE is not null, set *SIZE to the size of
+   the unterminated array and set *EXACT if the size is exact or
+   clear it otherwise.  Otherwise return null.  */
 
 tree
-unterminated_array (tree exp)
+unterminated_array (tree exp, tree *size /* = NULL */, bool *exact /* = NULL */)
 {
+  /* C_STRLEN will return NULL and set DECL in the info
+ structure if EXP references a unterminated array.  */
   c_strlen_data data;
   memset (&data, 0, sizeof (c_strlen_data));
-  c_strlen (exp, 1, &data);
-  return data.decl;
+  tree len = c_strlen (exp, 1, &data);
+  if (len == NULL_TREE && data.len && data.decl)
+ {
+   if (size)
+   {
+ len = data.len;
+ if (data.off)
+   {
+ /* Constant offsets are already accounted for in data.len, but
+not in a SSA_NAME + CST expression.  */
+ if (TREE_CODE (data.off) == INTEGER_CST)
+   *exact = true;
+ else if (TREE_CODE (data.off) == PLUS_EXPR
+  && TREE_CODE (TREE_OPERAND (data.off, 1)) == INTEGER_CST)
+   {
+ /* Subtract the offset from the size of the array.  */
+ *exact = false;
+ tree temp = TREE_OPERAND (data.off, 1);
+ temp = fold_convert (ssizetype, temp);
+ len = fold_build2 (MINUS_EXPR, ssizetype, len, temp);
+   }
+ else
+   *exact = false;
+   }
+ else
+   *exact = true;
+
+ *size = len;
+   }
+   return data.decl;
+ }
+
+  return NULL_TREE;
 }
 
 /* Compute the length of a null-terminated character string or wide
@@ -685,6 +720,7 @@ c_strlen (tree src, int only_value, c_strlen_data *data, unsigned eltsize)
   else if (len >= maxelts)
     {
       data->decl = decl;
+      data->off = byteoff;
       data->len = ssize_int (len);
       return NULL_TREE;
   

Re: [committed] Use structure to bubble up information about unterminated strings from c_strlen

2018-10-01 Thread Jeff Law
On 10/1/18 3:46 PM, Christophe Lyon wrote:
> On Sat, 29 Sep 2018 at 18:06, Jeff Law  wrote:
>>
>>
>> This patch changes the NONSTR argument to c_strlen to instead be a
>> little data structure c_strlen can populate with nuggets of information
>> about the string.
>>
>> There's clearly a need for the decl related to the non-string argument.
>> I see an immediate need for the length of a non-terminated string
>> (c_strlen returns NULL for non-terminated strings).  I also see a need
>> for the offset within the non-terminated string as well.
>>
>> We only populate the structure when c_strlen encounters a non-terminated
>> string.  One could argue we should always fill in the members.  Right
>> now I think filling it in for unterminated cases makes the most sense,
>> but I could be convinced otherwise.
>>
>> I won't be surprised if subsequent warnings from Martin need additional
>> information about the string.  The idea here is we can add more elements
>> to the structure without continually adding arguments to c_strlen.
>>
>> Bootstrapped in isolation as well as with Martin's patches for strnlen
>> and sprintf checking.  Installing on the trunk.
>>
> 
> Hi Jeff,
> 
> + /* If TYPE is asking for a maximum, then use any
> +length (including the length of an unterminated
> +string) for VAL.  */
> + if (type == 2)
> +   val = data.len;
> 
> It seems this part is dead-code, since the case type==2 is handled in
> the "then" part of the "if" (this code is in the "else" part).
> 
> Since you added a comment, I suspect you explicitly tested it, though?
Yea, I know that code got triggered at some point.  It may be dead now
after some cleanups, or it might have been needed by the patch I just
installed on the sprintf bits.  I'll double-check either way. (I'd seen
it in the coverity scans this morning as well).

jeff


Go patch committed: Use underlying type to build placeholder type for aliases

2018-10-01 Thread Ian Lance Taylor
This patch to the Go frontend by Cherry Zhang uses the underlying type
to build the placeholder type for aliases.  When asking for a
placeholder type of an alias type, this builds a placeholder for the
underlying type, instead of treating the alias as a named type and
calling get_backend.  The latter may fail as we may not be ready to
build a complete backend type.  We have already used a unified backend
type for alias type and its underlying type.  Do the same for
placeholders as well.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 264772)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-2f56d51c6b3104242613c74b02fa6c63a2fe16c5
+53d0d7ca278a5612fcdb5fb098e7bf950a0178ef
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/types.cc
===
--- gcc/go/gofrontend/types.cc  (revision 264648)
+++ gcc/go/gofrontend/types.cc  (working copy)
@@ -1125,6 +1125,8 @@ Type::get_backend_placeholder(Gogo* gogo
 case TYPE_FORWARD:
   // Named types keep track of their own dependencies and manage
   // their own placeholders.
+  if (this->named_type() != NULL && this->named_type()->is_alias())
+return this->unalias()->get_backend_placeholder(gogo);
   return this->get_backend(gogo);
 
 case TYPE_INTERFACE:


C++ PATCH to implement C++20 P0892R2 - explicit(bool)

2018-10-01 Thread Marek Polacek
This patch implements C++20 explicit(bool), as described in:
.

I tried to follow the noexcept specifier implementation where I could, which
made the non-template parts of this fairly easy.  To make explicit(expr) work
with dependent expressions, I had to add DECL_EXPLICIT_SPEC to lang_decl_fn,
which serves as a vessel to get the explicit-specifier to tsubst_function_decl
where I substitute the dependent arguments.
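
For readers who have not used the feature yet, here is a minimal
source-level sketch of what explicit(bool) allows.  The Meters and
Wrapper types are hypothetical illustrations and are not taken from the
patch or its tests.  The condition on the constructor is a dependent
expression, which is the case that has to wait for instantiation.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type whose only constructor is explicit.
struct Meters
{
  explicit Meters (double v) : value (v) {}
  double value;
};

// Hypothetical wrapper: its converting constructor is explicit exactly
// when U does not convert implicitly to T.  The condition depends on
// the template arguments, so it can only be evaluated at instantiation.
template <typename T>
struct Wrapper
{
  template <typename U>
  explicit (!std::is_convertible<U, T>::value)
  Wrapper (U &&u) : held (std::forward<U> (u)) {}

  T held;
};

// int converts implicitly to double, so Wrapper<double> can be
// copy-initialized from an int.
static_assert (std::is_convertible<int, Wrapper<double>>::value,
               "implicit conversion expected");

// double does not convert implicitly to Meters, so the constructor is
// explicit here: direct-initialization works, copy-initialization
// does not.
static_assert (!std::is_convertible<double, Wrapper<Meters>>::value,
               "explicit constructor expected");
static_assert (std::is_constructible<Wrapper<Meters>, double>::value,
               "direct-initialization expected to work");
```

Compilers in pre-C++20 modes may accept this with a warning rather
than reject it; that behaviour is an assumption about the compiler in
use, not something this patch description states.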

I've written a bunch of tests testing the basic functionality but I don't
doubt I've missed a couple of important scenarios.  I hope we can shake
those out as people start playing with this feature.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-10-01  Marek Polacek  

P0892R2 - explicit(bool)
* c-cppbuiltin.c (c_cpp_builtins): Define __cpp_explicit_bool.

* call.c (add_template_candidate_real): Return if the declaration is
explicit and we're only looking for non-converting constructor.
* cp-tree.h (lang_decl_fn): Add explicit_specifier field.
(DECL_EXPLICIT_SPEC): New macro.
(build_explicit_specifier): Declare.
* decl.c (build_explicit_specifier): New function.
* parser.c (cp_parser_decl_specifier_seq): Add explicit_specifier
parameter.  Pass it down to cp_parser_function_specifier_opt.
(cp_parser_function_specifier_opt): Add explicit_specifier parameter.
: Parse C++20 explicit(bool).
(cp_parser_explicit_instantiation): Update call to
cp_parser_function_specifier_opt.
(cp_parser_member_declaration): Have cp_parser_decl_specifier_seq save
the explicit-specifier.  Save it to DECL_EXPLICIT_SPEC.
(cp_parser_single_declaration): Likewise.
* pt.c (tsubst_function_decl): Handle explicit(dependent-expr).

* g++.dg/cpp2a/explicit1.C: New test.
* g++.dg/cpp2a/explicit10.C: New test.
* g++.dg/cpp2a/explicit11.C: New test.
* g++.dg/cpp2a/explicit2.C: New test.
* g++.dg/cpp2a/explicit3.C: New test.
* g++.dg/cpp2a/explicit4.C: New test.
* g++.dg/cpp2a/explicit5.C: New test.
* g++.dg/cpp2a/explicit6.C: New test.
* g++.dg/cpp2a/explicit7.C: New test.
* g++.dg/cpp2a/explicit8.C: New test.
* g++.dg/cpp2a/explicit9.C: New test.

* testsuite/20_util/any/cons/explicit.cc: Adjust dg-error.
* testsuite/20_util/pair/cons/explicit_construct.cc: Likewise.
* testsuite/20_util/tuple/cons/explicit_construct.cc: Likewise.

diff --git gcc/gcc/c-family/c-cppbuiltin.c gcc/gcc/c-family/c-cppbuiltin.c
index 96a6b4dfd2b..b085cf9201f 100644
--- gcc/gcc/c-family/c-cppbuiltin.c
+++ gcc/gcc/c-family/c-cppbuiltin.c
@@ -955,7 +955,7 @@ c_cpp_builtins (cpp_reader *pfile)
}
   if (cxx_dialect > cxx14)
{
- /* Set feature test macros for C++1z.  */
+ /* Set feature test macros for C++17.  */
  cpp_define (pfile, "__cpp_unicode_characters=201411");
  cpp_define (pfile, "__cpp_static_assert=201411");
  cpp_define (pfile, "__cpp_namespace_attributes=201411");
@@ -975,6 +975,11 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_structured_bindings=201606");
  cpp_define (pfile, "__cpp_variadic_using=201611");
}
+  if (cxx_dialect > cxx17)
+   {
+ /* Set feature test macros for C++2a.  */
+ cpp_define (pfile, "__cpp_explicit_bool=201806");
+   }
   if (flag_concepts)
cpp_define (pfile, "__cpp_concepts=201507");
   if (flag_tm)
diff --git gcc/gcc/cp/call.c gcc/gcc/cp/call.c
index b2ca667c8b4..7003a4a2f50 100644
--- gcc/gcc/cp/call.c
+++ gcc/gcc/cp/call.c
@@ -3251,6 +3251,12 @@ add_template_candidate_real (struct z_candidate **candidates, tree tmpl,
   goto fail;
 }
 
+  /* Now the explicit specifier might have been deduced; check if this
+ declaration is explicit.  If it is and we're ignoring non-converting
+ constructors, don't add this function to the set of candidates.  */
+  if ((flags & LOOKUP_ONLYCONVERTING) && DECL_NONCONVERTING_P (fn))
+return NULL;
+
   if (DECL_CONSTRUCTOR_P (fn) && nargs == 2)
 {
   tree arg_types = FUNCTION_FIRST_USER_PARMTYPE (fn);
diff --git gcc/gcc/cp/cp-tree.h gcc/gcc/cp/cp-tree.h
index efbdad83966..6cee257b8a7 100644
--- gcc/gcc/cp/cp-tree.h
+++ gcc/gcc/cp/cp-tree.h
@@ -2604,6 +2604,10 @@ struct GTY(()) lang_decl_fn {
  will be chained on the return pointer thunk.  */
   tree context;
 
+  /* Explicit-specifier, if any.  Only constructors or conversion
+ functions can have it.  */
+  tree explicit_specifier;
+
   union lang_decl_u5
   {
 /* In a non-thunk FUNCTION_DECL or TEMPLATE_DECL, this is
@@ -4501,6 +4505,10 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter)
 #define DECL_GLOBAL_DTOR_P(NODE) \
   (LANG_DECL_FN_CHECK (NODE)->global_dtor_p)
 
+/* Explicit-specifier for th

Re: [PATCH] detect attribute mismatches in alias declarations (PR 81824)

2018-10-01 Thread Joseph Myers
On Mon, 1 Oct 2018, Martin Sebor wrote:

> Testing the patch with Glibc triggers thousands of warnings of
> both kinds.  After reviewing a small subset it became apparent

Thousands of warnings suggests initially having the warning outside -Wall 
(though one might hope to move it into -Wall at some point, depending on 
how hard the warnings are to address and to what extent they appear at all 
for other packages - most don't make heavy use of aliases like that - or 
failing that, to enable it explicitly for glibc once all the warnings are 
fixed, since this is certainly a useful warning for glibc showing issues 
we want to fix) - it's not like the typical case of a new warning where 
you can quickly and easily fix all the instances in glibc, for all 
architectures, to keep it building with mainline GCC.

> attribute called copy.  The attribute copies attributes from
> one declaration (or type) to another.  The attribute doesn't
> resolve all the warnings but it helps.

(For actual use in glibc that use would of course need to be conditional 
on a GCC version supporting the attribute.)
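
A minimal sketch of such a guard, assuming the attribute is spelled
copy and is first available in GCC 9.  The macro name ATTRIBUTE_COPY,
the version number and the two function names are assumptions made for
illustration; none of them is settled in this thread.  The alias
attribute also assumes an ELF target.

```cpp
#include <cassert>

// Hypothetical guard macro: expands to the copy attribute only on a GCC
// new enough to understand it (assumed here: GCC 9 or later), and to
// nothing on older or other compilers.
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 9
# define ATTRIBUTE_COPY(target) __attribute__ ((__copy__ (target)))
#else
# define ATTRIBUTE_COPY(target)
#endif

extern "C"
{
  // The public declaration carries the attributes of interest.
  __attribute__ ((__nothrow__, __const__)) int public_square (int x);

  int public_square (int x) { return x * x; }

  // Internal alias: copy the target's attributes instead of spelling
  // them out a second time.
  int internal_square (int x)
    __attribute__ ((__alias__ ("public_square")))
    ATTRIBUTE_COPY (public_square);
}

// Usage: both names reach the same code.
void
check_alias ()
{
  assert (internal_square (4) == public_square (4));
}
```

On a compiler without the attribute the alias still works; it simply
does not inherit the attributes of the public declaration.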

> The class of warnings I noticed that can't be so easily handled
> are due to inconsistencies between ifuncs and their resolvers.
> One way to solve it might be to have resolvers automatically
> "inherit" all the attributes of their targets (and enhance
> GCC to warn for violations).  Another way might be to expect
> resolvers to be explicitly declared with attribute copy to copy
> the attributes of all the targets (and also warn for violations).

I'm not sure we should care about the attributes on IFUNC resolvers at 
all; no normal code will see a declaration of the resolver and also call 
the function, whereas lots of code calls functions under internal alias 
names that currently lack the same attributes as the public declaration 
has.  It's also not obvious whether there might be more cases of 
attributes for a function that are inapplicable to IFUNC resolvers than 
just the attributes relating to a symbol rather than the function itself 
which are hardcoded as excluded.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: C++ PATCH to implement C++20 P0892R2 - explicit(bool)

2018-10-01 Thread Jason Merrill
On Mon, Oct 1, 2018 at 6:41 PM Marek Polacek  wrote:
>
> This patch implements C++20 explicit(bool), as described in:
> .
>
> I tried to follow the noexcept specifier implementation where I could, which
> made the non-template parts of this fairly easy.  To make explicit(expr) work
> with dependent expressions, I had to add DECL_EXPLICIT_SPEC to lang_decl_fn,
> which serves as a vessel to get the explicit-specifier to tsubst_function_decl
> where I substitute the dependent arguments.

What's the impact of that on memory consumption?  I'm nervous about
adding another word to most functions when it's not useful to most of
them.  For several similar things we've been using hash tables on the
side.
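
To make the trade-off concrete, here is a self-contained sketch of the
side-table idea using plain standard-library types rather than GCC's own
tree and hash table machinery.  FunctionDecl, set_explicit_spec and
get_explicit_spec are hypothetical names, and the string merely stands
in for the expression tree.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a function declaration node.  It has no
// explicit-specifier member, so declarations without one pay nothing
// for the feature.
struct FunctionDecl
{
  std::string name;
};

// Side table: only declarations that actually carry an
// explicit-specifier get an entry.
static std::unordered_map<const FunctionDecl *, std::string> explicit_specs;

void
set_explicit_spec (const FunctionDecl *fn, const std::string &expr)
{
  explicit_specs[fn] = expr;
}

// Return the stored specifier, or null if FN has none.
const std::string *
get_explicit_spec (const FunctionDecl *fn)
{
  auto it = explicit_specs.find (fn);
  return it == explicit_specs.end () ? nullptr : &it->second;
}

// Usage: one declaration with a specifier, one without.
void
demo ()
{
  FunctionDecl ctor{"ctor"}, plain{"plain"};
  set_explicit_spec (&ctor, "B");
  assert (*get_explicit_spec (&ctor) == "B");
  assert (get_explicit_spec (&plain) == nullptr);
}
```

The cost moves from one word per function to one table entry per
function that uses the feature, at the price of a lookup on access.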

> +/* Create a representation of the explicit-specifier with
> +   constant-expression of EXPR.  COMPLAIN is as for tsubst.  */
> +
> +tree
> +build_explicit_specifier (tree expr, tsubst_flags_t complain)
> +{
> +  if (processing_template_decl && value_dependent_expression_p (expr))
> +    /* Wait for instantiation.  tsubst_function_decl will take care of it.  */
> +    return expr;
> +
> +  expr = perform_implicit_conversion_flags (boolean_type_node, expr,
> +                                            complain, LOOKUP_NORMAL);
> +  expr = instantiate_non_dependent_expr (expr);
> +  expr = cxx_constant_value (expr);
> +  return expr;
> +}

Is there a reason not to use build_converted_constant_expr?

Jason


Re: libgo patch committed: Update to 1.11 release

2018-10-01 Thread H.J. Lu
On Mon, Oct 1, 2018 at 1:27 PM, H.J. Lu  wrote:
> On Mon, Oct 1, 2018 at 1:18 PM, Ian Lance Taylor  wrote:
>> On Wed, Sep 26, 2018 at 7:50 AM, H.J. Lu  wrote:
>>> On Mon, Sep 24, 2018 at 2:46 PM, Ian Lance Taylor  wrote:
>>>> I've committed a patch to update libgo to the 1.11 release.  As usual
>>>> for these updates, the patch is too large to attach to this e-mail
>>>> message.  I've attached some of the more relevant directories.  This
>>>> update required some minor patches to the gotools directory and the Go
>>>> testsuite, also included here.  Bootstrapped and ran Go testsuite on
>>>> x86_64-pc-linux-gnu.  Committed to mainline.
>>>>
>>>> Ian
>>>>
>>>> 2018-09-24  Ian Lance Taylor  
>>>>
>>>> * Makefile.am (mostlyclean-local): Run chmod on check-go-dir to
>>>> make sure it is writable.
>>>> (check-go-tools): Likewise.
>>>> (check-vet): Copy internal/objabi to check-vet-dir.
>>>> * Makefile.in: Rebuild.
>>>
>>> When building with -mx32, I got
>>>
>>> /export/gnu/import/git/sources/gcc/libgo/go/runtime/malloc.go:309:44:
>>> error: integer constant overflow
>>> 309 |  arenaBaseOffset uintptr = sys.GoarchAmd64 * (1 << 47)
>>> |^
>>
>>
>> Thanks.  I fixed this problem by switching to using amd64p32 on x32.
>> Bootstrapped and ran testsuite on x86_64-pc-linux-gnu using
>> --with-multilib-list=m64,m32,mx32.  However, I ran this on a kernel
>> without x32 support, so while building succeeds, I couldn't actually
>> run any tests.  Let me know how they do.
>>
>> Ian
>
> I am giving it a try.
>

Compared with my patch, there are some new failures:

--- FAIL: TestAtomicStop (1.82s)
signal_test.go:384: iteration 5: output lost signal on tries: 2
signal_test.go:392: iteration 5: lost signal
FAIL
FAIL: os/signal

FAIL: go.test/test/env.go execution,  -O2 -g
FAIL: go.test/test/nilptr2.go execution,  -O2 -g
FAIL: go.test/test/nilptr2.go execution,  -O2 -g

FAIL: net/http

--- FAIL: TestExtraFiles (0.21s)
exec_test.go:611: Run: exit status 1; stdout "leaked parent file.
fd = 6; want 4\n", stderr ""
FAIL
FAIL: os/exec

goroutine 4538 [runnable]:
created by net_http_test.TestConcurrentServerServe

/export/build/gnu/tools-build/gcc-x32/build-x86_64-linux/x86_64-pc-linux-gnu/32/libgo/gotest51963/test/serve_test.go:5394
+310

goroutine 4539 [runnable]:
created by net_http_test.TestConcurrentServerServe

/export/build/gnu/tools-build/gcc-x32/build-x86_64-linux/x86_64-pc-linux-gnu/32/libgo/gotest51963/test/serve_test.go:5395
+421

eax    0x0
ebx    0x2
ecx    0xa6cf6a5c
edx    0x0
edi    0x0
esi    0x8
ebp    0xa6cf6a5c
esp    0xa6cf6a40
eip    0xf7ed4069
eflags 0x282
cs     0x23
fs     0x0
gs     0x63
FAIL: net/http

goroutine 23819 [GC worker (idle)]:
runtime.mcall
/export/gnu/import/git/sources/gcc/libgo/runtime/proc.c:342
runtime.gopark

/export/build/gnu/tools-build/gcc-x32/build-x86_64-linux/x86_64-pc-linux-gnu/32/libgo/gotest93522/test/proc.go:333
runtime.gcBgMarkWorker

/export/build/gnu/tools-build/gcc-x32/build-x86_64-linux/x86_64-pc-linux-gnu/32/libgo/gotest93522/test/mgc.go:1773
runtime.kickoff

/export/build/gnu/tools-build/gcc-x32/build-x86_64-linux/x86_64-pc-linux-gnu/32/libgo/gotest93522/test/proc.go:1214
created by runtime.gcBgMarkStartWorkers

/export/build/gnu/tools-build/gcc-x32/build-x86_64-linux/x86_64-pc-linux-gnu/32/libgo/gotest93522/test/mgc.go:1719
+92

eax    0x0
ebx    0x2
ecx    0xaa4ecaec
edx    0x0
edi    0x0
esi    0x8
ebp    0xaa4ecaec
esp    0xaa4ecad0
eip    0xf7f81069
eflags 0x286
cs     0x23
fs     0x0
gs     0x63
FAIL: runtime

FAIL: go.test/test/env.go execution,  -O2 -g
FAIL: go.test/test/nilptr2.go execution,  -O2 -g
FAIL: go.test/test/nilptr2.go execution,  -O2 -g

-- 
H.J.


Re: libgo patch committed: Update to 1.11 release

2018-10-01 Thread Ian Lance Taylor
On Mon, Oct 1, 2018 at 4:56 PM, H.J. Lu  wrote:
>
> Compared with my patch, there are some new failures:

Thanks.  We probably need a patch in gcc/testsuite/go.test/go-test.exp
to set goarch to amd64p32 when appropriate.

Other than that there seems to be some sort of signal handling
problem.  Hard to say what that might be.

Ian




Use -fno-show-column in libstdc++ installed testing

2018-10-01 Thread Joseph Myers
 arranged for
libstdc++ tests to use -fno-show-column by default, but only for
build-tree testing.  This patch adds it to the options used for
installed testing as well.

Tested with installed testing for a cross to x86_64-linux-gnu, where
it fixes various test failures.

2018-10-02  Joseph Myers  

* testsuite/lib/libstdc++.exp (libstdc++_init): Use
-fno-show-column in default cxxflags.

Index: libstdc++-v3/testsuite/lib/libstdc++.exp
===
--- libstdc++-v3/testsuite/lib/libstdc++.exp(revision 264770)
+++ libstdc++-v3/testsuite/lib/libstdc++.exp(working copy)
@@ -239,7 +239,7 @@ proc libstdc++_init { testfile } {
 
 # Default settings.
 set cxx [transform "g++"]
-set cxxflags "-fmessage-length=0"
+set cxxflags "-fmessage-length=0 -fno-show-column"
 set cxxpchflags ""
 set cxxvtvflags ""
 set cxxldflags ""

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: libgo patch committed: Update to 1.11 release

2018-10-01 Thread H.J. Lu
On Mon, Oct 1, 2018 at 5:06 PM Ian Lance Taylor  wrote:
>
> On Mon, Oct 1, 2018 at 4:56 PM, H.J. Lu  wrote:
> >
> > Compared with my patch, there are some new failures:
>
> Thanks.  We probably need a patch in gcc/testsuite/go.test/go-test.exp
> to set goarch to amd64p32 when appropriate.
>
> Other than that there seems to be some sort of signal handling
> problem.  Hard to say what that might be.
>

Does amd64p32 disable any amd64 specific handling?

-- 
H.J.


Re: [PATCH][2/n] Make _INLINE_ENTRY markers have the location we finally need

2018-10-01 Thread Alexandre Oliva
On Sep 28, 2018, Richard Biener  wrote:

> Alex - any particular reason for not doing this?

The only concern that comes to mind is a vague thought about this
possibly messing up the nesting of lexical scopes in executable code,
but I don't think that's a given to begin with, I like this move, and
I'm sure any problem along these lines that arises, if any, can be fixed
without much trouble.  So, as far as I'm concerned, go for it, and thanks!

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist


Re: [PATCH 0/2][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-10-01 Thread Peter Bergner
On 10/1/18 7:45 AM, H.J. Lu wrote:
> You may have undone:
> 
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=218059

Yes, the code above also needed to be modified to handle conflicts being
added at definitions rather than at uses.  The patch below does that.
I don't really have access to an i686 (i.e., 32-bit) system to test on and
I'm not sure how to force the test to be run in 32-bit mode on a 64-bit
build, but it does fix the assembler for the pr63534.c test case.

That said, looking at the rtl for the test case, I see the following
before RA:

(insn 5 2 6 2 (set (reg:SI 3 bx)
(reg:SI 82)) "pr63534.c":10 85 {*movsi_internal}
 (nil))
(call_insn 6 5 7 2 (call (mem:QI (symbol_ref:SI ("bar") [flags 0x41]  
) [0 barD.1498 S1 A8])
(const_int 0 [0])) "pr63534.c":10 687 {*call}
 (expr_list:REG_DEAD (reg:SI 3 bx)
(expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41]  
)
(nil)))
(expr_list (use (reg:SI 3 bx))
(nil)))
(insn 7 6 8 2 (set (reg:SI 3 bx)
(reg:SI 82)) "pr63534.c":11 85 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 82)
(nil)))
(call_insn 8 7 0 2 (call (mem:QI (symbol_ref:SI ("bar") [flags 0x41]  
) [0 barD.1498 S1 A8])
(const_int 0 [0])) "pr63534.c":11 687 {*call}
 (expr_list:REG_DEAD (reg:SI 3 bx)
(expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41]  
)
(nil)))
(expr_list (use (reg:SI 3 bx))
(nil)))

Now that we handle conflicts at definitions and the pic hard reg
is set via a copy from the pic pseudo, my PATCH 2 is set up to
handle exactly this scenario (ie, a copy between a pseudo and
a hard reg).  I looked at the asm output from a build with both
PATCH 1 and PATCH 2, and yes, it also does not add the conflict
between the pic pseudo and pic hard reg, so our other option to
fix PR87479 is to apply PATCH 2.  However, since PATCH 2 handles
the pic pseudo and pic hard reg conflict itself, that means we
don't need the special pic conflict code and it can be removed!
I'm going to update PATCH 2 to remove that pic handling code
and send it through bootstrap and regtesting.
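
As a rough illustration of the case in question: insn 5 in the RTL above
is a plain copy from the pic pseudo (reg 82) into the pic hard reg (bx,
reg 3).  A simplified model of that test might look like the sketch
below.  SimpleSet and is_pic_copy are made-up names for this example and
are not GCC code; real insns are of course inspected through single_set,
SET_DEST and SET_SRC.

```cpp
// Hypothetical, simplified model of an insn that is a single SET.
struct SimpleSet
{
  bool dest_is_reg;
  unsigned dest_regno;
  bool src_is_reg;
  unsigned src_regno;
};

// True when INSN copies the pic pseudo PIC_PSEUDO into the pic hard
// register PIC_HARD_REG, i.e. the definition at which recording a
// conflict between the two would be wrong.
constexpr bool
is_pic_copy (const SimpleSet &insn, unsigned pic_pseudo,
             unsigned pic_hard_reg)
{
  return insn.dest_is_reg && insn.dest_regno == pic_hard_reg
         && insn.src_is_reg && insn.src_regno == pic_pseudo;
}

// insn 5 above: (set (reg:SI 3 bx) (reg:SI 82)).
static_assert (is_pic_copy (SimpleSet{true, 3, true, 82}, 82, 3),
               "copy from pic pseudo to pic hard reg");
// A copy from some other pseudo is not the special case.
static_assert (!is_pic_copy (SimpleSet{true, 3, true, 83}, 82, 3),
               "different source pseudo");
```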

H.J., can you confirm that the following patch not only fixes
the bug you opened, but also doesn't introduce any more?
Once I've updated PATCH 2, I'd like you to test/bless that
one as well.  Thanks.

Peter


gcc/
PR rtl-optimization/87479
* ira-lives.c (process_bb_node_lives): Move handling of pic pseudo
and pic hard reg conflict to the insn that sets pic hard reg.
* lra-lives.c (mark_regno_dead) : New function
argument.  Use it.
(process_bb_lives): Use new argument to mark_regno_dead.
Don't handle pic pseudo and pic hard reg conflict when processing
function call arguments.

Index: gcc/ira-lives.c
===
--- gcc/ira-lives.c (revision 264758)
+++ gcc/ira-lives.c (working copy)
@@ -1108,20 +1108,25 @@ process_bb_node_lives (ira_loop_tree_node_t loop_t
 
  call_p = CALL_P (insn);
 #ifdef REAL_PIC_OFFSET_TABLE_REGNUM
- int regno;
+ unsigned int regno;
+ rtx set;
  bool clear_pic_use_conflict_p = false;
- /* Processing insn usage in call insn can create conflict
-with pic pseudo and pic hard reg and that is wrong.
-Check this situation and fix it at the end of the insn
-processing.  */
- if (call_p && pic_offset_table_rtx != NULL_RTX
+ /* Processing insn definition of REAL_PIC_OFFSET_TABLE_REGNUM
+can create a conflict between the pic pseudo and pic hard reg
+and that is wrong.  Check this situation and fix it at the end
+of the insn processing.  */
+ if (pic_offset_table_rtx != NULL_RTX
  && (regno = REGNO (pic_offset_table_rtx)) >= FIRST_PSEUDO_REGISTER
- && (a = ira_curr_regno_allocno_map[regno]) != NULL)
+ && (a = ira_curr_regno_allocno_map[regno]) != NULL
+ && (set = single_set (insn)) != NULL_RTX
+ && REG_P (SET_DEST (set))
+ && REGNO (SET_DEST (set)) == REAL_PIC_OFFSET_TABLE_REGNUM
+ && REG_P (SET_SRC (set))
+ && REGNO (SET_SRC (set)) == regno)
clear_pic_use_conflict_p
-   = (find_regno_fusage (insn, USE, REAL_PIC_OFFSET_TABLE_REGNUM)
-  && ! TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS
-  (ALLOCNO_OBJECT (a, 0)),
-  REAL_PIC_OFFSET_TABLE_REGNUM));
+   = ! TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS
+  (ALLOCNO_OBJECT (a, 0)),
+  REAL_PIC_OFFSET_TABLE_REGNUM);
 #endif
 
  /* Mark each defined value as live.  We need to do this for
Index: gcc/lra-lives.c
===

Re: libgo patch committed: Update to 1.11 release

2018-10-01 Thread Ian Lance Taylor
On Mon, Oct 1, 2018 at 6:57 PM, H.J. Lu  wrote:
> On Mon, Oct 1, 2018 at 5:06 PM Ian Lance Taylor  wrote:
>>
>> On Mon, Oct 1, 2018 at 4:56 PM, H.J. Lu  wrote:
>> >
>> > Compared with my patch, there are some new failures:
>>
>> Thanks.  We probably need a patch in gcc/testsuite/go.test/go-test.exp
>> to set goarch to amd64p32 when appropriate.
>>
>> Other than that there seems to be some sort of signal handling
>> problem.  Hard to say what that might be.
>>
>
> Does amd64p32 disable any amd64 specific handling?

Not as far as I can see.

Ian


[PATCH][rs6000][PR target/87474] fix strncmp expansion with -mno-power8-vector

2018-10-01 Thread Aaron Sawdey
PR/87474 happens because I didn't check that both vector and VSX instructions
were enabled, so insns that are disabled get generated with -mno-power8-vector.
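
The corrected guard can be summarised with the small sketch below.  It
is a hypothetical model only: TargetFlags and use_vec_compare are
made-up names, and in GCC the three flags are target macros
(TARGET_32BIT, TARGET_EFFICIENT_UNALIGNED_VSX, TARGET_P8_VECTOR) rather
than struct members.

```cpp
// Hypothetical stand-ins for the target flags consulted by
// expand_strn_compare.
struct TargetFlags
{
  bool is_32bit;
  bool efficient_unaligned_vsx;
  bool p8_vector;
};

// Mirrors the corrected condition: the vector path needs at least 16
// bytes, a 64-bit target, efficient unaligned VSX, and P8 vector
// support.
constexpr bool
use_vec_compare (unsigned long bytes, const TargetFlags &t)
{
  return bytes >= 16 && !t.is_32bit && t.efficient_unaligned_vsx
         && t.p8_vector;
}

// With -mno-power8-vector the last flag is false, so the vector
// expansion is no longer chosen.
static_assert (use_vec_compare (32, TargetFlags{false, true, true}),
               "vector path allowed");
static_assert (!use_vec_compare (32, TargetFlags{false, true, false}),
               "vector path rejected without P8 vector");
```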

Regstrap passes on ppc64le, ok for trunk?

Thanks!
  Aaron



2018-10-01  Aaron Sawdey  

PR target/87474
* config/rs6000/rs6000-string.c (expand_strn_compare): Check that both
vector and VSX are enabled.


Index: gcc/config/rs6000/rs6000-string.c
===
--- gcc/config/rs6000/rs6000-string.c   (revision 264760)
+++ gcc/config/rs6000/rs6000-string.c   (working copy)
@@ -2205,6 +2205,7 @@
 }
   else
 {
+  /* Implies TARGET_P8_VECTOR here. */
   rtx diffix = gen_reg_rtx (DImode);
   rtx result_gbbd = gen_reg_rtx (V16QImode);
   /* Since each byte of the input is either 00 or FF, the bytes in
@@ -2313,9 +2314,12 @@
   /* Is it OK to use vec/vsx for this. TARGET_VSX means we have at
  least POWER7 but we use TARGET_EFFICIENT_UNALIGNED_VSX which is
  at least POWER8.  That way we can rely on overlapping compares to
- do the final comparison of less than 16 bytes.  Also I do not want
- to deal with making this work for 32 bits.  */
-  int use_vec = (bytes >= 16 && !TARGET_32BIT && TARGET_EFFICIENT_UNALIGNED_VSX);
+ do the final comparison of less than 16 bytes.  Also I do not
+ want to deal with making this work for 32 bits.  In addition, we
+ have to make sure that we have at least P8_VECTOR (we don't allow
+ P9_VECTOR without P8_VECTOR).  */
+  int use_vec = (bytes >= 16 && !TARGET_32BIT
+&& TARGET_EFFICIENT_UNALIGNED_VSX && TARGET_P8_VECTOR);

   if (use_vec)
 required_align = 16;


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain