Re: [PATCH] sparc: Set noexecstack on mulsi3, divsi3, and modsi3

2017-05-15 Thread Eric Botcazou
> This patch adds the missing noexecstack on sparc assembly implementation.  I
> saw no regression on gcc testsuite and it fixes the regression on GLIBC
> side.
> 
> libgcc/
> 
>   * config/sparc/lb1spc.S [__ELF__ && __linux__]: Emit .note.GNU-stack
>   section for a non-executable stack.

Thanks for fixing this, applied to all active branches.

-- 
Eric Botcazou


Re: [PATCH] Fix PR80222

2017-05-15 Thread Richard Biener
On Sat, 13 May 2017, Eric Botcazou wrote:

> > Does this happen on the GCC7 branch as well?  The patch just guards an
> > indirect ref folding (I refrained from trying to make it correct given I
> > think it's premature optimization).
> 
> No, mainline and GCC 7 branch are fine.  It appears that the folding
> (probably to BIT_FIELD_REF) is necessary to break apart the store on the
> 6 branch.
> 
> > I'll try to investigate on Monday if you don't beat me to it.  Feel free to
> > revert the backport in the meantime.
> 
> No urgency, it's rather marginal.
> 
> > Note I think you can trigger the same bug with some source changes
> > independent of the patch which means the GENERIC must be somehow invalid.
> 
> Possibly so, yes.

So on the branch the frontends generate

  *((long long int * {ref-all}) &p->b + 8) = 5;

that's bogus to the effect that the dereference happens in type
'long long int' rather than an unaligned variant of it.  On the
GCC 7 branch and trunk we now generate

  VIEW_CONVERT_EXPR(p->b)[1] = 5;

which does not introduce an artificial dereference and does the folding
(even for variable indices) directly here.  Which also hints at that
we mishandle

typedef long long V
__attribute__ ((vector_size (2 * sizeof (long long)), may_alias));

typedef struct S { V b; } P __attribute__((aligned (1)));

__attribute__((noinline, noclone)) void
bar (P *p, int i)
{
  p->b[i] = 5;
}

even before the patch:

  *((long long int * {ref-all}) &p->b + (sizetype) ((unsigned int) i * 8)) = 5;

which cannot be folded into a BIT_FIELD_REF but we end up with

  V * {ref-all} D.1412;
  unsigned int i.0;
  unsigned int D.1414;
  long long int * {ref-all} D.1415;

  D.1412 = &p->b;
  i.0 = (unsigned int) i;
  D.1414 = i.0 * 8;
  D.1415 = D.1412 + D.1414;
  *D.1415 = 5;

and

bar:
save    %sp, -96, %sp
st  %i0, [%fp+68]
st  %i1, [%fp+72]
ld  [%fp+68], %g2
ld  [%fp+72], %g1
sll %g1, 3, %g1
add %g2, %g1, %g1
mov 0, %g2
mov 5, %g3
std %g2, [%g1]
nop
restore
jmp %o7+8

which (not knowing SPARC assembly) looks bogus as well.

At this point I'll favor simply reverting the patch from the branch
rather than trying to backport the IL change for GENERIC.

Thus consider that done.

Richard.
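
The following is a minimal sketch, not part of the original mail, of how the test case above can be exercised. It reuses the `V`, `P` and `bar` definitions from the message; `read_elem` and `check_unaligned_store` are hypothetical helpers added for illustration. It assumes GCC (or a compatible compiler) with vector extensions and places the struct at an odd address so that the store in `bar` really is misaligned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef long long V
  __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));

typedef struct S { V b; } P __attribute__ ((aligned (1)));

/* Same function as in the mail: a variable-index store into a vector
   that lives in an under-aligned struct.  */
__attribute__ ((noinline, noclone)) void
bar (P *p, int i)
{
  p->b[i] = 5;
}

/* Hypothetical helper: read element I byte-wise with memcpy, so the
   check itself never performs an aligned 8-byte load.  */
long long
read_elem (const unsigned char *base, int i)
{
  long long v;
  assert (i >= 0 && i < 2);
  memcpy (&v, base + (size_t) i * sizeof v, sizeof v);
  return v;
}

/* Hypothetical self-check: returns 1 if the store landed in element 1
   and left element 0 untouched, 0 otherwise.  */
int
check_unaligned_store (void)
{
  /* calloc returns suitably aligned memory, so buf + 1 is misaligned
     for an 8-byte access.  */
  unsigned char *buf = calloc (1, 1 + sizeof (V));
  if (buf == NULL)
    return 0;
  bar ((P *) (buf + 1), 1);
  int ok = read_elem (buf + 1, 0) == 0 && read_elem (buf + 1, 1) == 5;
  free (buf);
  return ok;
}
```

On a strict-alignment target a compiler that performs the dereference in plain 'long long int' may emit a doubleword store here (like the `std` above), which would be expected to trap rather than return.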


[Ada] Relax alignment constraint for tagged types on x86

2017-05-15 Thread Eric Botcazou
This partially relaxes alignment constraint for tagged types on x86 and, more 
generally, architectures that do not require strict alignment for memory 
accesses, for historical reasons: the compiler now accepts size clauses on 
record type extensions that effectively lower the alignment of the type, if 
there is also a representation clause on the type.

Tested on x86_64-suse-linux, applied on the mainline.


2017-05-15  Eric Botcazou  

* gcc-interface/decl.c (gnat_to_gnu_entity) : When there
is a representation clause on an extension, propagate the alignment of
the parent type only if the platform requires strict alignment.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 247951)
+++ gcc-interface/decl.c	(working copy)
@@ -204,6 +204,7 @@ static tree elaborate_expression_2 (tree
 static tree elaborate_reference (tree, Entity_Id, bool, tree *);
 static tree gnat_to_gnu_component_type (Entity_Id, bool, bool);
 static tree gnat_to_gnu_subprog_type (Entity_Id, bool, bool, tree *);
+static int adjust_packed (tree, tree, int);
 static tree gnat_to_gnu_field (Entity_Id, tree, int, bool, bool);
 static tree gnu_ext_name_for_subprog (Entity_Id, tree);
 static tree change_qualified_type (tree, int);
@@ -3094,6 +3095,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	Entity_Id gnat_parent = Parent_Subtype (gnat_entity);
 	tree gnu_dummy_parent_type = make_node (RECORD_TYPE);
 	tree gnu_parent;
+	int parent_packed = 0;
 
 	/* A major complexity here is that the parent subtype will
 	   reference our discriminants in its Stored_Constraint list.
@@ -3172,7 +3174,17 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	   be created with a component clause below, then we need
 	   to apply the same adjustment as in gnat_to_gnu_field.  */
 	if (has_rep && TYPE_ALIGN (gnu_type) < TYPE_ALIGN (gnu_parent))
-	  SET_TYPE_ALIGN (gnu_type, TYPE_ALIGN (gnu_parent));
+	  {
+		/* ??? For historical reasons, we do it on strict-alignment
+		   platforms only, where it is really required.  This means
+		   that a confirming representation clause will change the
+		   behavior of the compiler on the other platforms.  */
+		if (STRICT_ALIGNMENT)
+		  SET_TYPE_ALIGN (gnu_type, TYPE_ALIGN (gnu_parent));
+		else
+		  parent_packed
+		= adjust_packed (gnu_parent, gnu_type, parent_packed);
+	  }
 
 	/* Finally we fix up both kinds of twisted COMPONENT_REF we have
 	   initially built.  The discriminants must reference the fields
@@ -3218,7 +3230,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
    ? TYPE_SIZE (gnu_parent) : NULL_TREE,
    has_rep
    ? bitsize_zero_node : NULL_TREE,
-   0, 1);
+   parent_packed, 1);
 	DECL_INTERNAL_P (gnu_field) = 1;
 	TREE_OPERAND (gnu_get_parent, 1) = gnu_field;
 	TYPE_FIELDS (gnu_type) = gnu_field;


Re: dejagnu version update?

2017-05-15 Thread Richard Biener
On Mon, May 15, 2017 at 12:09 AM, NightStrike  wrote:
> On Sat, May 13, 2017 at 4:39 PM, Jeff Law  wrote:
>> On 05/13/2017 04:38 AM, Jakub Jelinek wrote:
>>>
>>> On Sat, May 13, 2017 at 12:24:12PM +0200, Bernhard Reutner-Fischer wrote:

>>>> I guess neither redhat
>>>> (https://access.redhat.com/downloads/content/dejagnu/ redirects to a
>>>> login page but there seem to be 1.5.1 packages) nor SuSE did update
>>>> dejagnu in the meantime.
>>>
>>>
>>> Fedora has dejagnu-1.6 in Fedora 25 and later, dejagnu-1.5.3 in Fedora 24,
>>> older
>>> Fedora versions are EOL.  RHEL 7 has dejagnu-1.5.1, RHEL 6 as well as RHEL
>>> 5 has
>>> dejagnu-1.4.4, older RHEL versions are EOL.
>>
>> RHEL-5 is old enough that IMHO it ought not figure into this discussion.
>> RHEL-6 is probably close to if not past that same point as well.
>
> FWIW, I still run the testsuite on RHEL 6.

Both SLE-11 and SLE-12 use dejagnu 1.4.4, so does openSUSE Leap 42.[12].
Tumbleweed uses 1.6 so new SLE will inherit that.  But I still do all
of my testing
on systems with just dejagnu 1.4.4.

Richard.


[Ada] A bit of housekeeping work in gigi

2017-05-15 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2017-05-15  Eric Botcazou  

* gcc-interface/trans.c (gnat_to_gnu) : Fix formatting.
: Use properly typed constants.
(extract_values): Move around.
(pos_to_constructor): Minor tweaks.
(Sloc_to_locus): Fix formatting.
* gcc-interface/utils.c (process_deferred_decl_context): Minor tweaks.
* gcc-interface/gigi.h (MARK_VISITED): Remove blank line.
(Gigi_Equivalent_Type): Adjust head comment.
* gcc-interface/decl.c (Gigi_Equivalent_Type): Likewise.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 248049)
+++ gcc-interface/decl.c	(working copy)
@@ -3270,12 +3270,12 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	}
 
 	/* If we have a derived untagged type that renames discriminants in
-	   the root type, the (stored) discriminants are just a copy of the
-	   discriminants of the root type.  This means that any constraints
-	   added by the renaming in the derivation are disregarded as far
-	   as the layout of the derived type is concerned.  To rescue them,
-	   we change the type of the (stored) discriminants to a subtype
-	   with the bounds of the type of the visible discriminants.  */
+	   the parent type, the (stored) discriminants are just a copy of the
+	   discriminants of the parent type.  This means that any constraints
+	   added by the renaming in the derivation are disregarded as far as
+	   the layout of the derived type is concerned.  To rescue them, we
+	   change the type of the (stored) discriminants to a subtype with
+	   the bounds of the type of the visible discriminants.  */
 	if (has_discr
 	&& !is_extension
 	&& Stored_Constraint (gnat_entity) != No_Elist)
@@ -4967,12 +4967,10 @@ finalize_from_limited_with (void)
 }
 }
 
-/* Return the equivalent type to be used for GNAT_ENTITY, if it's a
-   kind of type (such E_Task_Type) that has a different type which Gigi
-   uses for its representation.  If the type does not have a special type
-   for its representation, return GNAT_ENTITY.  If a type is supposed to
-   exist, but does not, abort unless annotating types, in which case
-   return Empty.  If GNAT_ENTITY is Empty, return Empty.  */
+/* Return the equivalent type to be used for GNAT_ENTITY, if it's a kind
+   of type (such E_Task_Type) that has a different type which Gigi uses
+   for its representation.  If the type does not have a special type for
+   its representation, return GNAT_ENTITY.  */
 
 Entity_Id
 Gigi_Equivalent_Type (Entity_Id gnat_entity)
Index: gcc-interface/gigi.h
===
--- gcc-interface/gigi.h	(revision 247951)
+++ gcc-interface/gigi.h	(working copy)
@@ -88,7 +88,6 @@ extern void mark_visited (tree t);
 
 /* This macro calls the above function but short-circuits the common
case of a constant to save time and also checks for NULL.  */
-
 #define MARK_VISITED(EXP)		\
 do {	\
   if((EXP) && !CONSTANT_CLASS_P (EXP))	\
@@ -98,12 +97,10 @@ do {	\
 /* Finalize the processing of From_Limited_With incomplete types.  */
 extern void finalize_from_limited_with (void);
 
-/* Return the equivalent type to be used for GNAT_ENTITY, if it's a
-   kind of type (such E_Task_Type) that has a different type which Gigi
-   uses for its representation.  If the type does not have a special type
-   for its representation, return GNAT_ENTITY.  If a type is supposed to
-   exist, but does not, abort unless annotating types, in which case
-   return Empty.   If GNAT_ENTITY is Empty, return Empty.  */
+/* Return the equivalent type to be used for GNAT_ENTITY, if it's a kind
+   of type (such E_Task_Type) that has a different type which Gigi uses
+   for its representation.  If the type does not have a special type for
+   its representation, return GNAT_ENTITY.  */
 extern Entity_Id Gigi_Equivalent_Type (Entity_Id gnat_entity);
 
 /* Given GNAT_ENTITY, elaborate all expressions that are required to
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 247951)
+++ gcc-interface/trans.c	(working copy)
@@ -237,7 +237,6 @@ static tree build_binary_op_trapv (enum
 static tree convert_with_check (Entity_Id, tree, bool, bool, bool, Node_Id);
 static bool addressable_p (tree, tree);
 static tree assoc_to_constructor (Entity_Id, Node_Id, tree);
-static tree extract_values (tree, tree);
 static tree pos_to_constructor (Node_Id, tree, Entity_Id);
 static void validate_unchecked_conversion (Node_Id);
 static tree maybe_implicit_deref (tree);
@@ -6497,8 +6496,7 @@ gnat_to_gnu (Node_Id gnat_node)
 	  gnu_aggr_type = TYPE_REPRESENTATIVE_ARRAY (gnu_result_type);
 
 	if (Null_Record_Present (gnat_node))
-	  gnu_result = gnat_build_constructor (gnu_aggr_type,
-	   NULL);
+	  gnu_result = gnat_build_constructor (gnu_

[Ada] Small tweaks related to inlining

2017-05-15 Thread Eric Botcazou
In Ada, pragma Inline_Always (which is specific to GNAT) guarantees both that 
all direct calls are inlined and that there are no indirect references to the 
subprogram.  That's why the out-of-line body of a subprogram subject to pragma 
Inline_Always can always be eliminated.

Tested on x86_64-suse-linux, applied on the mainline.


2017-05-15  Eric Botcazou  

* gcc-interface/trans.c (Compilation_Unit_to_gnu): Skip subprograms on
the inlined list that are not public.
* gcc-interface/utils.c (create_subprog_decl): Clear TREE_PUBLIC if
there is a pragma Inline_Always on the subprogram.

-- 
Eric Botcazou

Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 248050)
+++ gcc-interface/trans.c	(working copy)
@@ -5472,6 +5472,15 @@ Compilation_Unit_to_gnu (Node_Id gnat_no
   if (!optimize && !Has_Pragma_Inline_Always (gnat_entity))
 	continue;
 
+  /* The set of inlined subprograms is computed from data recorded early
+	 during expansion and it can be a strict superset of the final set
+	 computed after semantic analysis, for example if a call to such a
+	 subprogram occurs in a pragma Assert and assertions are disabled.
+	 In that case, semantic analysis resets Is_Public to false but the
+	 entry for the subprogram in the inlining tables is stalled.  */
+  if (!Is_Public (gnat_entity))
+	continue;
+
   gnat_body = Parent (Declaration_Node (gnat_entity));
   if (Nkind (gnat_body) != N_Subprogram_Body)
 	{
Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 248050)
+++ gcc-interface/utils.c	(working copy)
@@ -3220,10 +3220,19 @@ create_subprog_decl (tree name, tree asm
 
 case is_required:
   if (Back_End_Inlining)
-	decl_attributes (&subprog_decl,
-			 tree_cons (get_identifier ("always_inline"),
-NULL_TREE, NULL_TREE),
-			 ATTR_FLAG_TYPE_IN_PLACE);
+	{
+	  decl_attributes (&subprog_decl,
+			   tree_cons (get_identifier ("always_inline"),
+  NULL_TREE, NULL_TREE),
+			   ATTR_FLAG_TYPE_IN_PLACE);
+
+	  /* Inline_Always guarantees that every direct call is inlined and
+	 that there is no indirect reference to the subprogram, so the
+	 instance in the original package (as well as its clones in the
+	 client packages created for inter-unit inlining) can be made
+	 private, which causes the out-of-line body to be eliminated.  */
+	  TREE_PUBLIC (subprog_decl) = 0;
+	}
 
   /* ... fall through ... */
 


Re: PR78972, 80283: Extend TER with scheduling

2017-05-15 Thread Richard Biener
On Fri, May 12, 2017 at 7:51 PM, Bernd Schmidt  wrote:
> If you look at certain testcases like the one for PR78972, you'll find that
> the code generated by TER is maximally pessimal in terms of register
> pressure: we can generate a large number of intermediate results, and defer
> all the statements that use them up.
>
> Another observation one can make is that doing TER doesn't actually buy us
> anything for a large subset of the values it finds: only a handful of places
> in the expand phase actually make use of the information. In cases where we
> know we aren't going to be making use of it, we could move expressions
> freely without doing TER-style substitution.
>
> This patch uses the information collected by TER about the moveability of
> statements and performs a mini scheduling pass with the aim of reducing
> register pressure. The heuristic is fairly simple: something that consumes
> more values than it produces is preferred. This could be tuned further, but
> it already produces pretty good results: for the 78972 testcase, the stack
> size is reduced from 2448 bytes to 288, and for PR80283, the stackframe of
> 496 bytes vanishes with the pass enabled.
>
> In terms of benchmarks I've run SPEC a few times, and the last set of
> results showed not much of a change. Getting reproducible results has been
> tricky but all runs I made have been within 0%-1% improvement.
>
> In this patch, the changed behaviour is gated with a -fschedule-ter option
> which is off by default; with that default it bootstraps and tests without
> regressions. The compiler also bootstraps with the option enabled, in that
> case there are some optimization issues. I'll address some of them with two
> followup patches, the remaining failures are:
>  * a handful of guality/PR43077.c failures
>    Debug insn generation is somewhat changed, and the peephole2 pass
>    drops one of them on the floor.
>  * three target/i386/bmi-* tests fail. These expect the combiner to
>    build certain instruction patterns, and that turns out to be a
>    little fragile. It would be nice to be able to use match.pd to
>    produce target-specific patterns during expand.
>
> Thoughts? Ok to apply?

I appreciate that you experimented with partially disabling TER.  Last year
I tried to work towards this in a more aggressive way:

https://gcc.gnu.org/ml/gcc-patches/2016-06/msg02062.html

that patch tried to preserve the scheduling effect of TER because there's
on my list of nice things to have a GIMPLE scheduling pass that should
try to reduce (SSA) register pressure and that can work with GIMPLE
data dependences.

One of the goals of the patch above was to actually _see_ the scheduling
effects in the IL.

So what I'd like to see is a simple single-BB scheduling pass right before
RTL expansion (so we get a dump file).  That can use your logic (and
"TERable" would be simply having single-uses).  The advantage of doing
this before RTL expansion is that coalescing can benefit from the scheduling
as well.

Then simply disable TER for the decide_schedule_stmt () defs during
RTL expansion.

That means the effect of TER scheduling is not fully visible but we're
a step closer.  It also means that some of the scheduling we did
in the simple scheduler persists anyway because coalescing / TER
wasn't going to undo it anyway.

In the (very) distant future I'd like to perform (more) instruction selection
on GIMPLE so that all the benefits of TER are applied before RTL
expansion.

+  tree_code c = gimple_assign_rhs_code (use_stmt);
+  if (TREE_CODE_CLASS (c) != tcc_comparison
+ && c != FMA_EXPR
+ && c != SSA_NAME
+ && c != MEM_REF
+ && c != TARGET_MEM_REF
+ && def_c != VIEW_CONVERT_EXPR)

I think on some archs it was important to handle combining
POINTER_PLUS_EXPR with NEGATE_EXPR of the offset.

Anyway, the effects of TER and where it matters are hard to
see given its recursive nature (and the history of trying to
preserve expanding of "large" GENERIC trees ...).  One would
think combine should be able to handle all those cases
(for example the FMA_EXPR one from above), but it clearly
isn't (esp. in the case of forwarding memory references).

Richard.

>
> Bernd


Re: [PATCH][GCC][AArch64] Fix subreg bug in scalar copysign

2017-05-15 Thread Tamar Christina
ping

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, May 2, 2017 10:08:35 AM
To: GCC Patches
Cc: nd; James Greenhalgh; Richard Earnshaw; Marcus Shawcroft
Subject: Re: [PATCH][GCC][AArch64] Fix subreg bug in scalar copysign

Ping

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Wednesday, March 15, 2017 4:04:35 PM
To: GCC Patches
Cc: nd; James Greenhalgh; Richard Earnshaw; Marcus Shawcroft
Subject: [PATCH][GCC][AArch64] Fix subreg bug in scalar copysign

Hi All,

This fixes a bug in the scalar version of copysign where, due to a subreg,
we were generating less than efficient code.

This patch replaces

  return x * __builtin_copysignf (150.0f, y);

which used to generate

adrp    x1, .LC1
mov x0, 2147483648
ins v3.d[0], x0
ldr s2, [x1, #:lo12:.LC1]
bsl v3.8b, v1.8b, v2.8b
fmul    s0, s0, s3
ret

.LC1:
.word   1125515264

with
mov x0, 1125515264
movi    v2.2s, 0x80, lsl 24
fmov    d3, x0
bit v3.8b, v1.8b, v2.8b
fmul    s0, s0, s3
ret

removing the incorrect ins.

Regression tested on aarch64-none-linux-gnu and no regressions.

OK for trunk?

Thanks,
Tamar

gcc/
2017-03-15  Tamar Christina  

* config/aarch64/aarch64.md
(copysignsf3): Fix mask generation.


Re: [ARM] Enable FP16 vector arithmetic operations.

2017-05-15 Thread Tamar Christina
Ping

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, May 2, 2017 3:46:49 PM
To: Matthew Wahab; gcc-patches
Cc: nd; ni...@redhat.com; Richard Earnshaw; Ramana Radhakrishnan; Kyrylo Tkachov
Subject: Re: [ARM] Enable FP16 vector arithmetic operations.

Hi All,

I'm taking this one over from Matthew, I think it slipped through the cracks 
before.

Since it still applies cleanly on trunk I'm just pinging it.

Ok for trunk?

Tamar

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Matthew Wahab 
Sent: Friday, September 23, 2016 4:02 PM
To: gcc-patches
Subject: [ARM] Enable FP16 vector arithmetic operations.

Hello,

Support for the ARMv8.2-A FP16 NEON arithmetic instructions was added
using non-standard names for the instruction patterns. This was needed
because the NEON floating point semantics meant that their use by the
compiler for HFmode arithmetic operations needed to be restricted. This
follows the implementation for 32-bit NEON instructions.

As with the 32-bit instructions, the restriction on the HFmode
operation can be lifted when -funsafe-math-optimizations is
enabled. This patch does that, defining the standard pattern names
addhf3, subhf3, mulhf3 and fmahf3.

This patch also updates the NEON intrinsics to use the arithmetic
operations when -ffast-math is enabled. This is to keep the 16-bit
support consistent with the 32-bit support. It is needed so that code
using the f16 intrinsics is subject to the same optimizations as code
using the f32 intrinsics would be.

Tested for arm-none-linux-gnueabihf with native bootstrap and make check
on ARMv8-A and for arm-none-eabi and armeb-none-eabi with cross-compiled
make check on an ARMv8.2-A emulator.

Ok for trunk?
Matthew

gcc/
2016-09-23  Matthew Wahab  

* config/arm/arm_neon.h (vadd_f16): Use standard arithmetic
operations in fast-math mode.
(vaddq_f16): Likewise.
(vmul_f16): Likewise.
(vmulq_f16): Likewise.
(vsub_f16): Likewise.
(vsubq_f16): Likewise.
* config/arm/neon.md (add3): New.
(sub3): New.
(fma:3): New.  Also remove outdated comment.
(mul3): New.

testsuite/
2016-09-23  Matthew Wahab  

* gcc.target/arm/armv8_2-fp16-arith-1.c: Expand comment.  Update
expected output of vadd, vsub and vmul instructions.
* gcc.target/arm/armv8_2-fp16-arith-2.c: New.
* gcc.target/arm/armv8_2-fp16-neon-2.c: New.
* gcc.target/arm/armv8_2-fp16-neon-3.c: New.


Re: [PATCH][GCC][AARCH64]Adjust costs so udiv is preferred over sdiv when both are valid. [Patch (1/2)]

2017-05-15 Thread Tamar Christina
Ping

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, May 2, 2017 4:37:16 PM
To: GCC Patches
Cc: nd; Richard Earnshaw; Marcus Shawcroft; James Greenhalgh
Subject: [PATCH][GCC][AARCH64]Adjust costs so udiv is preferred over sdiv when 
both are valid. [Patch (1/2)]

Hi All,

This patch adjusts the cost model so that when both sdiv and udiv are possible
it prefers udiv over sdiv. This was done by making sdiv slightly more expensive
instead of making udiv cheaper to keep the baseline costs of a division the same
as before.

For aarch64 this patch along with my other two related mid-end changes
makes a big difference in division by constants.

Given:

int f2(int x)
{
  return ((x * x) % 300) + ((x * x) / 300);
}

we now generate

f2:
mul w0, w0, w0
mov w1, 33205
movk    w1, 0x1b4e, lsl 16
mov w2, 300
umull   x1, w0, w1
lsr x1, x1, 37
msub    w0, w1, w2, w0
add w0, w0, w1
ret

as opposed to

f2:
mul w0, w0, w0
mov w2, 33205
movk    w2, 0x1b4e, lsl 16
mov w3, 300
smull   x1, w0, w2
umull   x2, w0, w2
asr x1, x1, 37
sub w1, w1, w0, asr 31
lsr x2, x2, 37
msub    w0, w1, w3, w0
add w0, w0, w2
ret

Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  

* config/aarch64/aarch64.c (aarch64_rtx_costs): Make sdiv more 
expensive than udiv.
Remove floating point cases from mod.

gcc/testsuite/
2017-05-02  Tamar Christina  

* gcc.target/aarch64/sdiv_costs_1.c: New.
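
The unsigned sequence above can be modelled in C as follows. This is a sketch for illustration: the magic constant is read off the `mov`/`movk` pair (33205 | 0x1b4e << 16), the shift of 37 comes from the `lsr`, and the function names are hypothetical. Because `x * x` is never negative (when it does not overflow), the unsigned division gives the same result as the signed one, which is why udiv is valid here.

```c
#include <assert.h>
#include <stdint.h>

/* Constant built by "mov w1, 33205; movk w1, 0x1b4e, lsl 16".
   It equals ceil (2^37 / 300).  */
#define DIV300_MAGIC ((UINT32_C (0x1b4e) << 16) | UINT32_C (33205))

/* n / 300 for any 32-bit unsigned n: umull followed by lsr #37.  */
uint32_t
udiv300 (uint32_t n)
{
  return (uint32_t) (((uint64_t) n * DIV300_MAGIC) >> 37);
}

/* Model of f2 from the mail using the unsigned expansion.  */
int
f2_model (int x)
{
  /* x * x must not overflow int; 46340^2 still fits in 31 bits.  */
  assert (x >= -46340 && x <= 46340);
  uint32_t sq = (uint32_t) (x * x);
  uint32_t q = udiv300 (sq);     /* (x * x) / 300 */
  uint32_t r = sq - q * 300;     /* msub: (x * x) % 300 */
  return (int) (r + q);
}
```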


Re: [PATCH][GCC][ARM] Adjust costs so udiv is preferred over sdiv when both are valid. [Patch (2/2)]

2017-05-15 Thread Tamar Christina
Ping

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, May 2, 2017 4:37:12 PM
To: GCC Patches
Cc: nd; Kyrylo Tkachov; Ramana Radhakrishnan; Richard Earnshaw; ni...@redhat.com
Subject: [PATCH][GCC][ARM] Adjust costs so udiv is preferred over sdiv when 
both are valid. [Patch (2/2)]

Hi All,

This patch adjusts the cost model so that when both sdiv and udiv are possible
it prefers udiv over sdiv. This was done by making sdiv slightly more expensive
instead of making udiv cheaper to keep the baseline costs of a division the same
as before.

Similar to aarch64 this patch along with my other two related mid-end changes
makes a big difference in division by constants.

Given:

int f2(int x)
{
  return ((x * x) % 300) + ((x * x) / 300);
}

we now generate

f2:
mul r3, r0, r0
mov r0, r3
ldr r1, .L3
umull   r2, r3, r0, r1
lsr r2, r3, #5
add r3, r2, r2, lsl #2
rsb r3, r3, r3, lsl #4
sub r0, r0, r3, lsl #2
add r0, r0, r2
bx  lr

as opposed to

f2:
mul r3, r0, r0
mov r0, r3
ldr r3, .L4
push    {r4, r5}
smull   r4, r5, r0, r3
asr r3, r0, #31
rsb r3, r3, r5, asr #5
add r2, r3, r3, lsl #2
rsb r2, r2, r2, lsl #4
sub r0, r0, r2, lsl #2
add r0, r0, r3
pop {r4, r5}
bx  lr

Bootstrapped and reg tested on arm-none-eabi
with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  

* config/arm/arm.c (arm_rtx_costs_internal): Make sdiv more expensive 
than udiv.


gcc/testsuite/
2017-05-02  Tamar Christina  

* gcc.target/arm/sdiv_costs_1.c: New.
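
In the 32-bit ARM sequence the multiplication by 300 for the remainder is itself done with shifts and adds. A hedged C model follows; the literal loaded from `.L3` is not shown in the mail, so the constant below is an assumption (the same value the AArch64 code builds), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed content of the .L3 literal: ceil (2^37 / 300).  */
#define DIV300_MAGIC UINT32_C (0x1b4e81b5)

/* q * 300 computed the way the ARM code does it.  */
uint32_t
mul300_shift_add (uint32_t q)
{
  uint32_t t = q + (q << 2);   /* add r3, r2, r2, lsl #2  ->   5 * q */
  t = (t << 4) - t;            /* rsb r3, r3, r3, lsl #4  ->  75 * q */
  return t << 2;               /* "lsl #2" on the final sub -> 300 * q */
}

/* n % 300 for unsigned n, following the umull / lsr / sub sequence.  */
uint32_t
umod300 (uint32_t n)
{
  uint32_t hi = (uint32_t) (((uint64_t) n * DIV300_MAGIC) >> 32);  /* umull */
  uint32_t q = hi >> 5;                                            /* lsr #5 */
  return n - mul300_shift_add (q);
}

/* Hypothetical self-check of the shift-add decomposition.  */
void
check_mul300 (void)
{
  for (uint32_t q = 0; q < 100000; q += 37)
    assert (mul300_shift_add (q) == q * 300u);
}
```

The decomposition works because 300 = 4 * (16 * 5 - 5), so three shifted operations replace one multiply.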


Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-05-15 Thread Tamar Christina
Ping

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, May 2, 2017 4:37:21 PM
To: GCC Patches
Cc: nd; Kyrylo Tkachov; Richard Earnshaw; Marcus Shawcroft; James Greenhalgh; 
ni...@redhat.com; Ramana Radhakrishnan
Subject: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

Hi All,

This patch adjusts the cost model for Cortex-A53 to increase the costs of
an integer division. The reason for this is that we want to always expand
the division to a multiply when doing a division by constant.

On the Cortex-A53 shifts are modeled to cost 1 instruction,
when doing the expansion we have to perform two shifts and an addition.
However because the cost model can't model things such as fusing of shifts,
we have to fully cost both shifts.

This leads to the cost model telling us that for the Cortex-A53 we can never
do the expansion. By increasing the costs of the division by two instructions
we recover the room required in the cost calculation to do the expansions.

The reason for all of this is that currently the code does not produce what
you'd expect, which is that divisions by constants are always expanded. It is
also inconsistent, because unsigned division does get expanded.

This all reduces the ability to do CSE when using signed modulo, since that
one is also expanded.

Given:

void f5(void)
{
  int x = 0;
  while (x > -1000)
  {
g(x % 300);
x--;
  }
}


we now generate

smull   x0, w19, w21
asr x0, x0, 37
sub w0, w0, w19, asr 31
msub    w0, w0, w20, w19
sub w19, w19, #1
bl  g

as opposed to

sdiv    w0, w19, w20
msub    w0, w0, w20, w19
sub w19, w19, #1
bl  g


Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  

* config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv 
cost.
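
The expanded signed sequence above (smull, asr, sub with "asr 31", msub) can be modelled in C like this. It is a sketch under assumptions: the value held in w21 is not shown in the mail, so the magic constant is assumed to be the usual ceil(2^37 / 300), the function names are hypothetical, and right shifts of negative values are assumed to be arithmetic (as they are with GCC).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed multiplier: ceil (2^37 / 300).  */
#define DIV300_MAGIC INT64_C (0x1b4e81b5)

/* x / 300 with truncation toward zero, without a divide instruction.  */
int32_t
sdiv300 (int32_t x)
{
  /* The model relies on arithmetic right shift of negative values.  */
  assert ((-1 >> 1) == -1);
  int64_t prod = (int64_t) x * DIV300_MAGIC;   /* smull */
  int32_t q = (int32_t) (prod >> 37);          /* asr #37: floor quotient */
  return q - (x >> 31);                        /* add 1 when x is negative */
}

/* x % 300, as computed by the msub.  */
int32_t
smod300 (int32_t x)
{
  return x - sdiv300 (x) * 300;
}
```

The final subtraction of `x >> 31` corrects the floor result of the shift to the truncating quotient that C requires for negative operands.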


[Ada] Fix ICE on corner case in Ada.Iterator_Interfaces

2017-05-15 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2017-05-15  Eric Botcazou  

* gcc-interface/trans.c (Identifier_to_gnu): Also accept incomplete
types not coming from a limited context.

-- 
Eric Botcazou

Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 248051)
+++ gcc-interface/trans.c	(working copy)
@@ -1044,7 +1044,7 @@ Identifier_to_gnu (Node_Id gnat_node, tr
 		  && (Etype (gnat_node)
 		  == Packed_Array_Impl_Type (gnat_temp_type)))
 	  || (Is_Class_Wide_Type (Etype (gnat_node)))
-	  || (IN (Ekind (gnat_temp_type), Private_Kind)
+	  || (IN (Ekind (gnat_temp_type), Incomplete_Or_Private_Kind)
 		  && Present (Full_View (gnat_temp_type))
 		  && ((Etype (gnat_node) == Full_View (gnat_temp_type))
 		  || (Is_Packed (Full_View (gnat_temp_type))


Re: PR78972, 80283: Extend TER with scheduling

2017-05-15 Thread Bin.Cheng
On Mon, May 15, 2017 at 9:27 AM, Richard Biener
 wrote:
> On Fri, May 12, 2017 at 7:51 PM, Bernd Schmidt  wrote:
>> If you look at certain testcases like the one for PR78972, you'll find that
>> the code generated by TER is maximally pessimal in terms of register
>> pressure: we can generate a large number of intermediate results, and defer
>> all the statements that use them up.
>>
>> Another observation one can make is that doing TER doesn't actually buy us
>> anything for a large subset of the values it finds: only a handful of places
>> in the expand phase actually make use of the information. In cases where we
>> know we aren't going to be making use of it, we could move expressions
>> freely without doing TER-style substitution.
>>
>> This patch uses the information collected by TER about the moveability of
>> statements and performs a mini scheduling pass with the aim of reducing
>> register pressure. The heuristic is fairly simple: something that consumes
>> more values than it produces is preferred. This could be tuned further, but
>> it already produces pretty good results: for the 78972 testcase, the stack
>> size is reduced from 2448 bytes to 288, and for PR80283, the stackframe of
>> 496 bytes vanishes with the pass enabled.
>>
>> In terms of benchmarks I've run SPEC a few times, and the last set of
>> results showed not much of a change. Getting reproducible results has been
>> tricky but all runs I made have been within 0%-1% improvement.
>>
>> In this patch, the changed behaviour is gated with a -fschedule-ter option
>> which is off by default; with that default it bootstraps and tests without
>> regressions. The compiler also bootstraps with the option enabled, in that
>> case there are some optimization issues. I'll address some of them with two
>> followup patches, the remaining failures are:
>>  * a handful of guality/PR43077.c failures
>>    Debug insn generation is somewhat changed, and the peephole2 pass
>>    drops one of them on the floor.
>>  * three target/i386/bmi-* tests fail. These expect the combiner to
>>    build certain instruction patterns, and that turns out to be a
>>    little fragile. It would be nice to be able to use match.pd to
>>    produce target-specific patterns during expand.
>>
>> Thoughts? Ok to apply?
>
> I appreciate that you experimented with partially disabling TER.  Last year
> I tried to work towards this in a more aggressive way:
>
> https://gcc.gnu.org/ml/gcc-patches/2016-06/msg02062.html
>
> that patch tried to preserve the scheduling effect of TER because there's
> on my list of nice things to have a GIMPLE scheduling pass that should
> try to reduce (SSA) register pressure and that can work with GIMPLE
> data dependences.
>
> One of the goals of the patch above was to actually _see_ the scheduling
> effects in the IL.
>
> So what I'd like to see is a simple single-BB scheduling pass right before
> RTL expansion (so we get a dump file).  That can use your logic (and
I had a simple scheduler pass based on register pressure patches
posted last week, but it's totally based on live range information.
> "TERable" would be simply having single-uses).  The advantage of doing
> this before RTL expansion is that coalescing can benefit from the scheduling
> as well.
>
> Then simply disable TER for the decide_schedule_stmt () defs during
> RTL expansion.
>
> That means the effect of TER scheduling is not fully visible but we're
> a step closer.  It also means that some of the scheduling we did
> in the simple scheduler persists anyway because coalescing / TER
> wasn't going to undo it anyway.
>
> In the (very) distant future I'd like to perform (more) instruction selection
> on GIMPLE so that all the benefits of TER are applied before RTL
> expansion.
>
> +  tree_code c = gimple_assign_rhs_code (use_stmt);
> +  if (TREE_CODE_CLASS (c) != tcc_comparison
> + && c != FMA_EXPR
> + && c != SSA_NAME
> + && c != MEM_REF
> + && c != TARGET_MEM_REF
> + && def_c != VIEW_CONVERT_EXPR)
>
> I think on some archs it was important to handle combining
> POINTER_PLUS_EXPR with NEGATE_EXPR of the offset.
>
> Anyway, the effects of TER and where it matters are hard to
> see given its recursive nature (and the history of trying to
> preserve expanding of "large" GENERIC trees ...).  One would
> think combine should be able to handle all those cases
> (for example the FMA_EXPR one from above), but it clearly
> isn't (esp. in the case of forwarding memory references).
Another example on aarch64 is that TER can generate conditional compare
(ccmp) instructions, which combine cannot do when TER is disabled.

Thanks,
bin
>
> Richard.
>
>>
>> Bernd


Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-05-15 Thread Ramana Radhakrishnan
On Tue, May 2, 2017 at 4:37 PM, Tamar Christina  wrote:
> Hi All,
>
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
>
> On the Cortex-A53 shifts are modeled to cost 1 instruction,
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
>
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
>
> The reason for all of this is that currently the code does not produce what 
> you'd expect,
> which is that division by constants are always expanded. Also it's 
> inconsistent because
> unsigned division does get expanded.
>
> This all reduces the ability to do CSE when using signed modulo since that 
> one is also expanded.
>
> Given:
>
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
>     g(x % 300);
>     x--;
>   }
> }
>
>
> we now generate
>
> smull   x0, w19, w21
> asr     x0, x0, 37
> sub     w0, w0, w19, asr 31
> msub    w0, w0, w20, w19
> sub     w19, w19, #1
> bl      g
>
> as opposed to
>
> sdiv    w0, w19, w20
> msub    w0, w0, w20, w19
> sub     w19, w19, #1
> bl      g
>
>
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

Since this affects the arm port as well, it needs to be regression
tested on arm as well.

Thanks,
Ramana

>
> OK for trunk?
>
> Thanks,
> Tamar
>
>
> gcc/
> 2017-05-02  Tamar Christina  
>
> * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase 
> idiv cost.
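
For readers who want to check the arithmetic, here is a minimal C sketch of
the expansion shown in the quoted assembly.  The multiplier is an assumption:
the listing keeps it in w21 without showing its value, and 458129845, i.e.
ceil(2^37 / 300), is the value that matches the shift by 37.  The function
names are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed magic constant: ceil(2^37 / 300).  Not shown in the quoted
   assembly, reconstructed from the shift amount.  */
#define MAGIC_300 INT64_C(458129845)

/* Truncating quotient x / 300, mirroring smull + asr #37 + sub ..., asr #31.
   Relies on >> being an arithmetic shift for negative values, which GCC
   guarantees but ISO C leaves implementation-defined.  */
int32_t div300(int32_t x)
{
    int64_t product = (int64_t)x * MAGIC_300;   /* smull */
    int32_t q = (int32_t)(product >> 37);       /* asr: rounds towards -inf */
    return q - (x >> 31);                       /* adds 1 when x is negative */
}

/* Remainder x % 300, mirroring the trailing msub.  */
int32_t mod300(int32_t x)
{
    return x - div300(x) * 300;
}

/* Walk the same values as the loop in f5 () and compare with the
   built-in operator.  */
void check_f5_range(void)
{
    for (int32_t x = 0; x > -1000; x--)
        assert(mod300(x) == x % 300);
}
```

check_f5_range () covers exactly the values the example loop passes to g ().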


[Ada] Small speedup for simple functions returning unconstrained array

2017-05-15 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2017-05-15  Eric Botcazou  

* gcc-interface/trans.c (return_value_ok_for_nrv_p): Only apply the
addressability check in the constrained case.

-- 
Eric Botcazou
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 248052)
+++ gcc-interface/trans.c	(working copy)
@@ -3238,8 +3238,9 @@ Loop_Statement_to_gnu (Node_Id gnat_node
  RETURN_EXPR [ = Ri]
[...]
 
-   and we try to fulfill a simple criterion that would make it possible to
-   replace one or several Ri variables with the RESULT_DECL of the function.
+   where the Ri are not addressable and we try to fulfill a simple criterion
+   that would make it possible to replace one or several Ri variables by the
+   single RESULT_DECL of the function.
 
The first observation is that RETURN_EXPRs that don't directly reference
any of the Ri variables on the RHS of their assignment are transparent wrt
@@ -3271,8 +3272,8 @@ Loop_Statement_to_gnu (Node_Id gnat_node
because the anonymous return object is allocated on the secondary stack
and RESULT_DECL is only a pointer to it.  Each return object can be of a
different size and is allocated separately so we need not care about the
-   aforementioned overlapping issues.  Therefore, we don't collect the other
-   expressions and skip step #2 in the algorithm.  */
+   addressability and the aforementioned overlapping issues.  Therefore, we
+   don't collect the other expressions and skip step #2 in the algorithm.  */
 
 struct nrv_data
 {
@@ -3612,7 +3613,8 @@ return_value_ok_for_nrv_p (tree ret_obj,
   if (TREE_STATIC (ret_val))
 return false;
 
-  if (TREE_ADDRESSABLE (ret_val))
+  /* For the constrained case, test for addressability.  */
+  if (ret_obj && TREE_ADDRESSABLE (ret_val))
 return false;
 
   /* For the constrained case, test for overalignment.  */


Re: [ARM] Enable FP16 vector arithmetic operations.

2017-05-15 Thread Kyrill Tkachov

Hi Tamar,

On 02/05/17 15:46, Tamar Christina wrote:

Hi All,

I'm taking this one over from Matthew; I think it slipped through the cracks
before.

Since it still applies cleanly on trunk I'm just pinging it.

Ok for trunk?



Sorry for missing this.
For the record you are referring to the patch at:
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01700.html

This is ok and in line with what we do for the f32 intrinsics.
My only concern was that we can do this only if 
__ARM_FEATURE_FP16_VECTOR_ARITHMETIC
is defined from the architecture/fpu level, but these intrinsics are already
gated on that in arm_neon.h.

This is ok for trunk if a bootstrap and test run on arm-none-linux-gnueabihf 
with
current trunk shows no issues.

Thanks,
Kyrill


Tamar

From: gcc-patches-ow...@gcc.gnu.org  on behalf of 
Matthew Wahab 
Sent: Friday, September 23, 2016 4:02 PM
To: gcc-patches
Subject: [ARM] Enable FP16 vector arithmetic operations.

Hello,

Support for the ARMv8.2-A FP16 NEON arithmetic instructions was added
using non-standard names for the instruction patterns. This was needed
because the NEON floating point semantics meant that their use by the
compiler for HFmode arithmetic operations needed to be restricted. This
follows the implementation for 32-bit NEON instructions.

As with the 32-bit instructions, the restriction on the HFmode
operation can be lifted when -funsafe-math-optimizations is
enabled. This patch does that, defining the standard pattern names
addhf3, subhf3, mulhf3 and fmahf3.

This patch also updates the NEON intrinsics to use the arithmetic
operations when -ffast-math is enabled. This is to keep the 16-bit
support consistent with the 32-bit support. It is needed so that code
using the f16 intrinsics is subject to the same optimizations as code
using the f32 intrinsics would be.

Tested for arm-none-linux-gnueabihf with native bootstrap and make check
on ARMv8-A and for arm-none-eabi and armeb-none-eabi with cross-compiled
make check on an ARMv8.2-A emulator.

Ok for trunk?
Matthew

gcc/
2016-09-23  Matthew Wahab  

* config/arm/arm_neon.h (vadd_f16): Use standard arithmetic
operations in fast-math mode.
(vaddq_f16): Likewise.
(vmul_f16): Likewise.
(vmulq_f16): Likewise.
(vsub_f16): Likewise.
(vsubq_f16): Likewise.
* config/arm/neon.md (add3): New.
(sub3): New.
(fma:3): New.  Also remove outdated comment.
(mul3): New.

testsuite/
2016-09-23  Matthew Wahab  

* gcc.target/arm/armv8_2-fp16-arith-1.c: Expand comment.  Update
expected output of vadd, vsub and vmul instructions.
* gcc.target/arm/armv8_2-fp16-arith-2.c: New.
* gcc.target/arm/armv8_2-fp16-neon-2.c: New.
* gcc.target/arm/armv8_2-fp16-neon-3.c: New.




[Ada] Fix ICE on instantiation of packed array element

2017-05-15 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2017-05-15  Pierre-Marie de Rodat  

* gcc-interface/utils.c (can_materialize_object_renaming_p):
Synchronize with GNAT's Exp_Dbug.Debug_Renaming_Declaration:
process Original_Node instead of expanded names.


2017-05-15  Pierre-Marie de Rodat  

* gnat.dg/specs/pack13.ads: New test.

-- 
Eric Botcazou
Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 248051)
+++ gcc-interface/utils.c	(working copy)
@@ -5431,11 +5431,16 @@ can_materialize_object_renaming_p (Node_
 {
   while (true)
 {
+  expr = Original_Node (expr);
+
   switch Nkind (expr)
 	{
 	case N_Identifier:
 	case N_Expanded_Name:
-	  return true;
+	  if (!Present (Renamed_Object (Entity (expr))))
+	return true;
+	  expr = Renamed_Object (Entity (expr));
+	  break;
 
 	case N_Selected_Component:
 	  {
-- { dg-do compile }

package Pack13 is

  generic
type Value_Type is private;
Value : in out Value_Type;
  package G is end G;

  type Rec is record
B : Boolean;
  end record;
  for Rec use record
B at 0 range 8 .. 8;
  end record;
  for Rec'size use 9;

  type Arr is array (Boolean) of Rec;
  pragma Pack (Arr);

  A : Arr;

  package My_G is new G (Boolean, A(True).B);

end Pack13;


Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-05-15 Thread Tamar Christina
Hi,

Reg-tested now on arm-none-linux-gnueabihf as well and no regressions.

Ok for trunk?
Tamar

From: Ramana Radhakrishnan 
Sent: Monday, May 15, 2017 9:40:40 AM
To: Tamar Christina
Cc: GCC Patches; nd; Kyrylo Tkachov; Richard Earnshaw; Marcus Shawcroft; James 
Greenhalgh; ni...@redhat.com; Ramana Radhakrishnan
Subject: Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

On Tue, May 2, 2017 at 4:37 PM, Tamar Christina  wrote:
> Hi All,
>
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
>
> On the Cortex-A53 shifts are modeled to cost 1 instruction,
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
>
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
>
> The reason for all of this is that currently the code does not produce what 
> you'd expect,
> which is that division by constants are always expanded. Also it's 
> inconsistent because
> unsigned division does get expanded.
>
> This all reduces the ability to do CSE when using signed modulo since that 
> one is also expanded.
>
> Given:
>
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
>     g(x % 300);
>     x--;
>   }
> }
>
>
> we now generate
>
> smull   x0, w19, w21
> asr     x0, x0, 37
> sub     w0, w0, w19, asr 31
> msub    w0, w0, w20, w19
> sub     w19, w19, #1
> bl      g
>
> as opposed to
>
> sdiv    w0, w19, w20
> msub    w0, w0, w20, w19
> sub     w19, w19, #1
> bl      g
>
>
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

Since this affects the arm port as well, it needs to be regression
tested on arm as well.

Thanks,
Ramana

>
> OK for trunk?
>
> Thanks,
> Tamar
>
>
> gcc/
> 2017-05-02  Tamar Christina  
>
> * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase 
> idiv cost.


[RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Martin Liška
Hello.

Thanks, Martin, for the feedback! After spending quite some time fiddling with
the options, I'm not convinced we should convert them to a more hierarchical
structure. Here is a description:

1) -fopt-info is used to dump optimization options. One can pick both verbosity
(note, optimization, all) and an optimization (ipa, inline, vec,...). Thus said
it's probably not a candidate for hierarchical options?

2) -fdump-pass_name-... as mentioned by Nathan is a combination of verbosity
(graph, note, verbose, details) and specific type of options (VOPS, RHS_ONLY,
UID,..).

Here is the complete list, with suggestions for how we can move it to a more
hierarchical ordering:

#define TDF_ADDRESS
#define TDF_SLIM
#define TDF_RAW
#define TDF_DETAILS
#define TDF_STATS
#define TDF_BLOCKS
#define TDF_VOPS
#define TDF_LINENO
#define TDF_UID
#define TDF_TREE - remove & replace with DI_kind
#define TDF_RTL - remove & replace with DI_kind
#define TDF_IPA - remove & replace with DI_kind
#define TDF_STMTADDR - merge with TDF_ADDRESS
#define TDF_GRAPH
#define TDF_MEMSYMS
#define TDF_DIAGNOSTIC - merge with TDF_DETAILS
#define TDF_VERBOSE - merge with TDF_DETAILS
#define TDF_RHS_ONLY
#define TDF_ASMNAME
#define TDF_EH
#define TDF_NOUID
#define TDF_ALIAS
#define TDF_ENUMERATE_LOCALS
#define TDF_CSELIB
#define TDF_SCEV
#define TDF_COMMENT - remove and dump ';; ' unconditionally
#define TDF_GIMPLE

and a more hierarchical ordering could be:

#define TDF_ADDRESS
#define TDF_SLIM
#define TDF_RAW
#define TDF_DETAILS
#define TDF_STATS
#define TDF_BLOCKS
#define TDF_LINENO
#define TDF_UID
#define TDF_GRAPH
#define TDF_ASMNAME
#define TDF_NOUID
#define TDF_ENUMERATE_LOCALS

#define TDF_GIMPLE
#define TDF_GIMPLE_FE - GIMPLE front-end
#define TDF_GIMPLE_VOPS
#define TDF_GIMPLE_EH
#define TDF_GIMPLE_ALIAS
#define TDF_GIMPLE_SCEV
#define TDF_GIMPLE_MEMSYMS
#define TDF_GIMPLE_RHS_ONLY

#define TDF_RTL
#define TDF_RTL_CSELIB

I already discussed this with Richi, but I would like to receive feedback on
the TDF_ cleanup and on -fopt-info.

Thanks,
Martin


Re: C PATCH to kill c_save_expr or towards delayed folding for the C FE

2017-05-15 Thread Marek Polacek
On Fri, May 12, 2017 at 09:48:28PM +0200, Jakub Jelinek wrote:
> On Fri, May 12, 2017 at 09:37:27PM +0200, Marek Polacek wrote:
> > @@ -565,6 +564,25 @@ c_fully_fold_internal (tree expr, bool in_init, bool 
> > *maybe_const_operands,
> >  appropriate in any particular case.  */
> >gcc_unreachable ();
> >  
> > +case SAVE_EXPR:
> > +  /* Make sure to fold the contents of a SAVE_EXPR exactly once.  */
> > +  if (!SAVE_EXPR_FOLDED_P (expr))
> > +   {
> > + op0 = TREE_OPERAND (expr, 0);
> > + op0 = c_fully_fold_internal (op0, in_init, maybe_const_operands,
> > +  maybe_const_itself, for_int_const);
> > + /* Don't wrap the folded tree in a SAVE_EXPR if we don't
> > +have to.  */
> > + if (tree_invariant_p (op0))
> > +   ret = op0;
> > + else
> > +   {
> > + TREE_OPERAND (expr, 0) = op0;
> > + SAVE_EXPR_FOLDED_P (expr) = true;
> > +   }
> > +   }
> 
> Wouldn't it be better to guard with if (!SAVE_EXPR_FOLDED_P (expr))
> only c_fully_fold_internal recursion on the operand
> and then use if (tree_invariant_p (op0)) unconditionally?
 
I don't see why that would be better.  It would mean calling tree_invariant_p
even on folded SAVE_EXPRs, and I can't see when it could be true in that case.
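
To make the control flow under discussion concrete, here is a toy model in
plain C.  It is only a sketch: struct expr, fold () and fold_calls are
invented for illustration and stand in for GENERIC trees,
c_fully_fold_internal and SAVE_EXPR_FOLDED_P; "is a constant" stands in for
tree_invariant_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum kind { K_CONST, K_VAR, K_PLUS, K_SAVE };

struct expr
{
  enum kind kind;
  int value;                /* K_CONST only */
  struct expr *op0, *op1;   /* K_PLUS uses both, K_SAVE uses op0 */
  bool folded;              /* K_SAVE only: operand already folded */
};

int fold_calls;             /* counts folding work, for demonstration */

struct expr *fold(struct expr *e)
{
  assert(e != NULL);
  switch (e->kind)
    {
    case K_PLUS:
      fold_calls++;
      e->op0 = fold(e->op0);
      e->op1 = fold(e->op1);
      if (e->op0->kind == K_CONST && e->op1->kind == K_CONST)
        {
          /* Both operands are constants: rewrite this node in place.  */
          e->value = e->op0->value + e->op1->value;
          e->kind = K_CONST;
        }
      return e;

    case K_SAVE:
      /* Fold the operand exactly once.  */
      if (!e->folded)
        {
          struct expr *op0 = fold(e->op0);
          if (op0->kind == K_CONST)
            return op0;     /* invariant: the wrapper is not needed */
          e->op0 = op0;
          e->folded = true;
        }
      return e;

    default:
      return e;
    }
}
```

In this model a wrapper whose flag is set always has a non-constant operand,
so repeating the invariance check on it could never succeed.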

> > @@ -113,6 +113,10 @@ along with GCC; see the file COPYING3.  If not see
> > subexpression meaning it is not a constant expression.  */
> >  #define CONSTRUCTOR_NON_CONST(EXPR) TREE_LANG_FLAG_1 (CONSTRUCTOR_CHECK 
> > (EXPR))
> >  
> > +/* For a SAVE_EXPR, nonzero if the contents of the SAVE_EXPR have already
> > +   been folded.  */
> 
> s/contents/operand/;s/have/has/ ?
 
Ok, if you prefer ;).

> Otherwise I'm all for this, but would like to give you and Joseph as C FE
> maintainers the last word on this.

Thanks,

Marek


[PATCH] Do not allow empty argument of -o option (PR driver/31468).

2017-05-15 Thread Martin Liška
Hello.

This is a fix for an old issue that can still be triggered.
The patch bootstraps on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
From 0cbecc941d9b53de8235d5147ee3891d57af5f49 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 May 2017 12:16:45 +0200
Subject: [PATCH] Do not allow empty argument of -o option (PR driver/31468).

gcc/ChangeLog:

2017-05-03  Martin Liska  

	PR driver/31468
	* gcc.c (process_command): Do not allow empty argument of -o option.
---
 gcc/gcc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 826b012cd77..c68eebea92f 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -4472,6 +4472,9 @@ process_command (unsigned int decoded_options_count,
 		   output_file);
 }
 
+  if (output_file != NULL && output_file[0] == '\0')
+fatal_error (input_location, "output file could not be empty");
+
   /* If -save-temps=obj and -o name, create the prefix to use for %b.
  Otherwise just make -save-temps=obj the same as -save-temps=cwd.  */
   if (save_temps_flag == SAVE_TEMPS_OBJ && save_temps_prefix != NULL)
-- 
2.12.2
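
The condition added by the patch, pulled out into a standalone predicate as a
small sketch.  The function names are hypothetical; only the test itself
mirrors the patch.  A null pointer means that -o was not given at all, which
is fine; only an explicit empty string is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when the driver should reject OUTPUT_FILE: -o was given, but its
   argument is the empty string.  */
bool output_file_is_empty(const char *output_file)
{
  return output_file != NULL && output_file[0] == '\0';
}

/* A few representative inputs.  */
void check_output_file_examples(void)
{
  assert(!output_file_is_empty(NULL));     /* no -o on the command line */
  assert(!output_file_is_empty("a.out"));  /* ordinary name */
  assert(output_file_is_empty(""));        /* gcc -o "" ... */
}
```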



[PATCH] Fix VAR_DECL w/o a BIND_EXPR (PR sanitize/80659).

2017-05-15 Thread Martin Liška
Hello.

There are situations where local variables (defined in a switch scope) do
not belong to any BIND_EXPR. Thus, we ICE due to gcc_assert 
(gimplify_ctxp->live_switch_vars->elements () == 0);

Is there a better way to catch these variables?

The suggested patch bootstraps on ppc64le-redhat-linux and survives regression
tests.

Ready to be installed?
Martin
From 3d906f714f9e56d1d8bc4c70464699c0742dc08c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 12 May 2017 14:08:49 +0200
Subject: [PATCH] Fix VAR_DECL w/o a BIND_EXPR (PR sanitize/80659).

gcc/ChangeLog:

2017-05-12  Martin Liska  

	PR sanitize/80659
	* gimplify.c (gimplify_switch_expr): Do not assert as
	we can have a VAR_DECL that does not belong to a BIND_EXPR.

gcc/testsuite/ChangeLog:

2017-05-12  Martin Liska  

	PR sanitize/80659
	* gcc.dg/asan/pr80659.c: New test.
---
 gcc/gimplify.c  |  5 +
 gcc/testsuite/gcc.dg/asan/pr80659.c | 10 ++
 2 files changed, 11 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/asan/pr80659.c

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index a28a9af3b7f..32b1c9dfde2 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -2297,10 +2297,7 @@ gimplify_switch_expr (tree *expr_p, gimple_seq *pre_p)
   gimplify_ctxp->case_labels = saved_labels;
 
   if (gimplify_ctxp->live_switch_vars)
-	{
-	  gcc_assert (gimplify_ctxp->live_switch_vars->elements () == 0);
-	  delete gimplify_ctxp->live_switch_vars;
-	}
+	delete gimplify_ctxp->live_switch_vars;
   gimplify_ctxp->live_switch_vars = saved_live_switch_vars;
 
   preprocess_case_label_vec_for_gimple (labels, index_type,
diff --git a/gcc/testsuite/gcc.dg/asan/pr80659.c b/gcc/testsuite/gcc.dg/asan/pr80659.c
new file mode 100644
index 000..03281e9c221
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/asan/pr80659.c
@@ -0,0 +1,10 @@
+/* PR sanitizer/80659 */
+/* { dg-do compile } */
+
+void foo(int a)
+{
+  switch (a) {
+(int[3]){}; /* { dg-warning "statement will never be executed" } */
+int h;
+  }
+}
-- 
2.12.2



Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.

2017-05-15 Thread Tamar Christina
Resending the message to the list; sorry, I hadn't noticed that it was dropped
somewhere along the line.

Hi All,

I believe Joseph had no more comments for the patch.

Any other comments or OK for trunk?

Regards,
Tamar

From: Tamar Christina
Sent: Tuesday, May 2, 2017 10:09 AM
To: Bernhard Reutner-Fischer; Joseph Myers
Cc: Jeff Law; Wilco Dijkstra; rguent...@suse.de; Michael Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

Ping.

Hi All,

Since this didn't manage to get in for GCC 7, I'm looking for approval to 
commit to trunk.

Cheers,
Tamar

From: Tamar Christina
Sent: Thursday, February 9, 2017 1:56:30 PM
To: Bernhard Reutner-Fischer; Joseph Myers
Cc: Jeff Law; Wilco Dijkstra; rguent...@suse.de; Michael Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

Ping. If not for now can I get approval for when stage1 opens again?

Cheers,
Tamar

From: Tamar Christina
Sent: Monday, January 30, 2017 10:01:33 AM
To: Bernhard Reutner-Fischer; Joseph Myers
Cc: Jeff Law; Wilco Dijkstra; rguent...@suse.de; Michael Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

Ping?

You initially approved the patch, Jeff, but there have been some minor changes
since then.

Regards,
Tamar


From: Tamar Christina
Sent: Wednesday, January 25, 2017 9:38:57 AM
To: Bernhard Reutner-Fischer; Joseph Myers
Cc: Jeff Law; Wilco Dijkstra; rguent...@suse.de; Michael Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

Thanks, updated.

Aside from the changelog, Ok for trunk?

Thanks,
Tamar


From: Bernhard Reutner-Fischer 
Sent: Tuesday, January 24, 2017 3:43:54 PM
To: Tamar Christina; Joseph Myers
Cc: Jeff Law; Wilco Dijkstra; rguent...@suse.de; Michael Meissner; nd
Subject: Re: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers 
in GIMPLE.

On 23 January 2017 16:32:20 CET, Tamar Christina  
wrote:

>gcc/
>2017-01-23  Tamar Christina  
>
>   PR middle-end/77925
>   PR middle-end/77926
>   PR middle-end/66462
>
>   * gcc/builtins.c (fold_builtin_fpclassify): Removed.
>   (fold_builtin_interclass_mathfn): Removed.
>   (expand_builtin): Added builtins to lowering list.
>   (fold_builtin_n): Removed fold_builtin_varargs.

Present tense in ChangeLog please.



[RFC] propagate malloc attribute in ipa-pure-const pass

2017-05-15 Thread Prathamesh Kulkarni
Hi,
I have attached a bare-bones prototype patch that propagates the malloc
attribute in ipa-pure-const. As far as I understand from the documentation,
a function can be annotated with the malloc attribute if it returns a pointer
to a newly allocated memory block (uninitialized or zeroed) and the pointer
does not alias any other valid pointer?

* Analysis
The analysis used by the patch in malloc_candidate_p is too conservative,
and I would be grateful for suggestions on improving it.
Currently it only checks if:
1) The function returns a value of pointer type.
2) SSA_NAME_DEF_STMT (return_value) is either a function call or a phi, and
SSA_NAME_DEF_STMT(each element of phi) is function call.
3) The return-value has immediate uses only within comparisons (gcond
or gassign) and return_stmt (and likewise a phi arg has immediate use only
within comparison or the phi stmt).

The intent is that malloc_candidate_p(node) returns true if node
returns the return value of its callee, and the return value is only
used for comparisons within node.
Then I assume it's safe (although conservative) to decide that node
could possibly be a malloc function if the callee is found to be malloc?

For example:
void *f(size_t n)
{
  void *p = __builtin_malloc (n);
  if (p == 0)
    __builtin_abort ();
  return p;
}

malloc_candidate_p would determine f to be a candidate for the malloc
attribute since it returns the return value of its callee
(__builtin_malloc) and the return value is used only within a comparison
and the return_stmt.
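
For comparison, this is what the annotation looks like when written by hand
today, which is what the pass would effectively infer for a function like f.
The wrapper name checked_alloc is made up for illustration;
__attribute__ ((malloc)) itself is an existing GCC extension.

```c
#include <assert.h>
#include <stdlib.h>

/* Hand-written annotation.  The attribute tells the compiler that the
   returned pointer does not alias any other valid pointer.  */
void *checked_alloc(size_t n) __attribute__((malloc));

/* Same shape as f above: return the callee's result, using it only in a
   comparison before the return.  */
void *checked_alloc(size_t n)
{
  void *p = malloc(n);
  if (p == NULL)
    abort();
  return p;
}

/* Two live results are distinct blocks, which is what the attribute
   promises to callers.  */
void check_distinct_blocks(void)
{
  char *a = checked_alloc(8);
  char *b = checked_alloc(8);
  assert(a != b);
  free(a);
  free(b);
}
```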

However due to the imprecise analysis it misses this case:
char  **g(size_t n)
{
  char **p = malloc (sizeof(char *) * 10);
  for (int i = 0; i < 10; i++)
    p[i] = malloc (sizeof(char) * n);
  return p;
}
I suppose g() could be annotated with malloc here ?

The patch uses return_callees_map which is a hash_map such
that the return value of node is return value of one of these callees.
For above eg it would be 
The analysis phase populates return_callees_map, and the propagation
phase uses it to take the "meet" of callees.

* LTO and memory management
This is a general question about LTO and memory management.
IIUC the following sequence takes place during normal LTO:
LGEN: generate_summary, write_summary
WPA: read_summary, execute ipa passes, write_opt_summary

So I assumed it was OK in LGEN to allocate return_callees_map in
generate_summary and free it in write_summary and during WPA, allocate
return_callees_map in read_summary and free it after execute (since
write_opt_summary does not require return_callees_map).

However, with fat LTO it seems the sequence changes for LGEN, with the
execute phase taking place after write_summary. Since
return_callees_map is freed in pure_const_write_summary and
propagate_malloc() accesses it in the execute stage, this results in a
segmentation fault.

To work around this, I am using the following hack in pure_const_write_summary:
// FIXME: Do not free if -ffat-lto-objects is enabled.
if (!global_options.x_flag_fat_lto_objects)
  free_return_callees_map ();
Is there a better approach for handling this ?

Also I am not sure if I have written cgraph_node::set_malloc_flag[_1] correctly.
I tried to imitate cgraph_node::set_const_flag.

The patch passes bootstrap+test on x86_64 and found a few functions in
the source tree (attached func_names.txt) that could be annotated with
malloc (I had a brief look at some of the functions and they didn't appear
to be false positives, but I will recheck thoroughly).

Does the patch look like it is heading in the right direction?
I would be grateful for suggestions on improving it.

Thanks,
Prathamesh
virtual char* libcp1::compiler::find(std::__cxx11::string&) const
gomp_malloc
virtual char* libcc1::compiler::find(std::__cxx11::string&) const
void* GTM::xrealloc(void*, size_t, bool)
concat
char* gen_internal_sym(const char*)
void* ira_allocate(size_t)
char* gen_fake_label()
char* selftest::locate_file(const char*)
const char* find_plugindir_spec_function(int, const char**)
reconcat
xvasprintf
char* rtx_reader::read_until(const char*, bool)
_Tp* 
__gnu_cxx::__detail::__mini_vector<_Tp>::allocate(__gnu_cxx::__detail::__mini_vector<_Tp>::size_type)
 [with _Tp = long unsigned int*]
xstrndup
const char* replace_extension_spec_func(int, const char**)
void* GTM::xcalloc(size_t, bool)
_Tp* 
__gnu_cxx::__detail::__mini_vector<_Tp>::allocate(__gnu_cxx::__detail::__mini_vector<_Tp>::size_type)
 [with _Tp = unsigned int*]
xstrdup
void* GTM::xmalloc(size_t, bool)
void* base_pool_allocator::allocate() [with TBlockAllocator = 
memory_block_pool]
char* ix86_offload_options()
basic_block_def* alloc_block()
xmemdup
char* build_message_string(const char*, ...)
make_relative_prefix
gomp_malloc_cleared
make_relative_prefix_ignore_links
choose_temp_base
make_temp_file
xasprintf
char* file_name_as_prefix(diagnostic_context*, const char*)
void* yyalloc(yy_size_t)
void* ggc_internal_cleared_alloc(size_t, void (*)(void*), size_t, size_t)
void* __cilkrts_hyperobject_alloc(void*, std::size_t)
void* alloc_for_identif

Re: [C++ PATCH, RFC] Implement new C++ intrinsics __is_assignable and __is_constructible.

2017-05-15 Thread Jonathan Wakely

On 12/05/17 21:33 +0300, Ville Voutilainen wrote:

   libstdc++-v3/

   Implement new C++ intrinsics __is_assignable and __is_constructible.
   * include/std/type_traits (__do_is_static_castable_impl): Remove.
   (__is_static_castable_impl, __is_static_castable_safe): Likewise.
   (__is_static_castable, __do_is_direct_constructible_impl): Likewise.
   (__is_direct_constructible_impl): Likewise.
   (__is_direct_constructible_new_safe): Likewise.
   (__is_base_to_derived_ref, __is_lvalue_to_rvalue_ref): Likewise.
   (__is_direct_constructible_ref_cast): Likewise.
   (__is_direct_constructible_new, __is_direct_constructible): Likewise.
   (__do_is_nary_constructible_impl): Likewise.
   (__is_nary_constructible_impl, __is_nary_constructible): Likewise.
   (__is_constructible_impl): Likewise.
   (is_constructible): Call the intrinsic.
   (__is_assignable_helper): Remove.
   (is_assignable): Call the intrinsic.
   (is_trivially_constructible): Likewise.
   (is_trivially_assignable): Likewise.
   (testsuite/20_util/declval/requirements/1_neg.cc): Adjust.
   (testsuite/20_util/make_signed/requirements/typedefs_neg.cc): Likewise.
   (testsuite/20_util/make_unsigned/requirements/typedefs_neg.cc):
   Likewise.


The libstdc++ parts make me happy. I have fairly high confidence in
our tests for is_constructible and is_assignable (thanks, Daniel!) so
would be happy to approve the library parts.

I'll try the patch against the libc++ testsuite too.




[Patch, fortran] PR80554 [f08] variable redefinition in submodule

2017-05-15 Thread Paul Richard Thomas
The attached bootstraps and regtests on FC23/x86_64 - OK for trunk and
later for 7-branch?

The comment in the patch and the ChangeLog are sufficiently clear that
no further explanation is needed here.

Cheers

Paul

2017-05-15  Paul Thomas  

PR fortran/80554
* decl.c (build_sym): In a submodule allow overriding of host
associated symbols from the ancestor module with a new
declaration.

2017-05-15  Paul Thomas  

PR fortran/80554
* gfortran.dg/submodule_29.f08: New test.
Index: gcc/fortran/decl.c
===
*** gcc/fortran/decl.c  (revision 246951)
--- gcc/fortran/decl.c  (working copy)
*** build_sym (const char *name, gfc_charlen
*** 1383,1390 
symbol_attribute attr;
gfc_symbol *sym;
int upper;
  
!   if (gfc_get_symbol (name, NULL, &sym))
  return false;
  
/* Check if the name has already been defined as a type.  The
--- 1383,1410 
symbol_attribute attr;
gfc_symbol *sym;
int upper;
+   gfc_symtree *st;
  
!   /* Symbols in a submodule are host associated from the parent module or
!  submodules. Therefore, they can be overridden by declarations in the
!  submodule scope. Deal with this by attaching the existing symbol to
!  a new symtree and recycling the old symtree with a new symbol...  */
!   st = gfc_find_symtree (gfc_current_ns->sym_root, name);
!   if (st != NULL && gfc_state_stack->state == COMP_SUBMODULE
!   && st->n.sym != NULL
!   && st->n.sym->attr.host_assoc && st->n.sym->attr.used_in_submodule)
! {
!   gfc_symtree *s = gfc_get_unique_symtree (gfc_current_ns);
!   s->n.sym = st->n.sym;
!   sym = gfc_new_symbol (name, gfc_current_ns);
! 
! 
!   st->n.sym = sym;
!   sym->refs++;
!   gfc_set_sym_referenced (sym);
! }
!   /* ...Otherwise generate a new symtree and new symbol.  */
!   else if (gfc_get_symbol (name, NULL, &sym))
  return false;
  
/* Check if the name has already been defined as a type.  The
Index: gcc/testsuite/gfortran.dg/submodule_29.f08
===
*** gcc/testsuite/gfortran.dg/submodule_29.f08  (nonexistent)
--- gcc/testsuite/gfortran.dg/submodule_29.f08  (working copy)
***
*** 0 
--- 1,56 
+ ! { dg-do run }
+ !
+ ! Test the fix for PR80554 in which it was not recognised that the symbol 'i'
+ ! is host associated in the submodule 's' so that the new declaration in the
+ ! submodule was rejected.
+ !
+ ! Contributed by Tamas Bela Feher  
+ !
+ module M
+   implicit none
+   integer :: i = 0
+   character (100) :: buffer
+   interface
+ module subroutine write_i()
+ end subroutine
+   end interface
+   interface
+ module subroutine write_i_2()
+ end subroutine
+   end interface
+ contains
+   subroutine foo
+ integer :: i
+   end
+ end module
+ 
+ submodule (M) S
+ integer :: i = 137
+   contains
+ module subroutine write_i()
+write (buffer,*) i
+ end subroutine
+ end submodule
+ 
+ submodule (M:S) S2
+ integer :: i = 1037
+   contains
+ module subroutine write_i_2()
+write (buffer,*) i
+ end subroutine
+ end submodule
+ 
+ program test_submod_variable
+   use M
+   implicit none
+   integer :: j
+   i = 42
+   call write_i
+   read (buffer, *) j
+   if (i .ne. 42) call abort
+   if (j .ne. 137) call abort
+   call write_i_2
+   read (buffer, *) j
+   if (i .ne. 42) call abort
+   if (j .ne. 1037) call abort
+ end program


Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Nathan Sidwell

On 05/15/2017 05:39 AM, Martin Liška wrote:

Thanks Martin for feedback! After I spent quite some time with fiddling with
the options, I'm not convinced we should convert options to more hierarchical


I'll respond to Martin's email properly separately, but while we're on 
dump redesign, here's a WIP patch I whipped up on Friday for the modules 
branch.  This tries to move the language-specific options to a dynamic 
registration mechanism, rather than hard wire them into dumpfile.[hc]. 
There were some awkward pieces due to the current structure of dumpfile 
registration.


I took the -fdump-class-hierarchy and -fdump-translation-unit and turned 
them into -fdump-lang-class and -fdump-lang-translation.  Unfortunately 
the current LANG_HOOKS_INIT_OPTIONS is run rather too late to register 
these dumps so that they get nice low numbers.  That's because 
gcc::context::context () both creates the dump manager and then 
immediately creates all the optimization passes:

  m_dumps = new gcc::dump_manager ();
  /* Allow languages to register dumps before passes.  */
  lang_hooks.register_dumps (m_dumps);
  m_passes = new gcc::pass_manager (this);

As you can see I wedged a new lang hook between.  That's a little 
unpleasant -- perhaps

  m_passes = new gcc::pass_manager (this);
should be done later? Or the lang dumps could be unnumbered -- it is 
jarring for them to be numbered as-if succeeding the optimization passes.


(I passed m_dumps as a void * purely to avoid header file jiggery pokery 
at this step)


This patch does allow removing special class_dump_file handling from 
c-family/c-opts.c, which is nice.  It looks like -fdump-tree-original
might be another candidate for dynamic registration?
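
To illustrate the ordering concern described above, here is a minimal
standalone sketch.  It is not GCC code: a toy registry hands out dump
numbers in registration order, so a language hook that runs before the
pass setup gets the low numbers.  Every name in it (toy_dump_manager,
toy_lang_register_dumps, and so on) is hypothetical and does not
correspond to the real dumpfile.[hc] interface.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: the real gcc::dump_manager works differently.  */
#define TOY_MAX_DUMPS 16

struct toy_dump_manager
{
  const char *names[TOY_MAX_DUMPS];
  int count;
};

/* Register NAME and return its dump number, or -1 if the table is full.
   Numbers are handed out strictly in registration order.  */
int
toy_dump_register (struct toy_dump_manager *m, const char *name)
{
  if (m->count >= TOY_MAX_DUMPS)
    return -1;
  m->names[m->count] = name;
  return m->count++;
}

/* Hypothetical language hook: registers the front-end dumps.  */
void
toy_lang_register_dumps (struct toy_dump_manager *m)
{
  toy_dump_register (m, "lang-class");
  toy_dump_register (m, "lang-translation");
}

/* Hypothetical stand-in for constructing the pass manager.  */
void
toy_register_passes (struct toy_dump_manager *m)
{
  toy_dump_register (m, "tree-original");
  toy_dump_register (m, "tree-gimple");
}

/* Return the number assigned to NAME, or -1 if it is not registered.  */
int
toy_dump_lookup (const struct toy_dump_manager *m, const char *name)
{
  for (int i = 0; i < m->count; i++)
    if (strcmp (m->names[i], name) == 0)
      return i;
  return -1;
}

/* Mirror the call order used in the patched context constructor.  */
void
toy_context_init (struct toy_dump_manager *m)
{
  m->count = 0;
  toy_lang_register_dumps (m);	/* Language dumps first ...  */
  toy_register_passes (m);	/* ... then the passes.  */
}
```

Swapping the two calls in toy_context_init would number the language
dumps after the pass dumps, which is the effect described above.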


thoughts?

nathan

--
Nathan Sidwell
Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c	(revision 247990)
+++ gcc/c-family/c-opts.c	(working copy)
@@ -102,8 +102,6 @@ static size_t include_cursor;
 /* Dump files/flags to use during parsing.  */
 static FILE *original_dump_file = NULL;
 static int original_dump_flags;
-static FILE *class_dump_file = NULL;
-static int class_dump_flags;
 
 /* Whether any standard preincluded header has been preincluded.  */
 static bool done_preinclude;
@@ -1098,10 +1096,9 @@ c_common_parse_file (void)
   for (;;)
 {
   c_finish_options ();
-  /* Open the dump files to use for the original and class dump output
+  /* Open the dump files to use for the original output
  here, to be used during parsing for the current file.  */
   original_dump_file = dump_begin (TDI_original, &original_dump_flags);
-  class_dump_file = dump_begin (TDI_class, &class_dump_flags);
   pch_init ();
   push_file_scope ();
   c_parse_file ();
@@ -1120,11 +1117,6 @@ c_common_parse_file (void)
   dump_end (TDI_original, original_dump_file);
   original_dump_file = NULL;
 }
-  if (class_dump_file)
-{
-  dump_end (TDI_class, class_dump_file);
-  class_dump_file = NULL;
-}
   /* If an input file is missing, abandon further compilation.
 	 cpplib has issued a diagnostic.  */
   if (!this_input_filename)
@@ -1138,17 +1130,10 @@ c_common_parse_file (void)
 FILE *
 get_dump_info (int phase, int *flags)
 {
-  gcc_assert (phase == TDI_original || phase == TDI_class);
-  if (phase == TDI_original)
-{
-  *flags = original_dump_flags;
-  return original_dump_file;
-}
-  else
-{
-  *flags = class_dump_flags;
-  return class_dump_file;
-}
+  gcc_assert (phase == TDI_original);
+  
+  *flags = original_dump_flags;
+  return original_dump_file;
 }
 
 /* Common finish hook for the C, ObjC and C++ front ends.  */
Index: gcc/context.c
===
--- gcc/context.c	(revision 247990)
+++ gcc/context.c	(working copy)
@@ -24,6 +24,8 @@ along with GCC; see the file COPYING3.
 #include "pass_manager.h"
 #include "dumpfile.h"
 #include "realmpfr.h"
+#include "tree.h"
+#include "langhooks.h"
 
 /* The singleton holder of global state: */
 gcc::context *g;
@@ -36,6 +38,8 @@ gcc::context::context ()
  dumps for the various passes), so the dump manager must be set up
  before the pass manager.  */
   m_dumps = new gcc::dump_manager ();
+  /* Allow languages to register dumps before passes.  */
+  lang_hooks.register_dumps (m_dumps);
   m_passes = new gcc::pass_manager (this);
 }
 
Index: gcc/cp/class.c
===
--- gcc/cp/class.c	(revision 247990)
+++ gcc/cp/class.c	(working copy)
@@ -36,6 +36,10 @@ along with GCC; see the file COPYING3.
 #include "dumpfile.h"
 #include "gimplify.h"
 #include "intl.h"
+#include "context.h"
+
+/* ID for dumping the class hierarchy.  */
+static int class_dump_id;
 
 /* The number of nested classes being processed.  If we are not in the
scope of any class, this is zero.  */
@@

Re: [PATCH][AARCH64]Simplify call, call_value, sibcall, sibcall_value patterns.

2017-05-15 Thread Renlin Li

Hi Richard,

Thanks! committed with all the comments resolved.

Regards,
Renlin

On 02/05/17 13:53, Richard Earnshaw (lists) wrote:

On 01/12/16 15:39, Renlin Li wrote:

Hi all,

This patch refactors the code used in call, call_value, sibcall,
sibcall_value expanders.

Before the change, the logic is as follows:

call expander          --> call_internal          --> call_reg/call_symbol
call_value expander    --> call_value_internal    --> call_value_reg/call_value_symbol

sibcall expander       --> sibcall_internal       --> sibcall_insn
sibcall_value expander --> sibcall_value_internal --> sibcall_value_insn

After the change, the logic is simplified into:

call expander          --> aarch64_expand_call() --> call_insn
call_value expander    --> aarch64_expand_call() --> call_value_insn

sibcall expander       --> aarch64_expand_call() --> sibcall_insn
sibcall_value expander --> aarch64_expand_call() --> sibcall_value_insn

The code is factored out from each expander into aarch64_expand_call ().

This also fixes the two issues Richard Henderson raised in comment 8:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64971

aarch64-none-elf regression test Okay, aarch64-linux bootstrap Okay.
Okay for trunk?

Regards,
Renlin Li


gcc/ChangeLog:

2016-12-01  Renlin Li  

 * config/aarch64/aarch64-protos.h (aarch64_expand_call): Declare.
 * config/aarch64/aarch64.c (aarch64_expand_call): Define.
 * config/aarch64/constraints.md (Usf): Add long call check.
 * config/aarch64/aarch64.md (call): Use aarch64_expand_call.
 (call_value): Likewise.
 (sibcall): Likewise.
 (sibcall_value): Likewise.
 (call_insn): New.
 (call_value_insn): New.
 (sibcall_insn): Update rtx pattern.
 (sibcall_value_insn): Likewise.
 (call_internal): Remove.
 (call_value_internal): Likewise.
 (sibcall_internal): Likewise.
 (sibcall_value_internal): Likewise.
 (call_reg): Likewise.
 (call_symbol): Likewise.
 (call_value_reg): Likewise.
 (call_value_symbol): Likewise.


new.diff


diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 7f67f14..3a5babb 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -305,6 +305,7 @@ bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
  bool aarch64_constant_address_p (rtx);
  bool aarch64_emit_approx_div (rtx, rtx, rtx);
  bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
+void aarch64_expand_call (rtx, rtx, bool);
  bool aarch64_expand_movmem (rtx *);
  bool aarch64_float_const_zero_rtx_p (rtx);
  bool aarch64_function_arg_regno_p (unsigned);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 68a3380..c313cf5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4343,6 +4343,51 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
return true;
  }

+/* This function is used by the call expanders of the machine description.
+   RESULT is the register in which the result is returned.  It's NULL for
+   "call" and "sibcall".
+   MEM is the location of the function call.
+   SIBCALL indicates whether this function call is a normal call or a sibling
+   call.  It will generate a different pattern accordingly.  */
+
+void
+aarch64_expand_call (rtx result, rtx mem, bool sibcall)
+{
+  rtx call, callee, tmp;
+  rtvec vec;
+  machine_mode mode;
+
+  gcc_assert (MEM_P (mem));
+  callee = XEXP (mem, 0);
+  mode = GET_MODE (callee);
+  gcc_assert (mode == Pmode);
+
+  /* Decide if we should generate indirect calls by loading the
+ 64-bit address of the callee into a register before performing


Drop '64-bit'.  This code should also work for ILP32, where the
addresses are 32-bit.


+ the branch-and-link.  */
+
+  if (GET_CODE (callee) == SYMBOL_REF


Use SYMBOL_REF_P.

OK with those changes.

R.



+  ? (aarch64_is_long_call_p (callee)
+|| aarch64_is_noplt_call_p (callee))
+  : !REG_P (callee))
+  XEXP (mem, 0) = force_reg (mode, callee);
+
+  call = gen_rtx_CALL (VOIDmode, mem, const0_rtx);
+
+  if (result != NULL_RTX)
+call = gen_rtx_SET (result, call);
+
+  if (sibcall)
+tmp = ret_rtx;
+  else
+tmp = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, LR_REGNUM));
+
+  vec = gen_rtvec (2, call, tmp);
+  call = gen_rtx_PARALLEL (VOIDmode, vec);
+
+  aarch64_emit_call_insn (call);
+}
+
  /* Emit call insn with PAT and do aarch64-specific handling.  */

  void
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index bc6d8a2..5682686 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -718,12 +718,6 @@
  ;; Subroutine calls and sibcalls
  ;; ---

-(define_expand "call_internal"
-  [(parallel [(call (match_operand 0 "memory_operand" "")
-   (match_operand 1 "general_operand" ""))
- (use (match_operand 2 "" ""))
-

Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Nathan Sidwell

Martin,
thanks for the write up.

On 05/15/2017 05:39 AM, Martin Liška wrote:
Thanks Martin for feedback! After I spent quite some time with fiddling with
the options, I'm not convinced we should convert options to more hierarchical
structure. There's description:

1) -fopt-info is used to dump optimization options. One can pick both verbosity
(note, optimization, all) and an optimization (ipa, inline, vec,...). Thus said
it's probably not a candidate for hierarchical options?


I've never used this option, so have no comment (too lazy to figure out 
if it told me anything I didn't already see in a dump I was groveling over).



2) -fdump-pass_name-... as mentioned by Nathan is combination of verbosity
(graph, note, verbose, details) and specific type of options (VOPS, RHS_ONLY, 
UID,..).

There's a complete list and suggestion how we can move it to more hierarchical 
ordering:

#define TDF_ADDRESS
#define TDF_SLIM
#define TDF_RAW
#define TDF_DETAILS
#define TDF_STATS
#define TDF_BLOCKS
#define TDF_VOPS
#define TDF_LINENO
#define TDF_UID

#define TDF_LANG is now a thing too. it should be DI_kind too

#define TDF_TREE - remove & replace with DI_kind
#define TDF_RTL - remove & replace with DI_kind
#define TDF_IPA - remove & replace with DI_kind
#define TDF_STMTADDR - merge with TDF_ADDRESS
#define TDF_GRAPH
#define TDF_MEMSYMS
#define TDF_DIAGNOSTIC - merge with TDF_DETAILS
#define TDF_VERBOSE - merge with TDF_DETAILS
#define TDF_RHS_ONLY
#define TDF_ASMNAME
#define TDF_EH
#define TDF_NOUID
#define TDF_ALIAS
#define TDF_ENUMERATE_LOCALS
#define TDF_CSELIB
#define TDF_SCEV
#define TDF_COMMENT - remove and dump ';; ' unconditionally
#define TDF_GIMPLE


Looks a good start.

1) The TDF_UID and TDF_NOUID options seem to be inverses of each other. 
Can't we just ditch the latter?


2) We might want to distinguish between enabling dump information that 
is useful to us gcc developers (TDF_DETAILS, say), and that that would 
be useful to end users trying to figure out why some random loop isn't 
being optimized in (say TDF_DIAGNOSTIC).  But if we can't define a 
sensible way of distinguishing then I'm all for not making the distinction.



and more hierarchical ordering can be:

#define TDF_ADDRESS
#define TDF_SLIM
#define TDF_RAW
#define TDF_DETAILS
#define TDF_STATS
#define TDF_BLOCKS
#define TDF_LINENO
#define TDF_UID
#define TDF_GRAPH
#define TDF_ASMNAME
#define TDF_NOUID
#define TDF_ENUMERATE_LOCALS


It'd be nice to name TDF_ENUMERATE_LOCALS without the second _ to avoid 
confusion with the hierarchy you discuss below?  (perhaps TDF_LOCALS?)


I like the idea of naming flags specific to a particular kind of dump 
with the name of that kind of dump.  We do have a mismatch between 
DI_TREE and TDF_GIMPLE though -- is there something sensible we could do 
there?



#define TDF_GIMPLE
#define TDF_GIMPLE_FE - GIMPLE front-end


How might this differ from a new  -fdump-lang-original?  I.e.
(1) why is it a dump-modifier flag, rather than a dump in its own right
(2) if we do need it, name it TDF_GIMPLE_LANG


#define TDF_GIMPLE_VOPS
#define TDF_GIMPLE_EH
#define TDF_GIMPLE_ALIAS
#define TDF_GIMPLE_SCEV
#define TDF_GIMPLE_MEMSYMS
#define TDF_GIMPLE_RHS_ONLY

#define TDF_RTL


How does this differ from the current TDF_RTL meaning?  Is it implying 
'TDF_RTL_ALL'? (same question about TDF_GIMPLE).



#define TDF_RTL_CSELIB


nathan
--
Nathan Sidwell


Re: [PATCH] handling address mode changes inside extract_bit_field

2017-05-15 Thread Joseph Myers
The extra argument to extract_bit_field breaks builds for tilegx-linux-gnu 
and tilepro-linux-gnu (as shown by my glibc bot); there are calls in those 
back ends which haven't been updated.

-- 
Joseph S. Myers
jos...@codesourcery.com


Update _OPENACC value and documentation for OpenACC 2.5

2017-05-15 Thread Thomas Schwinge
Hi!

Given that on gomp-4_0-branch we're now implementing most of what has
been added/changed in OpenACC 2.5, it seemed to make sense to apply the
following to gomp-4_0-branch in r248057:

commit 72ae911aeab1e0664db5737f339fcd81cceff287
Author: tschwinge 
Date:   Mon May 15 11:45:33 2017 +

Update _OPENACC value and documentation for OpenACC 2.5

gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510".
gcc/fortran/
* cpp.c (cpp_define_builtins): Update "_OPENACC" to "201510".
* gfortran.texi: Update for OpenACC 2.5.
* Intrinsic.texi: Likewise.
* invoke.texi: Likewise.
gcc/testsuite/
* c-c++-common/cpp/openacc-define-3.c: Update.
* gfortran.dg/openacc-define-3.f90: Likewise.
gcc/
* doc/invoke.texi: Update for OpenACC 2.5.
libgomp/
* libgomp.texi: Update for OpenACC 2.5.
* openacc.f90 (openacc_version): Update to "201510".
* openacc_lib.h (openacc_version): Likewise.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Update.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Update.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248057 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp  |  4 
 gcc/c-family/ChangeLog.gomp |  4 
 gcc/c-family/c-cppbuiltin.c |  2 +-
 gcc/doc/invoke.texi |  4 +++-
 gcc/fortran/ChangeLog.gomp  |  7 +++
 gcc/fortran/cpp.c   |  2 +-
 gcc/fortran/gfortran.texi   | 16 +---
 gcc/fortran/intrinsic.texi  |  6 +++---
 gcc/fortran/invoke.texi |  4 +---
 gcc/testsuite/ChangeLog.gomp|  5 +
 gcc/testsuite/c-c++-common/cpp/openacc-define-3.c   |  2 +-
 gcc/testsuite/gfortran.dg/openacc-define-3.f90  |  2 +-
 libgomp/ChangeLog.gomp  |  6 ++
 libgomp/libgomp.texi| 21 +++--
 libgomp/openacc.f90 |  2 +-
 libgomp/openacc_lib.h   |  2 +-
 .../libgomp.oacc-fortran/openacc_version-1.f|  2 +-
 .../libgomp.oacc-fortran/openacc_version-2.f90  |  2 +-
 18 files changed, 57 insertions(+), 36 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index a4720c3..d7b50a1 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2017-05-15  Thomas Schwinge  
+
+   * doc/invoke.texi: Update for OpenACC 2.5.
+
 2017-05-14  Thomas Schwinge  
 
* omp-low.c (execute_oacc_device_lower): Remove the parallelism
diff --git gcc/c-family/ChangeLog.gomp gcc/c-family/ChangeLog.gomp
index f975aef..31b273ae 100644
--- gcc/c-family/ChangeLog.gomp
+++ gcc/c-family/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2017-05-15  Thomas Schwinge  
+
+   * c-cppbuiltin.c (c_cpp_builtins): Update "_OPENACC" to "201510".
+
 2017-05-04  Cesar Philippidis  
 
* c-pragma.h (enum pragma_omp_clause): Add
diff --git gcc/c-family/c-cppbuiltin.c gcc/c-family/c-cppbuiltin.c
index 3d4587e..40b14ff 100644
--- gcc/c-family/c-cppbuiltin.c
+++ gcc/c-family/c-cppbuiltin.c
@@ -1228,7 +1228,7 @@ c_cpp_builtins (cpp_reader *pfile)
 cpp_define (pfile, "__SSP__=1");
 
   if (flag_openacc)
-cpp_define (pfile, "_OPENACC=201306");
+cpp_define (pfile, "_OPENACC=201510");
 
   if (flag_openmp)
 cpp_define (pfile, "_OPENMP=201511");
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 299dab1..f70dab2 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -1961,10 +1961,12 @@ freestanding and hosted environments.
 Enable handling of OpenACC directives @code{#pragma acc} in C/C++ and
 @code{!$acc} in Fortran.  When @option{-fopenacc} is specified, the
 compiler generates accelerated code according to the OpenACC Application
-Programming Interface v2.0 @w{@uref{http://www.openacc.org/}}.  This option
+Programming Interface v2.5 @w{@uref{http://www.openacc.org/}}.  This option
 implies @option{-pthread}, and thus is only supported on targets that
 have support for @option{-pthread}.
 
+See @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
+
 @item -fopenacc-dim=@var{geom}
 @opindex fopenacc-dim
 @cindex OpenACC accelerator programming
diff --git gcc/fortran/ChangeLog.gomp gcc/fortran/ChangeLog.gomp
index 8a6ae6a..0f71797 100644
--- gcc/fortran/ChangeLog.gomp
+++ gcc/fortran/ChangeLog.gomp
@@ -1,3 +1,10 @@
+2017-05-15  Thomas Schwinge  
+
+   * cpp.c (cpp_define_builtins): Update "_OPENACC" to "201510".
+   * gfortran.texi: Update for OpenACC 2.5.
+   * Intrinsic.texi: Likewise.
+   * invoke.texi: Likewise.
+
 2017-05-14  Thomas Schwinge  
 
*
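
As a usage note for the macro change in this patch, the following is a
small sketch of how user code might interpret the _OPENACC value.  It
relies only on the two values that appear in the patch (201306 for the
v2.0 text being replaced, 201510 for v2.5); the function names are
hypothetical and are not part of libgomp or of any OpenACC header.

```c
#include <stddef.h>

/* Map an _OPENACC macro value to the specification version it denotes.
   Only the two values that appear in this patch are handled; anything
   else yields NULL.  */
const char *
openacc_spec_from_macro (long value)
{
  switch (value)
    {
    case 201306L:
      return "2.0";
    case 201510L:
      return "2.5";
    default:
      return NULL;
    }
}

/* Return the specification version this translation unit was compiled
   for, or NULL when OpenACC is not enabled (no -fopenacc).  */
const char *
openacc_spec_of_this_build (void)
{
#ifdef _OPENACC
  return openacc_spec_from_macro (_OPENACC);
#else
  return NULL;
#endif
}
```

Code that needs 2.5 features could equally test
"#if defined (_OPENACC) && _OPENACC >= 201510" at preprocessing time.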

Documentation changes for OpenACC 2.5 Profiling Interface (was: More OpenACC 2.5 Profiling Interface)

2017-05-15 Thread Thomas Schwinge
Hi!

On Mon, 15 May 2017 08:52:39 +0200, I wrote:
> On Tue, 28 Feb 2017 18:43:36 +0100, I wrote:
> > The 2.5 versions of the OpenACC standard added a new chapter "Profiling
> > Interface".  In r245784, I committed incomplete support to
> > gomp-4_0-branch.  I plan to continue working on this, but wanted to
> > synchronize at this point.
> > 
> > commit b22a85fe7f3daeb48460e7aa28606d0cdb799f69
> > Author: tschwinge 
> > Date:   Tue Feb 28 17:36:03 2017 +
> > 
> > OpenACC 2.5 Profiling Interface (incomplete)
> 
> Committed to gomp-4_0-branch in r248042:
> 
> commit e3720963a1f494b2a0a1b6c28d5eb8bfb7c0d546
> Author: tschwinge 
> Date:   Mon May 15 06:50:17 2017 +
> 
> More OpenACC 2.5 Profiling Interface

Committed to gomp-4_0-branch in r248058:

commit b58008024048f960eed9fd709cbe5d5ea96c
Author: tschwinge 
Date:   Mon May 15 11:45:45 2017 +

Documentation changes for OpenACC 2.5 Profiling Interface

libgomp/
* libgomp.texi (OpenACC Environment Variables): Mention
"ACC_PROFLIB".
(OpenACC Profiling Interface): Update.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@248058 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp |  4 
 libgomp/libgomp.texi   | 21 ++---
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index f36cbfc..3125c99 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,9 @@
 2017-05-15  Thomas Schwinge  
 
+   * libgomp.texi (OpenACC Environment Variables): Mention
+   "ACC_PROFLIB".
+   (OpenACC Profiling Interface): Update.
+
* libgomp.texi: Update for OpenACC 2.5.
* openacc.f90 (openacc_version): Update to "201510".
* openacc_lib.h (openacc_version): Likewise.
diff --git libgomp/libgomp.texi libgomp/libgomp.texi
index 74b98c7..7a3c491 100644
--- libgomp/libgomp.texi
+++ libgomp/libgomp.texi
@@ -2839,13 +2839,15 @@ A.2.1.4.
 @node OpenACC Environment Variables
 @chapter OpenACC Environment Variables
 
-The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
+The variables @env{ACC_DEVICE_TYPE}, @env{ACC_DEVICE_NUM},
+and @code{ACC_PROFLIB}
 are defined by section 4 of the OpenACC specification in version 2.5.
 The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 
 @menu
 * ACC_DEVICE_TYPE::
 * ACC_DEVICE_NUM::
+* ACC_PROFLIB::
 * GCC_ACC_NOTIFY::
 @end menu
 
@@ -2871,6 +2873,19 @@ The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 
 
 
+@node ACC_PROFLIB
+@section @code{ACC_PROFLIB}
+@table @asis
+@item @emph{See also}:
+@ref{OpenACC Profiling Interface}
+
+@item @emph{Reference}:
+@uref{http://www.openacc.org/, OpenACC specification v2.5}, section
+4.3.
+@end table
+
+
+
 @node GCC_ACC_NOTIFY
 @section @code{GCC_ACC_NOTIFY}
 @table @asis
@@ -3095,8 +3110,8 @@ Application Programming Interface}, version 2.5.}
 
 @section Implementation Status and Implementation-Defined Behavior
 
-We're not yet implementing the whole Profiling Interface as defined by
-the OpenACC 2.5 specification.  Also, the specification doesn't
+We're implementing most of the Profiling Interface as defined by
+the OpenACC 2.5 specification.  The specification doesn't
 clearly define some aspects of its Profiling Interface, so we're
 clarifying these as @emph{implementation-defined behavior} here.  We
 already have reported to the OpenACC Technical Committee some issues,


Grüße
 Thomas


Re: [PATCH] Do not allow empty argument of -o option (PR driver/31468).

2017-05-15 Thread Richard Biener
On Mon, May 15, 2017 at 11:52 AM, Martin Liška  wrote:
> Hello.
>
> This is a fix for an old issue that can still be exposed.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?

+  if (output_file != NULL && output_file[0] == '\0')
+fatal_error (input_location, "output file could not be empty");
+

"output filename may not be empty"

Ok with that change.
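
For reference, the condition in the quoted hunk can be isolated as a
small predicate.  This is only a sketch: output_file_is_empty is a
hypothetical helper name, and in the patch itself the check is written
inline and followed by a call to fatal_error.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if OUTPUT_FILE was given (non-NULL) but is the empty
   string, i.e. the case the patch rejects with a fatal error.
   A NULL pointer means no -o was given and is not an error here.  */
bool
output_file_is_empty (const char *output_file)
{
  return output_file != NULL && output_file[0] == '\0';
}
```

Checking for NULL first matters: dereferencing output_file when no -o
option was supplied would be undefined behavior.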

Richard.

> Martin


Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Martin Liška
On 05/15/2017 01:33 PM, Nathan Sidwell wrote:
> Martin,
> thanks for the write up.
> 
> On 05/15/2017 05:39 AM, Martin Liška wrote:
>> Thanks Martin for feedback! After I spent quite some time with fiddling with
>> the options, I'm not convinced we should convert options to more hierarchical
>> structure. There's description:
>> 1) -fopt-info is used to dump optimization options. One can pick both verbosity
>> (note, optimization, all) and an optimization (ipa, inline, vec,...). Thus said
>> it's probably not a candidate for hierarchical options?
> 
> I've never used this option, so have no comment (too lazy to figure out if it 
> told me anything I didn't already see in a dump I was groveling over).

Hi.

Thanks for feedback, I'll mention -fopt-info later in this email.

> 
>> 2) -fdump-pass_name-... as mentioned by Nathan is combination of verbosity
>> (graph, note, verbose, details) and specific type of options (VOPS, 
>> RHS_ONLY, UID,..).
>>
>> There's a complete list and suggestion how we can move it to more 
>> hierarchical ordering:
>>
>> #define TDF_ADDRESS
>> #define TDF_SLIM
>> #define TDF_RAW
>> #define TDF_DETAILS
>> #define TDF_STATS
>> #define TDF_BLOCKS
>> #define TDF_VOPS
>> #define TDF_LINENO
>> #define TDF_UID
> #define TDF_LANG is now a thing too. it should be DI_kind too
>> #define TDF_TREE - remove & replace with DI_kind
>> #define TDF_RTL - remove & replace with DI_kind
>> #define TDF_IPA - remove & replace with DI_kind
>> #define TDF_STMTADDR - merge with TDF_ADDRESS
>> #define TDF_GRAPH
>> #define TDF_MEMSYMS
>> #define TDF_DIAGNOSTIC - merge with TDF_DETAILS
>> #define TDF_VERBOSE - merge with TDF_DETAILS
>> #define TDF_RHS_ONLY
>> #define TDF_ASMNAME
>> #define TDF_EH
>> #define TDF_NOUID
>> #define TDF_ALIAS
>> #define TDF_ENUMERATE_LOCALS
>> #define TDF_CSELIB
>> #define TDF_SCEV
>> #define TDF_COMMENT - remove and dump ';; ' unconditionally
>> #define TDF_GIMPLE
> 
> Looks a good start.
> 
> 1) The TDF_UID and TDF_NOUID options seem to be inverses of each other. Can't 
> we just ditch the latter?

One is used to paper over UIDs in order to preserve -fcompare-debug (I believe),
and the other is used to dump UIDs, basically for *_DECL.  As neither of them is
the default, both make sense.

> 
> 2) We might want to distinguish between enabling dump information that is 
> useful to us gcc developers (TDF_DETAILS, say), and that that would be useful 
> to end users trying to figure out why some random loop isn't being optimized 
> in (say TDF_DIAGNOSTIC).  But if we can't define a sensible way of 
> distinguishing then I'm all for not making the distinction.

That's probably the motivation behind -fopt-info, which should represent
"optimization dumps" readable by the user.  To be honest, only some of the
optimizations can be found in these files (vectorization and loop optimization).
The rest lives in the normal -fdump-xyz* files.  Maybe this is an opportunity to
clean it up?

> 
>> and more hierarchical ordering can be:
>>
>> #define TDF_ADDRESS
>> #define TDF_SLIM
>> #define TDF_RAW
>> #define TDF_DETAILS
>> #define TDF_STATS
>> #define TDF_BLOCKS
>> #define TDF_LINENO
>> #define TDF_UID
>> #define TDF_GRAPH
>> #define TDF_ASMNAME
>> #define TDF_NOUID
>> #define TDF_ENUMERATE_LOCALS
> 
> It'd be nice to name TDF_ENUMERATE_LOCALS without the second _ to avoid 
> confusion with the hierarchy you discuss below?  (perhaps TDF_LOCALS?)

Yep, works for me.

> 
> I like the idea of naming flags specific to a particular kind of dump with 
> the name of that kind of dump.  We do have a mismatch between DI_TREE and 
> TDF_GIMPLE though -- is there something sensible we could do there?
> 
>> #define TDF_GIMPLE
>> #define TDF_GIMPLE_FE - GIMPLE front-end

As we have a couple of dump flags used just for GIMPLE dumps, my idea was to give
them a common parent (TDF_GIMPLE), which explains why the current TDF_GIMPLE
(GIMPLE FE) needs to be renamed.
> 
> How might this differ from a new  -fdump-lang-original?  I.e.
> (1) why is it a dump-modifier flag, rather than a dump in its own right

Because you can use it for all gimple/tree passes to produce input for the
GIMPLE FE.

> (2) if we do need it, name it TDF_GIMPLE_LANG

That can be done.

> 
>> #define TDF_GIMPLE_VOPS
>> #define TDF_GIMPLE_EH
>> #define TDF_GIMPLE_ALIAS
>> #define TDF_GIMPLE_SCEV
>> #define TDF_GIMPLE_MEMSYMS
>> #define TDF_GIMPLE_RHS_ONLY
>>
>> #define TDF_RTL
> 
> How does this differ from the current TDF_RTL meaning?  Is it implying 
> 'TDF_RTL_ALL'? (same question about TDF_GIMPLE).

Yes, -fdump-tree-xyz-rtl would be equal to -fdump-tree-xyz-rtl-all.
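
One way to read this answer is that a group name acts as the union of
its sub-flags.  The sketch below shows that reading with made-up bit
values; the TOY_TDF_* names and their values are hypothetical and do
not match the real TDF_* definitions in dumpfile.h.

```c
#include <stdbool.h>

/* Hypothetical bit assignments; the real TDF_* values differ.  */
enum toy_dump_flags
{
  TOY_TDF_RTL_CSELIB = 1u << 0,
  TOY_TDF_GIMPLE_VOPS = 1u << 1,
  TOY_TDF_GIMPLE_EH = 1u << 2,
  TOY_TDF_GIMPLE_ALIAS = 1u << 3,

  /* A group flag is simply the union of its sub-flags, so selecting
     the group is the same as selecting "all" within that group.  */
  TOY_TDF_RTL = TOY_TDF_RTL_CSELIB,
  TOY_TDF_GIMPLE = (TOY_TDF_GIMPLE_VOPS | TOY_TDF_GIMPLE_EH
		    | TOY_TDF_GIMPLE_ALIAS)
};

/* Return true if every bit of WANTED is set in FLAGS.  */
bool
toy_dump_flag_set_p (unsigned flags, unsigned wanted)
{
  return (flags & wanted) == wanted;
}
```

With this layout, enabling the group enables each member, while
enabling a single member does not satisfy a test for the whole group.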

Martin

> 
>> #define TDF_RTL_CSELIB
> 
> nathan



Re: [PATCH] Fix VAR_DECL w/o a BIND_EXPR (PR sanitize/80659).

2017-05-15 Thread Richard Biener
On Mon, May 15, 2017 at 12:17 PM, Martin Liška  wrote:
> Hello.
>
> There are situations where local variables (defined in a switch scope) do
> not belong to any BIND_EXPR. Thus, we ICE due to gcc_assert 
> (gimplify_ctxp->live_switch_vars->elements () == 0);
>
> Is there any better solution how we can catch these variables?
>
> Suggested patch can bootstrap on ppc64le-redhat-linux and survives regression
> tests.
>
> Ready to be installed?

I think the C FE and/or ASAN should be fixed instead.  Seems to work
fine with C++.

Richard.

> Martin


Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Martin Liška
On 05/15/2017 01:18 PM, Nathan Sidwell wrote:
> On 05/15/2017 05:39 AM, Martin Liška wrote:
>> Thanks Martin for feedback! After I spent quite some time with fiddling with
>> the options, I'm not convinced we should convert options to more hierarchical
> 
> I'll respond to Martin's email properly separately, but while we're on dump
> redesign, here's a WIP patch I whipped up on Friday for the modules branch.  
> This tries to move the language-specific options to a dynamic registration 
> mechanism, rather than hard wire them into dumpfile.[hc]. There were some 
> awkward pieces due to the current structure of dumpfile registration.
> 
> I took the -fdump-class-hierarchy and -fdump-translation-unit and turned them
> into -fdump-lang-class and -fdump-lang-translation.  Unfortunately the 
> current LANG_HOOKS_INIT_OPTIONS is run rather too late to register these 
> dumps so that they get nice low numbers.  That's because 
> gcc::context::context () both creates the dump manager and then immediately 
> creates all the optimization passes:
>   m_dumps = new gcc::dump_manager ();
>   /* Allow languages to register dumps before passes.  */
>   lang_hooks.register_dumps (m_dumps);
>   m_passes = new gcc::pass_manager (this);
> 
> As you can see, I wedged a new lang hook between.  That's a little unpleasant
> -- perhaps
>   m_passes = new gcc::pass_manager (this);
> should be done later? Or the lang dumps could be unnumbered -- it is jarring 
> for them to be numbered as-if succeeding the optimization passes.
> 
> (I passed m_dumps as a void * purely to avoid header file jiggery pokery at 
> this step)
> 
> This patch does allow removing special class_dump_file handling from 
> c-family/c-opts.c, which is nice.  It looks like -fdump-tree-original might
> be another candidate for dynamic registration?
> 
> thoughts?

Hello.

I like the idea and I believe we should do the same with single-use (in a
particular pass) dump files like:

  dump_file_info (".cgraph", "ipa-cgraph", DK_ipa, 0),
  dump_file_info (".type-inheritance", "ipa-type-inheritance", DK_ipa, 0),
  dump_file_info (".ipa-clones", "ipa-clones", DK_ipa, 0),
  dump_file_info (".tu", "translation-unit", DK_lang, 1),
  dump_file_info (".class", "class-hierarchy", DK_lang, 2),
  dump_file_info (".original", "tree-original", DK_tree, 3),
  dump_file_info (".gimple", "tree-gimple", DK_tree, 4),
  dump_file_info (".nested", "tree-nested", DK_tree, 5),

Martin

> 
> nathan
> 



Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Nathan Sidwell

On 05/15/2017 08:04 AM, Martin Liška wrote:

On 05/15/2017 01:33 PM, Nathan Sidwell wrote:



1) The TDF_UID and TDF_NOUID options seem to be inverses of each other. Can't 
we just ditch the latter?


One is used to paper over UIDs in order to preserve -fcompare-debug (I believe),
and the other is used to dump UIDs, basically for *_DECL.  As neither of them is
the default, both make sense.


Might I suggest we rename at least one of them then?


How does this differ from the current TDF_RTL meaning?  Is it implying 
'TDF_RTL_ALL'? (same question about TDF_GIMPLE).


Yes, -fdump-tree-xyz-rtl would be equal to -fdump-tree-xyz-rtl-all.


I wonder if we can name things to be a little clearer?  Here you're 
applying an rtl modifier to a tree dump.  I find that jarring, given we 
have rtl dumps themselves. (I don't have a good suggestion right now).


Given a blank sheet of paper, the current 'TDF_tree' dumps should really 
be 'TDF_gimple' dumps, so we'd have lang/ipa/gimple/rtl kinds of dumps. 
Such a renaming may be an unacceptable amount of churn though.


nathan

--
Nathan Sidwell


[PATCH] Two DW_OP_GNU_variable_value fixes / tweaks

2017-05-15 Thread Richard Biener

While bringing early LTO debug up-to-speed I noticed the following two
issues.  The first patch avoids useless work in 
note_variable_value_in_expr and actually makes use of the DIE we
eventually create in resolve_variable_value_in_expr.

The second avoids generating a DW_OP_GNU_variable_value for a decl
we'll never have a DIE for -- this might be less obvious than the
other patch but I think we'll also never have any other debug info
we can resolve the decl with(?)

Bootstrapped and tested the first patch so far
on x86_64-unknown-linux-gnu, 2nd is pending.

Ok?

Thanks,
Richard.

2017-05-15  Richard Biener  

* dwarf2out.c (resolve_variable_value_in_expr): Lookup DIE
just generated.
(note_variable_value_in_expr): If we resolved the decl ref
do not push to the stack.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 248055)
+++ gcc/dwarf2out.c (working copy)
@@ -30109,8 +30109,9 @@ resolve_variable_value_in_expr (dw_attr_
  break;
}
  /* Create DW_TAG_variable that we can refer to.  */
- ref = gen_decl_die (decl, NULL_TREE, NULL,
- lookup_decl_die (current_function_decl));
+ gen_decl_die (decl, NULL_TREE, NULL,
+   lookup_decl_die (current_function_decl));
+ ref = lookup_decl_die (decl);
  if (ref)
{
  loc->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
@@ -30203,6 +30204,7 @@ note_variable_value_in_expr (dw_die_ref
loc->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
loc->dw_loc_oprnd1.v.val_die_ref.die = ref;
loc->dw_loc_oprnd1.v.val_die_ref.external = 0;
+   continue;
  }
if (VAR_P (decl)
&& DECL_CONTEXT (decl)


2017-05-15  Richard Biener  

* dwarf2out.c (loc_list_from_tree_1): Do not create
DW_OP_GNU_variable_value for DECL_IGNORED_P decls.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 248054)
+++ gcc/dwarf2out.c (working copy)
@@ -17373,6 +17685,7 @@ loc_list_from_tree_1 (tree loc, int want
&& early_dwarf
&& current_function_decl
&& want_address != 1
+   && ! DECL_IGNORED_P (loc)
&& (INTEGRAL_TYPE_P (TREE_TYPE (loc))
|| POINTER_TYPE_P (TREE_TYPE (loc)))
&& DECL_CONTEXT (loc) == current_function_decl



New Swedish PO file for 'gcc' (version 7.1.0)

2017-05-15 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-7.1.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Martin Liška
On 05/15/2017 02:24 PM, Nathan Sidwell wrote:
> On 05/15/2017 08:04 AM, Martin Liška wrote:
>> On 05/15/2017 01:33 PM, Nathan Sidwell wrote:
> 
>>> 1) The TDF_UID and TDF_NOUID options seem to be inverses of each other. 
>>> Can't we just ditch the latter?
>>
>> One is used to paper over UIDs in order to preserve -fcompare-debug (I 
>> believe). And the second one
>> is used to dump UIDs, basically for *_DECL. As neither of these is the default, 
>> both make sense.
> 
> Might I suggest we rename at least one of them then?

Yep, it's doable.

> 
>>> How does this differ from the current TDF_RTL meaning?  Is it implying 
>>> 'TDF_RTL_ALL'? (same question about TDF_GIMPLE).
>>
>> Yes, -fdump-tree-xyz-rtl would be equal to -fdump-tree-xyz-rtl-all.
> 
> I wonder if we can name things to be a little clearer?  Here you're applying 
> an rtl modifier to a tree dump.  I find that jarring, given we have rtl dumps 
> themselves. (I don't have a good suggestion right now).

Sorry, that's a confusing example.

> 
> Given a blank sheet of paper, the current 'TDF_tree' dumps should really be 
> 'TDF_gimple' dumps, so we'd have lang/ipa/gimple/rtl kinds of dumps. Such a 
> renaming may be an unacceptable amount of churn though.

Well, I would prefer to introduce new enum for kind of dump:
https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01033.html

and TDF_GIMPLE_* would then really correspond to a dump suboption used by the 
GIMPLE pretty-printer functions.

Martin

> 
> nathan
> 



Re: [PATCH] handling address mode changes inside extract_bit_field

2017-05-15 Thread Jeff Law

On 05/15/2017 05:46 AM, Joseph Myers wrote:

> The extra argument to extract_bit_field breaks builds for tilegx-linux-gnu
> and tilepro-linux-gnu (as shown by my glibc bot); there are calls in those
> back ends which haven't been updated.


I've got patches for the tile backends that I'll push today.

jeff


[PATCH] Fix order and types of members in C++17 insert_return_type structs

2017-05-15 Thread Jonathan Wakely

Nico Josuttis pointed out that the members were in the wrong order. I
also found that set::insert_return_type::position was using
_Rb_tree_iterator not _Rb_tree_const_iterator as it should have
been.

Finally, with complete structured bindings support in the FE we don't
need the tuple_size and tuple_element partial specializations that I
defined for the insert_return_type structs.

PR libstdc++/80761
* include/bits/node_handle.h (_Node_insert_return): Reorder members.
(tuple_size, tuple_element): Remove partial specializations.
* include/bits/stl_tree.h (_Rb_tree::insert_return_type): Use
const_iterator for std::set.
* testsuite/23_containers/map/modifiers/extract.cc: New.
* testsuite/23_containers/set/modifiers/extract.cc: New.
* testsuite/23_containers/unordered_map/modifiers/extract.cc: New.
* testsuite/23_containers/unordered_set/modifiers/extract.cc: New.

Tested powerpc64le-linux, committed to trunk. Backport to gcc-7 to
follow.
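
For readers less familiar with the type, here is a minimal sketch of what the corrected member order (position, inserted, node) means for user code. The helper name `transfer` is made up for illustration, and the sketch assumes a C++17 compiler whose library already has this fix.

```cpp
#include <cassert>
#include <set>
#include <type_traits>
#include <utility>

// Hypothetical helper: move the node holding VALUE from FROM into TO.
// Returns true if the node ended up in TO.
inline bool transfer (std::set<int>& from, std::set<int>& to, int value)
{
  auto nh = from.extract (value);   // empty node handle if VALUE is absent
  if (nh.empty ())
    return false;

  // insert_return_type has public members position, inserted, node, in
  // that order, so structured bindings decompose it in that order.
  auto [pos, ins, node] = to.insert (std::move (nh));
  static_assert (std::is_same_v<decltype (pos), std::set<int>::iterator>);
  static_assert (std::is_same_v<decltype (ins), bool>);
  static_assert (std::is_same_v<decltype (node), std::set<int>::node_type>);

  if (!ins)
    from.insert (std::move (node));   // TO already had VALUE: hand the node back
  else
    assert (node.empty ());           // ownership moved into TO
  return ins;
}
```

With the old member order the first binding would have been the bool, so the static_asserts above would fail to compile.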



commit 5ff45f6392071ed0fd7950f24147e2ada9bf058f
Author: Jonathan Wakely 
Date:   Mon May 15 12:17:51 2017 +0100

Fix order and types of members in C++17 insert_return_type structs

PR libstdc++/80761
* include/bits/node_handle.h (_Node_insert_return): Reorder members.
(tuple_size, tuple_element): Remove partial specializations.
* include/bits/stl_tree.h (_Rb_tree::insert_return_type): Use
const_iterator for std::set.
* testsuite/23_containers/map/modifiers/extract.cc: New.
* testsuite/23_containers/set/modifiers/extract.cc: New.
* testsuite/23_containers/unordered_map/modifiers/extract.cc: New.
* testsuite/23_containers/unordered_set/modifiers/extract.cc: New.

diff --git a/libstdc++-v3/include/bits/node_handle.h 
b/libstdc++-v3/include/bits/node_handle.h
index 44a9264..c7694a1 100644
--- a/libstdc++-v3/include/bits/node_handle.h
+++ b/libstdc++-v3/include/bits/node_handle.h
@@ -280,8 +280,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct _Node_insert_return
 {
-  bool inserted = false;
   _Iteratorposition = _Iterator();
+  bool inserted = false;
   _NodeHandle  node;
 
   template
@@ -305,22 +305,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 };
 
-  template
-struct tuple_size<_Node_insert_return<_Iterator, _NodeHandle>>
-: integral_constant { };
-
-  template
-struct tuple_element<0, _Node_insert_return<_Iterator, _NodeHandle>>
-{ using type = bool; };
-
-  template
-struct tuple_element<1, _Node_insert_return<_Iterator, _NodeHandle>>
-{ using type = _Iterator; };
-
-  template
-struct tuple_element<2, _Node_insert_return<_Iterator, _NodeHandle>>
-{ using type = _NodeHandle; };
-
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
diff --git a/libstdc++-v3/include/bits/stl_tree.h 
b/libstdc++-v3/include/bits/stl_tree.h
index aedee06..3f133b0 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -812,7 +812,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus > 201402L
   using node_type = _Node_handle<_Key, _Val, _Node_allocator>;
-  using insert_return_type = _Node_insert_return;
+  using insert_return_type = _Node_insert_return<
+   conditional_t, const_iterator, iterator>,
+   node_type>;
 #endif
 
   pair<_Base_ptr, _Base_ptr>
diff --git a/libstdc++-v3/testsuite/23_containers/map/modifiers/extract.cc 
b/libstdc++-v3/testsuite/23_containers/map/modifiers/extract.cc
index 507a708..80eaf01 100644
--- a/libstdc++-v3/testsuite/23_containers/map/modifiers/extract.cc
+++ b/libstdc++-v3/testsuite/23_containers/map/modifiers/extract.cc
@@ -135,6 +135,17 @@ test03()
   static_assert( is_same_v );
 }
 
+void
+test04()
+{
+  // Check order of members in insert_return_type
+  auto [pos, ins, node] = test_type::insert_return_type{};
+  using std::is_same_v;
+  static_assert( is_same_v );
+  static_assert( is_same_v );
+  static_assert( is_same_v );
+}
+
 int
 main()
 {
diff --git a/libstdc++-v3/testsuite/23_containers/set/modifiers/extract.cc 
b/libstdc++-v3/testsuite/23_containers/set/modifiers/extract.cc
index c56767a..3fbc6b9 100644
--- a/libstdc++-v3/testsuite/23_containers/set/modifiers/extract.cc
+++ b/libstdc++-v3/testsuite/23_containers/set/modifiers/extract.cc
@@ -126,6 +126,17 @@ test03()
   static_assert( is_same_v );
 }
 
+void
+test04()
+{
+  // Check order of members in insert_return_type
+  auto [pos, ins, node] = test_type::insert_return_type{};
+  using std::is_same_v;
+  static_assert( is_same_v );
+  static_assert( is_same_v );
+  static_assert( is_same_v );
+}
+
 int
 main()
 {
diff --git 
a/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/extract.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/extract.cc
index ad87c70..ce50766 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_map/

Re: [patch] build xz (instead of bz2) compressed tarballs and diffs

2017-05-15 Thread Joseph Myers
The xz manpage warns against blindly using -9 (for which --best is a 
deprecated alias) because of the implications for memory requirements for 
decompressing.  If there's a reason it's considered appropriate here, I 
think it needs an explanatory comment.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH,AIX] Enable libiberty to read AIX XCOFF

2017-05-15 Thread REIX, Tony
Description:
 * This patch enables libiberty to read AIX XCOFF.

Tests:
 * Fedora25/x86_64 + GCC v7.1.0 : Configure/Build: SUCCESS
   - build made by means of a .spec file based on Fedora gcc-7.0.1-0.12 .spec 
file
 ../configure --enable-bootstrap 
--enable-languages=c,c++,objc,obj-c++,fortran,go,lto --prefix=/usr 
--mandir=/usr/share/man --infodir=/usr/share/info 
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared 
--enable-threads=posix --enable-checking=release
 --enable-multilib --with-system-zlib --enable-__cxa_atexit 
--disable-libunwind-exceptions --enable-gnu-unique-object 
--enable-linker-build-id --with-gcc-major-version-only 
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl 
--enable-libmpx
 --enable-offload-targets=nvptx-none --without-cuda-driver 
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 
--build=x86_64-redhat-linux

ChangeLog:
  * libiberty/simple-object-xcoff.c: Enable libiberty to read AIX XCOFF
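
The symbol-table scan in the patch skips stabs debug entries (storage class with the DBXMASK bit set) and steps over their auxiliary entries. A simplified sketch of that walk follows; `ToySym` is a hypothetical stand-in for struct external_syment, and stepping over auxiliary entries for every symbol is an assumption, since the visible part of the patch only shows it on the skip paths.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same value as the DBXMASK macro added by the patch.
constexpr std::uint8_t kDbxMask = 0x80;

// Hypothetical, simplified stand-in for struct external_syment.
struct ToySym
{
  std::uint8_t sclass;   // storage class; high bit marks a stabs debug symbol
  std::uint8_t numaux;   // number of auxiliary entries following this entry
};

// Return the indices of primary entries that are not debug symbols.
inline std::vector<std::size_t>
primary_symbols (const std::vector<ToySym>& symtab)
{
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < symtab.size (); ++i)
    {
      const ToySym& s = symtab[i];
      if (!(s.sclass & kDbxMask))
        out.push_back (i);
      i += s.numaux;   // step over aux entries; ++i then reaches the next primary
    }
  return out;
}

inline void example_usage ()
{
  // One ordinary symbol with one aux entry, then one debug symbol.
  std::vector<ToySym> symtab = { {0x02, 1}, {0, 0}, {0x80, 0} };
  assert (primary_symbols (symtab).size () == 1);
}
```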


Regards,

Tony Reix
Bull - ATOS
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net
--- ./libiberty/simple-object-xcoff.c.ORIGIN 2017-03-21 17:08:59 -0500
+++ ./libiberty/simple-object-xcoff.c   2017-03-21 16:45:43 -0500
@@ -258,6 +258,8 @@
 #define C_STAT (3)
 #define C_FILE (103)
 
+#define DBXMASK0x80
+
 /* Private data for an simple_object_read.  */
 
 struct simple_object_xcoff_read
@@ -403,7 +405,9 @@
   unsigned int nscns;
   char *strtab;
   size_t strtab_size;
+  struct external_syment *symtab = NULL;
   unsigned int i;
+  off_t textptr = 0;
 
   scnhdr_size = u64 ? SCNHSZ64 : SCNHSZ32;
   scnbuf = XNEWVEC (unsigned char, scnhdr_size * ocr->nscns);
@@ -485,10 +489,116 @@
  u.xcoff32.s_size));
}
 
+  if (strcmp (name, ".text") == 0)
+textptr = scnptr;
   if (!(*pfn) (data, name, scnptr, size))
break;
 }
 
+  /* Special handling for .go_export CSECT. */
+  if (textptr != 0 && ocr->nsyms > 0)
+{
+  unsigned char *sym, *aux;
+  const char *n_name;
+  unsigned long n_value, n_offset, n_zeroes, x_scnlen;
+
+  /* Read symbol table. */
+  symtab = XNEWVEC (struct external_syment, ocr->nsyms * SYMESZ);
+  if (!simple_object_internal_read (sobj->descriptor,
+sobj->offset + ocr->symptr,
+(unsigned char *)symtab,
+ocr->nsyms * SYMESZ,
+&errmsg, err))
+{
+  XDELETEVEC (symtab);
+ XDELETEVEC (scnbuf);
+  return NULL;
+}
+  /* Search in symbol table if we have a ".go_export" symbol. */
+  for (i = 0; i < ocr->nsyms; ++i)
+{
+  sym = (unsigned char *)&symtab[i];
+
+  if (symtab[i].n_sclass[0] & DBXMASK)
+{
+  /* Skip debug symbols whose names are in stabs. */
+  i += symtab[i].n_numaux[0];
+  continue;
+}
+  if (u64)
+{
+  n_value = fetch_64 (sym + offsetof (struct external_syment,
+  u.xcoff64.n_value));
+  n_offset = fetch_32 (sym + offsetof (struct external_syment,
+   u.xcoff64.n_offset));
+}
+  else
+{
+  /* ".go_export" is longer than N_SYMNMLEN */
+  n_zeroes = fetch_32 (sym + offsetof (struct external_syment,
+   u.xcoff32.n.n.n_zeroes));
+  if (n_zeroes != 0)
+{
+  /* Skip auxiliary entries. */
+  i += symtab[i].n_numaux[0];
+  continue;
+}
+  n_value = fetch_32 (sym + offsetof (struct external_syment,
+  u.xcoff32.n_value));
+  n_offset = fetch_32 (sym + offsetof (struct external_syment,
+   u.xcoff32.n.n.n_offset));
+}
+ /* The real section name is found in the string
+table.  */
+ if (strtab == NULL)
+   {
+ strtab = simple_object_xcoff_read_strtab (sobj,
+   &strtab_size,
+   &errmsg, err);
+ if (strtab == NULL)
+   {
+  XDELETEVEC (symtab);
+ XDELETEVEC (scnbuf);
+ return errmsg;
+   }
+   }
+
+ if (n_offset >= strtab_size)
+{
+ XDELETEVEC (strtab);
+ XDELETEVEC (symtab);
+ XDELETEVEC (scnbuf);
+ *err = 0;
+ return "section string ind

Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Nathan Sidwell

On 05/15/2017 09:06 AM, Martin Liška wrote:


>> Given a blank sheet of paper, the current 'TDF_tree' dumps should really be 
>> 'TDF_gimple' dumps, so we'd have lang/ipa/gimple/rtl kinds of dumps. Such a 
>> renaming may be an unacceptable amount of churn though.
>
> Well, I would prefer to introduce new enum for kind of dump:
> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01033.html


Right, I understand that.  My point is that it might be confusing to 
users of the dump machinery (i.e. me), at the command-line level where 
'rtl' means different things in different contexts.  And we have 'tree' 
dumps that dump gimple and 'lang' dumps that also (can) dump trees.


We have a bunch of gimple optimization passes, but call the dumpers 
'tree'.  I know how we ended up here, but it seems confusing.


nathan

--
Nathan Sidwell


Re: [patch] build xz (instead of bz2) compressed tarballs and diffs

2017-05-15 Thread Markus Trippelsdorf
On 2017.05.15 at 14:02 +, Joseph Myers wrote:
> The xz manpage warns against blindly using -9 (for which --best is a 
> deprecated alias) because of the implications for memory requirements for 
> decompressing.  If there's a reason it's considered appropriate here, I 
> think it needs an explanatory comment.

I think it is unacceptable, because it would increase memory usage when
decompressing over 20x compared to bz2 (and over 100x while compressing).

The default -6 should be good enough (3x more memory when decompressing).

-- 
Markus


Re: C PATCH to kill c_save_expr or towards delayed folding for the C FE

2017-05-15 Thread Marek Polacek
On Fri, May 12, 2017 at 08:28:38PM +, Joseph Myers wrote:
> On Fri, 12 May 2017, Marek Polacek wrote:
> 
> > In the effort of reducing early folding, we should avoid calling 
> > c_fully_fold
> > blithely, except when needed for e.g. initializers.  This is a teeny tiny 
> > step
> 
> Note there are several reasons for early folding in the C front end: at 
> least (a) cases where logically needed (initializers and other places 
> where constants are needed), (b) because warnings need a folded 
> expression, (c) when the expression will go somewhere c_fully_fold does 
> not recurse inside.  Also (d) convert, at least, folds regardless of 
> whether it's actually necessary.
> 
> There is a case for avoiding (b) by putting the necessary information in 
> the IR so the warnings can happen later from c_fully_fold, though there 
> may be other possible approaches.
> 
> > @@ -146,8 +140,7 @@ convert (tree type, tree expr)
> >  
> >  case COMPLEX_TYPE:
> >/* If converting from COMPLEX_TYPE to a different COMPLEX_TYPE
> > -and e is not COMPLEX_EXPR, convert_to_complex uses save_expr,
> > -but for the C FE c_save_expr needs to be called instead.  */
> > +and E is not COMPLEX_EXPR, convert_to_complex uses save_expr.  */
> >if (TREE_CODE (TREE_TYPE (e)) == COMPLEX_TYPE)
> > {
> >   if (TREE_CODE (e) != COMPLEX_EXPR)
> 
> The point of this comment is to explain why we don't just call 
> convert_to_complex here (see PR 47150).  So with your changes it would 
> seem appropriate to change c-convert.c back to calling convert_to_complex 
> here.
 
Thanks for pointing this out!  The new version:

Bootstrapped/regtested on x86_64-linux.

2017-05-15  Marek Polacek  

* c-common.c (c_save_expr): Remove.
(c_common_truthvalue_conversion): Remove a call to c_save_expr.
* c-common.h (c_save_expr): Remove declaration.

* c-convert.c (convert): Replace c_save_expr with save_expr.  Don't
call c_fully_fold.
(convert) : Remove special handling of COMPLEX_TYPEs.
* c-decl.c (grokdeclarator): Replace c_save_expr with save_expr. 
* c-fold.c (c_fully_fold_internal): Handle SAVE_EXPR.
* c-parser.c (c_parser_declaration_or_fndef): Replace c_save_expr with
save_expr.
(c_parser_conditional_expression): Likewise.
* c-tree.h (SAVE_EXPR_FOLDED_P): Define.
* c-typeck.c (build_modify_expr): Replace c_save_expr with save_expr.
(process_init_element): Likewise.
(build_binary_op): Likewise.
(handle_omp_array_sections_1): Likewise.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index ad686d2..f606e94 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -3164,24 +3164,6 @@ c_wrap_maybe_const (tree expr, bool non_const)
   return expr;
 }
 
-/* Wrap a SAVE_EXPR around EXPR, if appropriate.  Like save_expr, but
-   for C folds the inside expression and wraps a C_MAYBE_CONST_EXPR
-   around the SAVE_EXPR if needed so that c_fully_fold does not need
-   to look inside SAVE_EXPRs.  */
-
-tree
-c_save_expr (tree expr)
-{
-  bool maybe_const = true;
-  if (c_dialect_cxx ())
-return save_expr (expr);
-  expr = c_fully_fold (expr, false, &maybe_const);
-  expr = save_expr (expr);
-  if (!maybe_const)
-expr = c_wrap_maybe_const (expr, true);
-  return expr;
-}
-
 /* Return whether EXPR is a declaration whose address can never be
NULL.  */
 
@@ -3436,7 +3418,7 @@ c_common_truthvalue_conversion (location_t location, tree 
expr)
 
   if (TREE_CODE (TREE_TYPE (expr)) == COMPLEX_TYPE)
 {
-  tree t = (in_late_binary_op ? save_expr (expr) : c_save_expr (expr));
+  tree t = save_expr (expr);
   expr = (build_binary_op
  (EXPR_LOCATION (expr),
   (TREE_SIDE_EFFECTS (expr)
diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 9e3982d..3981544 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -836,7 +836,6 @@ extern enum conversion_safety unsafe_conversion_p 
(location_t, tree, tree,
 extern bool decl_with_nonnull_addr_p (const_tree);
 extern tree c_fully_fold (tree, bool, bool *);
 extern tree c_wrap_maybe_const (tree, bool);
-extern tree c_save_expr (tree);
 extern tree c_common_truthvalue_conversion (location_t, tree);
 extern void c_apply_type_quals_to_decl (int, tree);
 extern tree c_sizeof_or_alignof_type (location_t, tree, bool, bool, int);
diff --git gcc/c/c-convert.c gcc/c/c-convert.c
index 163feff..b8117b4 100644
--- gcc/c/c-convert.c
+++ gcc/c/c-convert.c
@@ -111,13 +111,7 @@ convert (tree type, tree expr)
  && COMPLETE_TYPE_P (type)
  && do_ubsan_in_current_function ())
{
- if (in_late_binary_op)
-   expr = save_expr (expr);
- else
-   {
- expr = c_save_expr (expr);
- expr = c_fully_fold (expr, false, NULL);
-   }
+ expr = save_expr (expr);
  tree check = ubsan_instrument_float_cast (loc, ty

Re: [RFC] Do we want hierarchical options & encapsulation in a class

2017-05-15 Thread Martin Liška
On 05/15/2017 04:12 PM, Nathan Sidwell wrote:
> On 05/15/2017 09:06 AM, Martin Liška wrote:
> 
>>> Given a blank sheet of paper, the current 'TDF_tree' dumps should really be 
>>> 'TDF_gimple' dumps, so we'd have lang/ipa/gimple/rtl kinds of dumps. Such a 
>>> renaming may be an unacceptable amount of churn though.
>>
>> Well, I would prefer to introduce new enum for kind of dump:
>> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01033.html
> 
> Right, I understand that.  My point is that it might be confusing to users of 
> the dump machinery (i.e. me), at the command-line level where 'rtl' means 
> different things in different contexts.  And we have 'tree' dumps that dump 
> gimple and 'lang' dumps that also (can) dump trees.

Right. To be honest, I was originally convinced that hierarchical options 
would have a positive impact. But renaming the dump suboptions would
inconvenience current GCC developers (who are their main users). I also 
noticed that one can write -fdump-tree-ifcvt-stats-blocks-details,
a combination of multiple suboptions, which makes it even more complex :)

That said, I'm no longer inclined to go that way. It's then questionable 
whether to encapsulate the masking enum in a class.
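
To make the question concrete, here is a rough sketch of what wrapping a masking enum in a class could look like. All names and values here (`dump_flag`, `dump_flags`, the enumerators) are hypothetical and are not GCC's actual TDF_* definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; not the real TDF_* constants.
enum class dump_flag : std::uint32_t
{
  none    = 0,
  details = 1u << 0,
  stats   = 1u << 1,
  blocks  = 1u << 2
};

// Small value class that owns the mask and hides the bit twiddling.
class dump_flags
{
public:
  constexpr dump_flags () = default;
  constexpr dump_flags (dump_flag f)
    : m_bits (static_cast<std::uint32_t> (f)) {}

  constexpr dump_flags operator| (dump_flags o) const
  { return dump_flags (m_bits | o.m_bits); }

  constexpr bool has (dump_flag f) const
  { return (m_bits & static_cast<std::uint32_t> (f)) != 0; }

private:
  explicit constexpr dump_flags (std::uint32_t bits) : m_bits (bits) {}
  std::uint32_t m_bits = 0;
};

// Lets two bare enumerators be combined directly.
constexpr dump_flags operator| (dump_flag a, dump_flag b)
{ return dump_flags (a) | dump_flags (b); }

inline void example_usage ()
{
  // Roughly what a -stats-blocks-details suffix combination asks for.
  dump_flags f = dump_flag::stats | dump_flag::blocks;
  f = f | dump_flag::details;
  assert (f.has (dump_flag::details));
}
```

The class keeps combinations of suboptions type-checked, at the cost of touching every place that currently treats the flags as a plain integer.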

Martin

> 
> We have a bunch of gimple optimization passes, but call the dumpers 'tree'.  
> I know how we ended up here, but it seems confusing.
> 
> nathan
> 



[PATCH,AIX] Enable XCOFF in libbacktrace on AIX

2017-05-15 Thread REIX, Tony
Description:
 * This patch enables libbacktrace to handle XCOFF on AIX.

Tests:
 * Fedora25/x86_64 + GCC v7.1.0 : Configure/Build: SUCCESS
   - build made by means of a .spec file based on Fedora gcc-7.0.1-0.12 .spec 
file
 ../configure --enable-bootstrap 
--enable-languages=c,c++,objc,obj-c++,fortran,go,lto --prefix=/usr 
--mandir=/usr/share/man --infodir=/usr/share/info 
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared 
--enable-threads=posix --enable-checking=release --enable-multilib 
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions 
--enable-gnu-unique-object --enable-linker-build-id 
--with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin 
--enable-initfini-array --with-isl --enable-libmpx 
--enable-offload-targets=nvptx-none --without-cuda-driver 
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 
--build=x86_64-redhat-linux

ChangeLog:
  * libbacktrace/Makefile.am : Add xcoff.c
  * libbacktrace/Makefile.in : Regenerated
  * libbacktrace/configure.ac : Add XCOFF output file type
  * libbacktrace/configure : Regenerated
  * libbacktrace/fileline.c : Handle AIX procfs tree
  * libbacktrace/filetype.awk : Add AIX XCOFF type detection
  * libbacktrace/xcoff.c : New file for handling XCOFF format
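
As an illustration of the fileline.c entry: AIX procfs has no /proc/self/exe, so the patch formats a per-process path from getpid() instead. A small sketch of that path construction, where the helper name `aix_proc_exe_path` is invented for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: build the AIX procfs path of a process's executable,
// using the same "/proc/<pid>/object/a.out" layout as the patch.
inline std::string aix_proc_exe_path (long pid)
{
  char buf[64];
  int n = std::snprintf (buf, sizeof buf, "/proc/%ld/object/a.out", pid);
  // The 64-byte buffer (same size as in the patch) is ample for any pid.
  assert (n > 0 && static_cast<std::size_t> (n) < sizeof buf);
  return buf;
}
```

On AIX one would pass the result of getpid(); the function itself is portable, so it can be exercised on any host.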

Regards,

Tony Reix
Bull - ATOS
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net

diff -Nur gcc-7-20170203.orig/libbacktrace/Makefile.am 
gcc-7-20170203/libbacktrace/Makefile.am
--- gcc-7-20170203.orig/libbacktrace/Makefile.am 2017-01-02 01:19:31 -0600
+++ gcc-7-20170203/libbacktrace/Makefile.am 2017-03-22 14:09:40 -0500
@@ -57,7 +57,8 @@
 FORMAT_FILES = \
elf.c \
pecoff.c \
-   unknown.c
+   unknown.c \
+   xcoff.c
 
 VIEW_FILES = \
read.c \
@@ -134,3 +135,5 @@
 stest.lo: config.h backtrace.h internal.h
 state.lo: config.h backtrace.h backtrace-supported.h internal.h
 unknown.lo: config.h backtrace.h internal.h
+xcoff.lo: config.h backtrace.h internal.h
+
diff -Nur gcc-7-20170203.orig/libbacktrace/Makefile.in 
gcc-7-20170203/libbacktrace/Makefile.in
--- gcc-7-20170203.orig/libbacktrace/Makefile.in 2016-11-16 16:36:10 -0600
+++ gcc-7-20170203/libbacktrace/Makefile.in 2017-03-22 14:06:51 -0500
@@ -301,7 +301,8 @@
 FORMAT_FILES = \
elf.c \
pecoff.c \
-   unknown.c
+   unknown.c \
+   xcoff.c
 
 VIEW_FILES = \
read.c \
@@ -764,6 +765,7 @@
 stest.lo: config.h backtrace.h internal.h
 state.lo: config.h backtrace.h backtrace-supported.h internal.h
 unknown.lo: config.h backtrace.h internal.h
+xcoff.lo: config.h backtrace.h internal.h
 
 # Tell versions [3.59,3.63) of GNU make to not export all variables.
 # Otherwise a system limit (for SysV at least) may be exceeded.
diff -Nur gcc-7-20170203.orig/libbacktrace/configure 
gcc-7-20170203/libbacktrace/configure
--- gcc-7-20170203.orig/libbacktrace/configure  2016-11-16 16:36:13 -0600
+++ gcc-7-20170203/libbacktrace/configure   2017-03-22 14:13:40 -0500
@@ -11844,6 +11844,9 @@
 pecoff) FORMAT_FILE="pecoff.lo"
 backtrace_supports_data=no
;;
+xcoff) FORMAT_FILE="xcoff.lo"
+   backtrace_supports_data=no
+   ;;
 *) { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: could not determine 
output file type" >&5
 $as_echo "$as_me: WARNING: could not determine output file type" >&2;}
FORMAT_FILE="unknown.lo"
diff -Nur gcc-7-20170203.orig/libbacktrace/configure.ac 
gcc-7-20170203/libbacktrace/configure.ac
--- gcc-7-20170203.orig/libbacktrace/configure.ac   2017-01-02 01:19:31 -0600
+++ gcc-7-20170203/libbacktrace/configure.ac2017-03-22 13:59:23 -0500
@@ -231,6 +231,9 @@
 pecoff) FORMAT_FILE="pecoff.lo"
 backtrace_supports_data=no
;;
+xcoff) FORMAT_FILE="xcoff.lo"
+   backtrace_supports_data=no
+   ;;
 *) AC_MSG_WARN([could not determine output file type])
FORMAT_FILE="unknown.lo"
backtrace_supported=no
diff -Nur gcc-7-20170203.orig/libbacktrace/fileline.c 
gcc-7-20170203/libbacktrace/fileline.c
--- gcc-7-20170203.orig/libbacktrace/fileline.c 2017-01-02 01:19:54 -0600
+++ gcc-7-20170203/libbacktrace/fileline.c  2017-02-27 13:46:50 -0600
@@ -37,6 +37,9 @@
 #include 
 #include 
 #include 
+#ifdef _AIX
+#include  /* getpid */
+#endif
 
 #include "backtrace.h"
 #include "internal.h"
@@ -83,6 +86,9 @@
   for (pass = 0; pass < 4; ++pass)
 {
   const char *filename;
+#ifdef _AIX
+  char buf[64];
+#endif
   int does_not_exist;
 
   switch (pass)
@@ -94,7 +100,12 @@
  filename = getexecname ();
  break;
case 2:
+#ifdef _AIX
+ snprintf(buf, sizeof(buf), "/proc/%d/object/a.out", getpid());
+ filename = buf;
+#else
  filename = "/proc/self/exe";
+#endif
  break;
case 3:
  filename = "/proc/curproc/file";
diff -Nur gcc-7-20170203.orig/libbacktrace/filetype.awk 
gcc-7-20170203/libbac

Re: C PATCH to kill c_save_expr or towards delayed folding for the C FE

2017-05-15 Thread Joseph Myers
On Mon, 15 May 2017, Marek Polacek wrote:

> Thanks for pointing this out!  The new version:
> 
> Bootstrapped/regtested on x86_64-linux.
> 
> 2017-05-15  Marek Polacek  
> 
>   * c-common.c (c_save_expr): Remove.
>   (c_common_truthvalue_conversion): Remove a call to c_save_expr.
>   * c-common.h (c_save_expr): Remove declaration.
> 
>   * c-convert.c (convert): Replace c_save_expr with save_expr.  Don't
>   call c_fully_fold.
>   (convert) : Remove special handling of COMPLEX_TYPEs.
>   * c-decl.c (grokdeclarator): Replace c_save_expr with save_expr. 
>   * c-fold.c (c_fully_fold_internal): Handle SAVE_EXPR.
>   * c-parser.c (c_parser_declaration_or_fndef): Replace c_save_expr with
>   save_expr.
>   (c_parser_conditional_expression): Likewise.
>   * c-tree.h (SAVE_EXPR_FOLDED_P): Define.
>   * c-typeck.c (build_modify_expr): Replace c_save_expr with save_expr.
>   (process_init_element): Likewise.
>   (build_binary_op): Likewise.
>   (handle_omp_array_sections_1): Likewise.

This is OK (given the save_expr folding change it depends on).

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH,AIX] Enable FFI Go Closure on AIX

2017-05-15 Thread REIX, Tony
Description:
 * This patch enables FFI Go Closure on AIX.

Tests:
 * Fedora25/x86_64 + GCC v7.1.0 : Configure/Build: SUCCESS
   - build made by means of a .spec file based on Fedora gcc-7.0.1-0.12 .spec 
file
  ../configure --enable-bootstrap 
--enable-languages=c,c++,objc,obj-c++,fortran,go,lto --prefix=/usr 
--mandir=/usr/share/man --infodir=/usr/share/info 
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared 
--enable-threads=posix --enable-checking=release --enable-multilib 
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions 
--enable-gnu-unique-object --enable-linker-build-id 
--with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin 
--enable-initfini-array --with-isl --enable-libmpx 
--enable-offload-targets=nvptx-none --without-cuda-driver 
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 
--build=x86_64-redhat-linux

ChangeLog:
  * libffi/src/powerpc/aix.S : Implements Go Closure on AIX.
  * libffi/src/powerpc/aix_closure.S : Likewise.
  * libffi/src/powerpc/ffi_darwin.c : Likewise.
  * libffi/src/powerpc/ffitarget.h : Enables Go Closure on AIX.

Regards

Tony Reix
Bull - ATOS
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net






--- ./libffi/src/powerpc/aix.S  2016-11-16 22:25:34 -0600
+++ ./libffi/src/powerpc/aix.S  2017-04-24 13:58:19 -0500
@@ -106,6 +106,10 @@
.llong .ffi_call_AIX, TOC[tc0], 0
.csect .text[PR]
 .ffi_call_AIX:
+   .function .ffi_call_AIX,.ffi_call_AIX,16,044,LFE..0-LFB..0
+   .bf __LINE__
+   .line 1
+LFB..0:
/* Save registers we use.  */
mflrr0
 
@@ -115,8 +119,10 @@
std r31, -8(r1)
 
std r0, 16(r1)
+LCFI..0:
mr  r28, r1 /* our AP.  */
stdux   r1, r1, r4
+LCFI..1:
 
/* Save arguments over call...  */
mr  r31, r5 /* flags, */
@@ -202,12 +208,16 @@
 L(float_return_value):
stfsf1, 0(r30)
b   L(done_return_value)
-
+LFE..0:
 #else /* ! __64BIT__ */

.long .ffi_call_AIX, TOC[tc0], 0
.csect .text[PR]
 .ffi_call_AIX:
+   .function .ffi_call_AIX,.ffi_call_AIX,16,044,LFE..0-LFB..0
+   .bf __LINE__
+   .line 1
+LFB..0:
/* Save registers we use.  */
mflrr0
 
@@ -217,8 +227,10 @@
stw r31, -4(r1)
 
stw r0, 8(r1)
+LCFI..0:
mr  r28, r1 /* out AP.  */
stwux   r1, r1, r4
+LCFI..1:
 
/* Save arguments over call...  */
mr  r31, r5 /* flags, */
@@ -304,11 +316,144 @@
 L(float_return_value):
stfsf1, 0(r30)
b   L(done_return_value)
+LFE..0:
 #endif
+   .ef __LINE__
.long 0
.byte 0,0,0,1,128,4,0,0
 /* END(ffi_call_AIX) */
 
+   /* void ffi_call_go_AIX(extended_cif *ecif, unsigned long bytes,
+*  unsigned int flags, unsigned int *rvalue,
+*  void (*fn)(),
+*  void (*prep_args)(extended_cif*, unsigned 
*const),
+*  void *closure);
+* r3=ecif, r4=bytes, r5=flags, r6=rvalue, r7=fn, r8=prep_args, 
r9=closure
+*/
+
+.csect .text[PR]
+   .align 2
+   .globl ffi_call_go_AIX
+   .globl .ffi_call_go_AIX
+.csect ffi_call_go_AIX[DS]
+ffi_call_go_AIX:
+#ifdef __64BIT__
+   .llong .ffi_call_go_AIX, TOC[tc0], 0
+   .csect .text[PR]
+.ffi_call_go_AIX:
+   .function .ffi_call_go_AIX,.ffi_call_go_AIX,16,044,LFE..1-LFB..1
+   .bf __LINE__
+   .line 1
+LFB..1:
+   /* Save registers we use.  */
+   mflrr0
+
+   std r28,-32(r1)
+   std r29,-24(r1)
+   std r30,-16(r1)
+   std r31, -8(r1)
+
+   std r9, 8(r1)   /* closure, saved in cr field. */
+   std r0, 16(r1)
+LCFI..2:
+   mr  r28, r1 /* our AP.  */
+   stdux   r1, r1, r4
+LCFI..3:
+
+   /* Save arguments over call...  */
+   mr  r31, r5 /* flags, */
+   mr  r30, r6 /* rvalue, */
+   mr  r29, r7 /* function address,  */
+   std r2, 40(r1)
+
+   /* Call ffi_prep_args.  */
+   mr  r4, r1
+   bl  .ffi_prep_args
+   nop
+
+   /* Now do the call.  */
+   ld  r0, 0(r29)
+   ld  r2, 8(r29)
+   ld  r11, 8(r28) /* closure */
+   /* Set up cr1 with bits 4-7 of the flags.  */
+   mtcrf   0x40, r31
+   mtctr   r0
+   /* Load all those argument registers.  */
+   /* We have set up a nice stack frame, just load it into registers. */
+   ld  r3, 40+(1*8)(r1)
+   ld  r4, 40+(2*8)(r1)
+   ld  r5, 40+(3*8)(r1)
+   ld  r6, 40+(4*8)(r1)
+   nop
+   ld  r7, 40+(5*8)(r1)
+   ld  r8, 40+(6*8)(r1)
+   ld  r9, 40+(7*8)(r1)
+   ld  r10,40+(8*8)(r1)
+
+   b   L1
+LFE..1:
+#else /* ! __64BIT__ */
+   
+   .lo

[PATCH,AIX] Enable Stack Unwinding on AIX

2017-05-15 Thread REIX, Tony
Description:
 * This patch enables the stack unwinding on AIX.

Tests:
 * Fedora25/x86_64 + GCC v7.1.0 : Configure/Build: SUCCESS
   - build made by means of a .spec file based on Fedora gcc-7.0.1-0.12 .spec 
file
 ../configure --enable-bootstrap 
--enable-languages=c,c++,objc,obj-c++,fortran,go,lto --prefix=/usr 
--mandir=/usr/share/man --infodir=/usr/share/info 
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared 
--enable-threads=posix --enable-checking=release --enable-multilib 
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions 
--enable-gnu-unique-object --enable-linker-build-id 
--with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin 
--enable-initfini-array --with-isl --enable-libmpx 
--enable-offload-targets=nvptx-none --without-cuda-driver 
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 
--build=x86_64-redhat-linux

ChangeLog:
  * libgcc/config/rs6000/aix-unwind.h : Implements stack unwinding on AIX.

Regards,

Tony Reix
Bull - ATOS
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net

--- ./libgcc/config/rs6000/aix-unwind.h 2017-01-02 01:20:05 -0600
+++ ./libgcc/config/rs6000/aix-unwind.h 2017-04-28 10:03:16 -0500
@@ -64,7 +64,8 @@
 #endif
 
 /* Now on to MD_FALLBACK_FRAME_STATE_FOR.
-   32bit AIX 5.2, 5.3 and 7.1 only at this stage.  */
+   32bit AIX 5.2, 5.3, 6.1, 7.X and
+   64bit AIX 6.1, 7.X only at this stage.  */
 
 #include 
 #include 
@@ -73,10 +74,10 @@
 
 #ifdef __64BIT__
 
-/* 64bit fallback not implemented yet, so MD_FALLBACK_FRAME_STATE_FOR not
-   defined.  Arrange just for the code below to compile.  */
 typedef struct __context64 mstate_t;
 
+#define MD_FALLBACK_FRAME_STATE_FOR ppc_aix_fallback_frame_state
+
 #else
 
 typedef struct mstsave mstate_t;
@@ -128,10 +129,26 @@ ucontext_for (struct _Unwind_Context *co
 {
   const unsigned int * ra = context->ra;
 
-  /* AIX 5.2, 5.3 and 7.1, threaded or not, share common patterns
+  /* AIX 5.2, 5.3, 6.1 and 7.X, threaded or not, share common patterns
  and feature variants depending on the configured kernel (unix_mp
  or unix_64).  */
 
+#ifdef __64BIT__
+  if (*(ra - 5) == 0x4c00012c /* isync */
+  && *(ra - 4) == 0xe8ec0000  /* ld  r7,0(r12) */
+  && *(ra - 3) == 0xe84c0008  /* ld  r2,8(r12) */
+  && *(ra - 2) == 0x7ce903a6  /* mtctr   r7*/
+  && *(ra - 1) == 0x4e800421  /* bctrl */
+  && *(ra - 0) == 0x7de27b78) /* mr  r2,r15   <-- context->ra */
+{
+  /* unix_64 */
+  if (*(ra - 6) == 0x7d000164)  /* mtmsrd  r8 */
+{
+  /* AIX 6.1, 7.1 and 7.2 */
+  return (ucontext_t *)(context->cfa + 0x70);
+}
+}
+#else
   if (*(ra - 5) == 0x4c00012c /* isync */
   && *(ra - 4) == 0x80ec0000  /* lwz r7,0(r12) */
   && *(ra - 3) == 0x804c0004  /* lwz r2,4(r12) */
@@ -152,10 +169,14 @@ ucontext_for (struct _Unwind_Context *co
case 0x835a0570:  /* lwz r26,1392(r26) */
  return (ucontext_t *)(context->cfa + 0x40);
 
- /* AIX 7.1 */
+ /* AIX 6.1 and 7.1 */
case 0x2c1a0000:  /* cmpwi   r26,0 */
  return (ucontext_t *)(context->cfa + 0x40);
-   
+
+ /* AIX 7.2 */
+   case 0x3800000a:  /* li   r0,A */
+ return (ucontext_t *)(context->cfa + 0x40);
+
default:
  return 0;
}
@@ -174,7 +195,7 @@ ucontext_for (struct _Unwind_Context *co
  return &frame->ucontext;
}
 }
-
+#endif
   return 0;
 }
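
The 64-bit detection in the patch above amounts to comparing the instruction words just before the return address against a fixed sequence. A minimal sketch of that comparison follows, assuming the words are available as a plain array. The function name is hypothetical and not part of libgcc, the two nested tests of the patch are merged into one condition, and the `ld r7,0(r12)` encoding is recomputed from its mnemonic rather than copied from the archived diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libgcc.  RA points at the instruction
   the signal handler would return to.  Returns the offset from the CFA at
   which the patch expects the ucontext, or 0 when the pattern is absent.  */
static size_t
aix64_sigtramp_ucontext_offset (const uint32_t *ra)
{
  assert (ra != NULL);

  if (ra[-6] == 0x7d000164      /* mtmsrd r8 (unix_64 kernel) */
      && ra[-5] == 0x4c00012c   /* isync */
      && ra[-4] == 0xe8ec0000   /* ld r7,0(r12), derived from the mnemonic */
      && ra[-3] == 0xe84c0008   /* ld r2,8(r12) */
      && ra[-2] == 0x7ce903a6   /* mtctr r7 */
      && ra[-1] == 0x4e800421   /* bctrl */
      && ra[0] == 0x7de27b78)   /* mr r2,r15 */
    return 0x70;                /* AIX 6.1, 7.1 and 7.2 according to the patch */

  return 0;
}
```

A caller would pass a pointer into the middle of the code stream, so the negative indices stay inside the mapped trampoline; that is an assumption the real unwinder also relies on.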
 


Re: [patch, libfortran] Fix amount of memory allocation for matrix - vector calculation

2017-05-15 Thread H.J. Lu
On Fri, May 12, 2017 at 9:57 AM, Thomas Koenig  wrote:
> Am 12.05.2017 um 10:16 schrieb Janne Blomqvist:
>>
>> On Fri, May 12, 2017 at 1:14 AM, Thomas Koenig 
>> wrote:
>>>
>>> Hello world,
>>>
>>> the memory allocation for the buffer in the library matmul
>>> routines still has one problem: The value of 0xdeadbeef meant
>>> as poison could end up in the calculation of the size of the
>>> buffer for the blocked matmul.
>>>
>>> The attached patch fixes that. Verified with regression-test,
>>> also by running a few select test cases under valgrind.
>>>
>>> No test case because nothing appeared to fail.
>>>
>>> OK for trunk?
>>
>>
>> Patch missing?
>
>
> Well, yes.
>
> Here it is.
>
> Regards
>
> Thomas
>

This fixes:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80765

-- 
H.J.


Re: [PATCH GCC8][29/33]New register pressure estimation

2017-05-15 Thread Bin.Cheng
On Thu, May 11, 2017 at 11:39 AM, Richard Biener
 wrote:
> On Tue, Apr 18, 2017 at 12:53 PM, Bin Cheng  wrote:
>> Hi,
>> Currently IVOPTs shares the same register pressure computation with the
>> RTL loop invariant pass, which doesn't work very well.  This patch
>> introduces a specific interface for IVOPTs.
>> The general idea is described in the cover message as below:
>>   C) Current implementation shares the same register pressure computation
>>      with RTL loop inv pass.  It has difficulty in handling (especially
>>      large) loop nests, and quite often generates too many candidates
>>      (especially for outer loops).  This change introduces a new register
>>      pressure computation.  The brief idea is to differentiate the (hot)
>>      innermost loop and outer loops.  For the (possibly hot) innermost
>>      loop, more registers are allowed as long as the register pressure is
>>      within the number of registers available on the target.
>> It can also help to restrict the number of candidates for outer loops.
>> Is it OK?
>
> +/* Determine if current loop is the innermost loop and maybe hot.  */
> +
> +static void
> +determine_hot_innermost_loop (struct ivopts_data *data)
> +{
> +  data->hot_innermost_loop_p = true;
> +  if (!data->speed)
> +return;
>
> err, so when not optimizing for speed we assume all loops (even not innermost)
> are hot and innermost?!
>
> +  HOST_WIDE_INT niter = avg_loop_niter (loop);
> +  if (niter < PARAM_VALUE (PARAM_AVG_LOOP_NITER)
> +  || loop_constraint_set_p (loop, LOOP_C_PROLOG)
> +  || loop_constraint_set_p (loop, LOOP_C_EPILOG)
> +  || loop_constraint_set_p (loop, LOOP_C_VERSION))
> +data->hot_innermost_loop_p = false;
>
> this needs adjustment for the constraint patch removal.  Also looking at niter
> of the loop in question isn't a good metric for hotness.  data->speed is the
> best guess you get I think (optimize_loop_for_speed_p).
>
>data->speed = optimize_loop_for_speed_p (loop);
> +  determine_hot_innermost_loop (data);
>
>   data->hot_innermost_loop_p = determine_hot_innermost_loop (data);
>
> would be more consistent here.
Hi,
I removed the hot innermost part and here is the updated version.  Is it OK?

Thanks,
bin

2017-05-11  Bin Cheng  

* tree-ssa-loop-ivopts.c (ivopts_estimate_reg_pressure): New
reg_pressure model function.
(ivopts_global_cost_for_size): Delete.
(determine_set_costs, iv_ca_recount_cost): Call new model function
ivopts_estimate_reg_pressure.

>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>> 2017-04-11  Bin Cheng  
>>
>> * tree-ssa-loop-ivopts.c (struct ivopts_data): New field.
>> (ivopts_estimate_reg_pressure): New reg_pressure model function.
>> (ivopts_global_cost_for_size): Delete.
>> (determine_set_costs, iv_ca_recount_cost): Call new model function
>> ivopts_estimate_reg_pressure.
>> (determine_hot_innermost_loop): New.
>> (tree_ssa_iv_optimize_loop): Call above function.
From 3ca5cb6bafb516c68cad9d6fd3adbbe73bec4d19 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Fri, 10 Mar 2017 11:03:16 +
Subject: [PATCH 1/9] ivopt-reg_pressure-model-20170225.txt

---
 gcc/tree-ssa-loop-ivopts.c | 49 --
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 8b228ca..7caed10 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5531,17 +5531,46 @@ determine_iv_costs (struct ivopts_data *data)
 fprintf (dump_file, "\n");
 }
 
-/* Calculates cost for having N_REGS registers.  This number includes
-   induction variables, invariant variables and invariant expressions.  */
+/* Estimate register pressure for loop having N_INVS invariants and N_CANDS
+   induction variables.  Note N_INVS includes both invariant variables and
+   invariant expressions.  */
 
 static unsigned
-ivopts_global_cost_for_size (struct ivopts_data *data, unsigned n_regs)
+ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
+			  unsigned n_cands)
 {
-  unsigned cost = estimate_reg_pressure_cost (n_regs,
-	  data->regs_used, data->speed,
-	  data->body_includes_call);
-  /* Add n_regs to the cost, so that we prefer eliminating ivs if possible.  */
-  return n_regs + cost;
+  unsigned cost;
+  unsigned n_old = data->regs_used, n_new = n_invs + n_cands;
+  unsigned regs_needed = n_new + n_old, available_regs = target_avail_regs;
+  bool speed = data->speed;
+
+  /* If there is a call in the loop body, the call-clobbered registers
+ are not available for loop invariants.  */
+  if (data->body_includes_call)
+available_regs = available_regs - target_clobbered_regs;
+
+  /* If we have enough registers.  */
+  if (regs_needed + target_res_regs < available_regs)
+cost = n_new;
+  /* If close to running out of registers, try to preserve them.  */
+  else if (regs_needed <= available_regs)
+

Re: [PATCH GCC8][30/33]Fold more type conversion into binary arithmetic operations

2017-05-15 Thread Bin.Cheng
On Thu, May 11, 2017 at 11:54 AM, Richard Biener
 wrote:
> On Tue, Apr 18, 2017 at 12:53 PM, Bin Cheng  wrote:
>> Hi,
>> Simplification of (T1)(X *+- CST) is already implemented in
>> aff_combination_expand; this patch moves it to tree_to_aff_combination.
>> It also supports unsigned types if range information allows the
>> transformation, as well as the special case (T1)(X + X).
>> Is it OK?
>
> Can you first please simply move it?
>
> +   /* In case X's type has wrapping overflow behavior, we can still
> +  convert (T1)(X - CST) into (T1)X - (T1)CST if X - CST doesn't
> +  overflow by range information.  Also Convert (T1)(X + CST) as
> +  if it's (T1)(X - (-CST)).  */
> +   if (TYPE_UNSIGNED (itype)
> +   && TYPE_OVERFLOW_WRAPS (itype)
> +   && TREE_CODE (op0) == SSA_NAME
> +   && TREE_CODE (op1) == INTEGER_CST
> +   && (icode == PLUS_EXPR || icode == MINUS_EXPR)
> +   && get_range_info (op0, &minv, &maxv) == VR_RANGE)
> + {
> +   if (icode == PLUS_EXPR)
> + op1 = fold_build1 (NEGATE_EXPR, itype, op1);
>
> Negating -INF will produce -INF(OVF) which we don't want to have in our IL,
> I suggest to use
>
>   op1 = wide_int_to_tree (itype, wi::neg (op1));
>
> instead.
>
> +   if (wi::geu_p (minv, op1))
> + {
> +   op0 = fold_convert (otype, op0);
> +   op1 = fold_convert (otype, op1);
> +   expr = fold_build2 (MINUS_EXPR, otype, op0, op1);
> +   tree_to_aff_combination (expr, type, comb);
> +   return;
> + }
> + }
>
> I think this is similar to a part of what Robin Dapp (sp?) is
> proposing as fix for PR69526?
>
> The same trick should work for (int)((unsigned)X - CST) with different
> overflow checks
> (you need to make sure the resulting expr does not overflow).
Hi,
As suggested, I separated the patch into three.  Other review comments
are also addressed.
I read Robin's PR and patch; I think these are two different issues that
share some aspects.  For example, the overflow checks using range
information are quite the same.  In effect, this should also capture the
result of Robin's patch, because we don't want to fold (T1)(x +- CST) in
general, only here in tree-affine.

Bootstrapped and tested.  Is it OK?
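
The range condition discussed above can be checked with ordinary unsigned arithmetic. The following is a small sketch, assuming a 32-bit inner type and a 64-bit outer type in place of GCC's trees; both function names are made up for illustration. It shows that (T1)(X - CST) equals (T1)X - (T1)CST exactly when X - CST does not wrap, which the minimum of X's range guarantees if it is at least CST.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when converting after the subtraction gives the same value as
   converting both operands first.  */
static bool
widen_sub_distributes (uint32_t x, uint32_t cst)
{
  uint64_t narrow_then_widen = (uint64_t) (uint32_t) (x - cst);
  uint64_t widen_then_sub = (uint64_t) x - (uint64_t) cst;
  return narrow_then_widen == widen_then_sub;
}

/* The test the patch applies to the known range of X: min (X) >= CST.  */
static bool
range_allows_fold (uint32_t min_x, uint32_t cst)
{
  return min_x >= cst;
}
```

When `range_allows_fold` holds for the lower bound of X, every value in the range satisfies `widen_sub_distributes`; when X can be smaller than CST the narrow subtraction wraps and the two forms differ.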

Part1:
2017-04-11  Bin Cheng  

(aff_combination_expand): Move (T1)(X *+- CST) simplification to ...
(tree_to_aff_combination): ... here.

Part2:
2017-04-11  Bin Cheng  

* tree-affine.c (tree_to_aff_combination): Handle (T1)(X + X).

Part3:
2017-04-11  Bin Cheng  

* tree-affine.c (ssa.h): Include header file.
(tree_to_aff_combination): Handle (T1)(X - CST) when inner type
has wrapping overflow behavior.

Thanks,
bin
>
> Richard.
>
>
>> Thanks,
>> bin
>> 2017-04-11  Bin Cheng  
>>
>> * tree-affine.c: Include header file.
>> (aff_combination_expand): Move (T1)(X *+- CST) simplification to ...
>> (tree_to_aff_combination): ... here.  Support (T1)(X + X) case, and
>> unsigned type case if range information allows.
From 16697043ffafdd096dce18f8c9e35c1433d809b0 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Thu, 11 May 2017 14:06:32 +0100
Subject: [PATCH 2/9] move-fold-convert-20170510.txt

---
 gcc/tree-affine.c | 53 +++--
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/gcc/tree-affine.c b/gcc/tree-affine.c
index 13c477d..cbe2bdb 100644
--- a/gcc/tree-affine.c
+++ b/gcc/tree-affine.c
@@ -363,6 +363,33 @@ tree_to_aff_combination (tree expr, tree type, aff_tree *comb)
   aff_combination_add (comb, &tmp);
   return;
 
+CASE_CONVERT:
+  {
+	tree otype = TREE_TYPE (expr);
+	tree inner = TREE_OPERAND (expr, 0);
+	tree itype = TREE_TYPE (inner);
+	enum tree_code icode = TREE_CODE (inner);
+
+	/* In principle this is a valid folding, but it isn't necessarily
+	   an optimization, so do it here and not in fold_unary.  */
+	if ((icode == PLUS_EXPR || icode == MINUS_EXPR || icode == MULT_EXPR)
+	&& TREE_CODE (itype) == INTEGER_TYPE
+	&& TREE_CODE (otype) == INTEGER_TYPE
+	&& TYPE_PRECISION (otype) > TYPE_PRECISION (itype)
+	&& TYPE_OVERFLOW_UNDEFINED (itype)
+	&& TREE_CODE (TREE_OPERAND (inner, 1)) == INTEGER_CST)
+	  {
+	/* Convert (T1)(X *+- CST) into (T1)X *+- (T1)CST if X's type has
+	   undefined overflow behavior.  */
+	tree op0 = fold_convert (otype, TREE_OPERAND (inner, 0));
+	tree op1 = fold_convert (otype, TREE_OPERAND (inner, 1));
+	expr = fold_build2 (icode, otype, op0, op1);
+	tree_to_aff_combination (expr, type, comb);
+	return;
+	  }
+  }
+  break;
+
 default:
   break;
 }
@@ -639,28 +666,10 @@ aff_combination_expand (aff_tree *comb ATTRIBUTE_UNUSED,
 	  exp = XNEW (struct name_expansion);
 	  exp->in_progress = 1

Re: [PATCH GCC8][31/33]Set range information for niter bound of vectorized loop

2017-05-15 Thread Bin.Cheng
On Thu, May 11, 2017 at 12:02 PM, Richard Biener
 wrote:
> On Tue, Apr 18, 2017 at 12:54 PM, Bin Cheng  wrote:
>> Hi,
>> Based on the vect_peeling algorithm, we know for sure that the vectorized
>> loop must iterate at least once.
>> This patch sets range information for the niter bounds of the vectorized
>> loop.  This helps niter analysis, and so iv elimination too.
>> Is it OK?
>
>niters_vector = force_gimple_operand (niters_vector, &stmts, true, 
> var);
>gsi_insert_seq_on_edge_immediate (pe, stmts);
> +  /* Peeling algorithm guarantees that vector loop bound is at least ONE,
> +we set range information to make niters analyzer's life easier.  */
> +  if (TREE_CODE (niters_vector) == SSA_NAME)
> +   set_range_info (niters_vector, VR_RANGE, build_int_cst (type, 1),
> +   fold_build2 (RSHIFT_EXPR, type,
> +TYPE_MAX_VALUE (type), log_vf));
>
> if all of niters_vector folds to an original SSA name then
> niters_vector after gimplification
> is not a new SSA name and thus you can't set range-info on it.
>
> Likewise for the other case where LOOP_VINFO_NITERS is just an SSA name.
Hi,
This is the updated patch.  It checks whether the result SSA name is a
newly created temporary and only sets range information if so.

Is it OK?
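
The range being recorded is [1, TYPE_MAX >> log2 (vf)]. A rough sketch of that arithmetic with `unsigned` standing in for the niters type follows; the struct and function names are hypothetical, and only the non-overflowing case of the iteration count is shown.

```c
#include <assert.h>
#include <limits.h>

struct niters_range { unsigned min; unsigned max; };

/* Hypothetical helper: the range the patch attaches to the vector loop
   bound, given log2 of the vectorization factor.  */
static struct niters_range
vector_niters_range (unsigned log_vf)
{
  assert (log_vf < sizeof (unsigned) * CHAR_BIT);
  struct niters_range r = { 1u, UINT_MAX >> log_vf };
  return r;
}

/* Vector iteration count when NITERS is known not to overflow.  */
static unsigned
vector_niters (unsigned niters, unsigned log_vf)
{
  return niters >> log_vf;
}
```

With a vectorization factor of 4, a scalar count of 17 yields 4 vector iterations, which falls inside the recorded range.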

Thanks,
bin

2017-04-11  Bin Cheng  

* tree-vectorizer.h (vect_build_loop_niters): New parameter.
* tree-vect-loop-manip.c (vect_build_loop_niters): New parameter.
Set true to new parameter if new ssa variable is defined.
(vect_gen_vector_loop_niters): Refactor.  Set range information
for the new vector loop bound variable.
(vect_do_peeling): Ditto.

>
> Richard.
>
>> Thanks,
>> bin
>> 2017-04-11  Bin Cheng  
>>
>> * tree-vect-loop-manip.c (vect_gen_vector_loop_niters): Refactor.
>> Set range information for vector loop bound variable.
>> (vect_do_peeling): Ditto.
From fbf6867a1f8d8fed94ce53861ccefb0d704e9f96 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Thu, 11 May 2017 17:26:57 +0100
Subject: [PATCH 5/9] range_info-for-vect_loop-niters-20170225.txt

---
 gcc/tree-vect-loop-manip.c | 52 +-
 gcc/tree-vectorizer.h  |  2 +-
 2 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index f48336b..52cd2bb 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -1095,10 +1095,11 @@ vect_update_inits_of_drs (loop_vec_info loop_vinfo, tree niters)
 
 
 /* This function builds ni_name = number of iterations.  Statements
-   are emitted on the loop preheader edge.  */
+   are emitted on the loop preheader edge.  If NEW_VAR_P is not NULL, set
+   it to TRUE if new ssa_var is generated.  */
 
 tree
-vect_build_loop_niters (loop_vec_info loop_vinfo)
+vect_build_loop_niters (loop_vec_info loop_vinfo, bool *new_var_p)
 {
   tree ni = unshare_expr (LOOP_VINFO_NITERS (loop_vinfo));
   if (TREE_CODE (ni) == INTEGER_CST)
@@ -1114,6 +1115,10 @@ vect_build_loop_niters (loop_vec_info loop_vinfo)
   if (stmts)
 	gsi_insert_seq_on_edge_immediate (pe, stmts);
 
+  if (new_var_p != NULL)
+	*new_var_p = (TREE_CODE (ni_name) == SSA_NAME
+		  && SSA_NAME_VAR (ni_name) == var);
+
   return ni_name;
 }
 }
@@ -1177,22 +1182,21 @@ vect_gen_vector_loop_niters (loop_vec_info loop_vinfo, tree niters,
 			 tree *niters_vector_ptr, bool niters_no_overflow)
 {
   tree ni_minus_gap, var;
-  tree niters_vector;
+  tree niters_vector, type = TREE_TYPE (niters);
   int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   edge pe = loop_preheader_edge (LOOP_VINFO_LOOP (loop_vinfo));
-  tree log_vf = build_int_cst (TREE_TYPE (niters), exact_log2 (vf));
+  tree log_vf = build_int_cst (type, exact_log2 (vf));
 
   /* If epilogue loop is required because of data accesses with gaps, we
  subtract one iteration from the total number of iterations here for
  correct calculation of RATIO.  */
   if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo))
 {
-  ni_minus_gap = fold_build2 (MINUS_EXPR, TREE_TYPE (niters),
-  niters,
-  build_one_cst (TREE_TYPE (niters)));
+  ni_minus_gap = fold_build2 (MINUS_EXPR, type, niters,
+  build_one_cst (type));
   if (!is_gimple_val (ni_minus_gap))
 	{
-	  var = create_tmp_var (TREE_TYPE (niters), "ni_gap");
+	  var = create_tmp_var (type, "ni_gap");
 	  gimple *stmts = NULL;
 	  ni_minus_gap = force_gimple_operand (ni_minus_gap, &stmts,
 	   true, var);
@@ -1208,25 +1212,29 @@ vect_gen_vector_loop_niters (loop_vec_info loop_vinfo, tree niters,
  (niters - vf) >> log2(vf) + 1 by using the fact that we know ratio
  will be at least one.  */
   if (niters_no_overflow)
-niters_vector = fold_build2 (RSHIFT_EXPR, TREE_TYPE (niters),
- ni_minus_gap, log_vf);
+niters_vector = fold_build2 (RSHIFT_EXPR, type, ni_minus_gap, log_vf);
   else
 niters_vector
-  = fold_build2 (PLUS_EXPR, TREE_TY

Re: [PATCH, rs6000] Fold vector logicals in GIMPLE

2017-05-15 Thread Will Schmidt
On Sat, 2017-05-13 at 18:03 -0700, David Edelsohn wrote:
> On Thu, May 11, 2017 at 3:09 PM, Segher Boessenkool
>  wrote:
> > On Thu, May 11, 2017 at 02:36:26PM -0500, Will Schmidt wrote:
> >> On Thu, 2017-05-11 at 14:15 -0500, Segher Boessenkool wrote:
> >> > Hi!
> >> >
> >> > On Thu, May 11, 2017 at 10:53:33AM -0500, Will Schmidt wrote:
> >> > > Add handling for early expansion of vector logical operations in gimple.
> >> > > Specifically: vec_and, vec_andc, vec_or, vec_xor, vec_orc, vec_nand.
> >> >
> >> > You also handle nor (except in the changelog).  But what about eqv?
> >>
> >> Right, in my excitement I lost my 'vec_nor', that one should be
> >> mentioned as well.
> >>
> >> vec_eqv() I have as a later patch in my series, it will be showing up
> >> once this first bunch are in.
> >
> > Ah cool -- fine with the changelog fix then.  Thanks!
> 
> Will,
> 
> All of the testcases are failing on AIX.  Most are direct fails, but
> some are complaining about implicit declaration of a function.
> 
> I thought that we had determined the correct gcc testsuite target
> selectors.  Something is not correct with the new tests.
> 
> The errors about undeclared function are:
> 
> FAIL: gcc.target/powerpc/fold-vec-div-float.c (test for excess errors)
> 
> Excess errors:
> 
> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-float.c:
> 
> 13:10: warning: implicit declaration of function 'vec_div'; did you
> mean 'vec_dss'? [-Wimplicit-function-declaration]
> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-float.c:
> 13:3: error: AltiVec argument passed to unprototyped function
> 
> and
> 
> FAIL: gcc.target/powerpc/fold-vec-div-floatdouble.c (test for excess errors)
> 
> Excess errors:
> 
> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-floatdouble.c:10:8:
> error: expected '=', ',', ';', 'asm' or '__attribute__' before
> 'double'
> 
> Would you please look into this and fix it?

Yes.  

I've started a checkout on gcc119.  Is that the right environment I
should poke around in, or is there a preferred or recommended
alternative?

Thanks,

-Will

> 
> Thanks, David
> 




[PATCH, GCC/ARM] Fix comment for cmse_nonsecure_call_clear_caller_saved

2017-05-15 Thread Thomas Preudhomme

Hi,

Saving, clearing and restoring of *callee-saved* registers when doing a
cmse_nonsecure_call is done in the __gnu_cmse_nonsecure_call libcall.
Yet, comments for cmse_nonsecure_call_clear_caller_saved claim that
it is this function that does these actions.

This commit fixes the comment to point to the __gnu_cmse_nonsecure_call
libcall instead.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-05-12  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Refer
readers to __gnu_cmse_nonsecure_call libcall for saving, clearing and
restoring of callee-saved registers.

Given that this is just a comment fix, is this ok for trunk and
gcc-7-branch after a one day delay?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3ae999cb723a234332bd1ca2e80b09df240c67f0..a888e706004d4bd17be2dfc41237bbea691dccb0 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16909,9 +16909,10 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
   return not_to_clear_mask;
 }
 
-/* Saves callee saved registers, clears callee saved registers and caller saved
-   registers not used to pass arguments before a cmse_nonsecure_call.  And
-   restores the callee saved registers after.  */
+/* Clears caller saved registers not used to pass arguments before a
+   cmse_nonsecure_call.  Saving, clearing and restoring of callee saved
+   registers is done in __gnu_cmse_nonsecure_call libcall.
+   See libgcc/config/arm/cmse_nonsecure_call.S.  */
 
 static void
 cmse_nonsecure_call_clear_caller_saved (void)


Fix tilepro and tilegx WRT recent changes to extract_bit_field

2017-05-15 Thread Jeff Law


Joseph's tester as well as mine tripped over this nit over the weekend. 
Another argument was recently added to extract_bit_field, but the calls 
in the tilegx and tilepro backends were not fixed.


This was enough to get the ports building again, including target-libgcc 
and glibc.



Installing on the trunk.

Jeff
* config/tilegx/tilegx.c (tilegx_expand_unaligned_load): Add
missing argument to extract_bit_field call.
* config/tilepro/tilepro.c (tilepro_expand_unaligned_load): Likewise.

diff --git a/gcc/config/tilegx/tilegx.c b/gcc/config/tilegx/tilegx.c
index d8ca14b..e070e7e 100644
--- a/gcc/config/tilegx/tilegx.c
+++ b/gcc/config/tilegx/tilegx.c
@@ -1959,7 +1959,7 @@ tilegx_expand_unaligned_load (rtx dest_reg, rtx mem, 
HOST_WIDE_INT bitsize,
extract_bit_field (gen_lowpart (DImode, wide_result),
   bitsize, bit_offset % BITS_PER_UNIT,
   !sign, gen_lowpart (DImode, dest_reg),
-  DImode, DImode, false);
+  DImode, DImode, false, NULL);
 
   if (extracted != dest_reg)
emit_move_insn (dest_reg, gen_lowpart (DImode, extracted));
diff --git a/gcc/config/tilepro/tilepro.c b/gcc/config/tilepro/tilepro.c
index aa1bb1c..81019c1 100644
--- a/gcc/config/tilepro/tilepro.c
+++ b/gcc/config/tilepro/tilepro.c
@@ -1688,7 +1688,7 @@ tilepro_expand_unaligned_load (rtx dest_reg, rtx mem, 
HOST_WIDE_INT bitsize,
extract_bit_field (gen_lowpart (SImode, wide_result),
   bitsize, bit_offset % BITS_PER_UNIT,
   !sign, gen_lowpart (SImode, dest_reg),
-  SImode, SImode, false);
+  SImode, SImode, false, NULL);
 
   if (extracted != dest_reg)
emit_move_insn (dest_reg, gen_lowpart (SImode, extracted));


Re: [patch, libfortran] Fix amount of memory allocation for matrix - vector calculation

2017-05-15 Thread Janne Blomqvist
On Fri, May 12, 2017 at 7:57 PM, Thomas Koenig  wrote:
> Am 12.05.2017 um 10:16 schrieb Janne Blomqvist:
>>
>> On Fri, May 12, 2017 at 1:14 AM, Thomas Koenig 
>> wrote:
>>>
>>> Hello world,
>>>
>>> the memory allocation for the buffer in the library matmul
>>> routines still has one problem: The value of 0xdeadbeef meant
>>> as poison could end up in the calculation of the size of the
>>> buffer for the blocked matmul.
>>>
>>> The attached patch fixes that. Verified with regression-test,
>>> also by running a few select test cases under valgrind.
>>>
>>> No test case because nothing appeared to fail.
>>>
>>> OK for trunk?
>>
>>
>> Patch missing?
>
>
> Well, yes.
>
> Here it is.

Looks good, Ok for trunk.

Also add PR 80765 to the changelog per HJL's message.


-- 
Janne Blomqvist


[PATCH] Do not silently continue if config.{build,host,gcc} fails

2017-05-15 Thread Segher Boessenkool
If config.{build,host,gcc} fails, configure currently silently
continues.  This then makes it much harder than necessary to notice
you made a stupid pasto in config.gcc (and where exactly).

This patch fixes it, by terminating if one of the config.* fails.

Testing in progress (on powerpc64-linux); is this okay for trunk if
it passes?


Segher


2017-05-15  Segher Boessenkool  

* configure.ac: If any of the config.* scripts fail, exit 1.
* configure: Regenerate.

---
 gcc/configure| 6 +++---
 gcc/configure.ac | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index c823ffe..f3c7465 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -11489,15 +11489,15 @@ fi
 # 
 
 # Collect build-machine-specific information.
-. ${srcdir}/config.build
+. ${srcdir}/config.build || exit 1
 
 # Collect host-machine-specific information.
-. ${srcdir}/config.host
+. ${srcdir}/config.host || exit 1
 
 target_gtfiles=
 
 # Collect target-machine-specific information.
-. ${srcdir}/config.gcc
+. ${srcdir}/config.gcc || exit 1
 
 extra_objs="${host_extra_objs} ${extra_objs}"
 extra_gcc_objs="${host_extra_gcc_objs} ${extra_gcc_objs}"
diff --git a/gcc/configure.ac b/gcc/configure.ac
index acfe979..c24e83f 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1442,15 +1442,15 @@ fi
 # 
 
 # Collect build-machine-specific information.
-. ${srcdir}/config.build
+. ${srcdir}/config.build || exit 1
 
 # Collect host-machine-specific information.
-. ${srcdir}/config.host
+. ${srcdir}/config.host || exit 1
 
 target_gtfiles=
 
 # Collect target-machine-specific information.
-. ${srcdir}/config.gcc
+. ${srcdir}/config.gcc || exit 1
 
 extra_objs="${host_extra_objs} ${extra_objs}"
 extra_gcc_objs="${host_extra_gcc_objs} ${extra_gcc_objs}"
-- 
1.9.3



Re: [PATCH] [ARC] Recognise add_n and sub_n in combine again

2017-05-15 Thread Claudiu Zissulescu
Hi Graham,

May I ask if you tested this with gcc's dejagnu?

Thanks,
Claudiu

On Fri, May 12, 2017 at 9:14 PM, Graham Markall
 wrote:
> Since the combine pass canonicalises shift-add insns using plus and
> ashift (as opposed to plus and mult which it previously used to do), it
> no longer creates *add_n or *sub_n insns, as the patterns match plus and
> mult only. The outcome of this is that some opportunities to generate
> add{1,2,3} and sub{1,2,3} instructions are missed.
>
> This change adds additional *add_n and *sub_n insns that match the
> plus-ashift pattern. The original *add_n and *sub_n insns are still left
> in, as they are sometimes generated later on by constant propagation.
> The idea of adding these insns is modelled on the changes in:
>
>   https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01882.html
>
> which addresses a similar issue for the PA target.
>
> For the small test cases that are added, even if the combine pass misses
> the opportunity to generate addN or subN, constant propagation manages
> to do so, so the rtl of the combine pass is checked.
>
> gcc/ChangeLog:
>
> * config/arc/arc.c (arc_print_operand): Handle constant operands.
> (arc_rtx_costs): Add costs for new patterns.
> * config/arc/arc.md: Additional *add_n and *sub_n patterns.
> * config/arc/predicates.md: Add _1_2_3_operand predicate.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/arc/add_n-combine.c: New.
> * gcc.target/arc/sub_n-combine.c: New.
> ---
>  gcc/ChangeLog|  7 
>  gcc/config/arc/arc.c | 20 +---
>  gcc/config/arc/arc.md| 26 +++
>  gcc/config/arc/predicates.md |  5 +++
>  gcc/testsuite/ChangeLog  |  5 +++
>  gcc/testsuite/gcc.target/arc/add_n-combine.c | 48 
> 
>  gcc/testsuite/gcc.target/arc/sub_n-combine.c | 21 
>  7 files changed, 128 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arc/add_n-combine.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/sub_n-combine.c
>
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 91c28e7..42730d5 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -3483,6 +3483,14 @@ arc_print_operand (FILE *file, rtx x, int code)
>
>return;
>
> +case 'c':
> +  if (GET_CODE (x) == CONST_INT)
> +fprintf (file, "%d", INTVAL (x) );
> +  else
> +output_operand_lossage ("invalid operands to %%c code");
> +
> +  return;
> +
>  case 'M':
>if (GET_CODE (x) == CONST_INT)
> fprintf (file, "%d",exact_log2(~INTVAL (x)) );
> @@ -4895,8 +4903,10 @@ arc_rtx_costs (rtx x, machine_mode mode, int 
> outer_code,
> *total = COSTS_N_INSNS (2);
>return false;
>  case PLUS:
> -  if (GET_CODE (XEXP (x, 0)) == MULT
> - && _2_4_8_operand (XEXP (XEXP (x, 0), 1), VOIDmode))
> +  if ((GET_CODE (XEXP (x, 0)) == ASHIFT
> +  && _1_2_3_operand (XEXP (XEXP (x, 0), 1), VOIDmode))
> +  || (GET_CODE (XEXP (x, 0)) == MULT
> +  && _2_4_8_operand (XEXP (XEXP (x, 0), 1), VOIDmode)))
> {
>   *total += (rtx_cost (XEXP (x, 1), mode, PLUS, 0, speed)
>  + rtx_cost (XEXP (XEXP (x, 0), 0), mode, PLUS, 1, 
> speed));
> @@ -4904,8 +4914,10 @@ arc_rtx_costs (rtx x, machine_mode mode, int 
> outer_code,
> }
>return false;
>  case MINUS:
> -  if (GET_CODE (XEXP (x, 1)) == MULT
> - && _2_4_8_operand (XEXP (XEXP (x, 1), 1), VOIDmode))
> +  if ((GET_CODE (XEXP (x, 1)) == ASHIFT
> +  && _1_2_3_operand (XEXP (XEXP (x, 1), 1), VOIDmode))
> +  || (GET_CODE (XEXP (x, 1)) == MULT
> +  && _2_4_8_operand (XEXP (XEXP (x, 1), 1), VOIDmode)))
> {
>   *total += (rtx_cost (XEXP (x, 0), mode, PLUS, 0, speed)
>  + rtx_cost (XEXP (XEXP (x, 1), 0), mode, PLUS, 1, 
> speed));
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index edb983f..ec783a0 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -2995,6 +2995,19 @@
>
>  (define_insn "*add_n"
>[(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,Rcw,W,W,w,w")
> +   (plus:SI (ashift:SI (match_operand:SI 1 "register_operand" 
> "Rcqq,c,c,c,c,c")
> +   (match_operand:SI 2 "_1_2_3_operand" ""))
> +(match_operand:SI 3 "nonmemory_operand" 
> "0,0,c,?Cal,?c,??Cal")))]
> +  ""
> +  "add%c2%? %0,%3,%1%&"
> +  [(set_attr "type" "shift")
> +   (set_attr "length" "*,4,4,8,4,8")
> +   (set_attr "predicable" "yes,yes,no,no,no,no")
> +   (set_attr "cond" "canuse,canuse,nocond,nocond,nocond,nocond")
> +   (set_attr "iscompact" "maybe,false,false,false,false,false")])
> +
> +(define_insn "*add_n"
> +  [(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,Rcw,W,W,w,w")
> (plus:SI (mult:SI (match_ope
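
The two forms the patterns have to match compute the same value: a shift by 1, 2 or 3 is a multiplication by 2, 4 or 8. A minimal sketch of that equivalence follows, with plain integers standing in for RTL operands. The helper names are invented here; only `_1_2_3_operand` and `_2_4_8_operand` come from the patch, and the reading of addN as base + (index << N) is an assumption based on the output template shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Integer stand-ins for the predicates named in the patch.  */
static bool is_1_2_3 (int shift) { return shift >= 1 && shift <= 3; }
static bool is_2_4_8 (int factor) { return factor == 2 || factor == 4 || factor == 8; }

/* plus/ashift form, as combine now canonicalises it.  */
static uint32_t
add_shift (uint32_t base, uint32_t index, int shift)
{
  assert (is_1_2_3 (shift));
  return base + (index << shift);
}

/* plus/mult form, which the original patterns match.  */
static uint32_t
add_mult (uint32_t base, uint32_t index, int factor)
{
  assert (is_2_4_8 (factor));
  return base + index * (uint32_t) factor;
}
```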

Re: [libstdc++] Assertion in optional

2017-05-15 Thread Jonathan Wakely

On 14/05/17 09:19 +0200, Marc Glisse wrote:

> Hello,
>
> this patch adds 2 simple __glibcxx_assert in optional that match the
> precondition in the comment above. I am not sure if there was a reason
> the author wrote that comment instead of the assertion, but constexpr
> use still seems to work.


Yes, in a constexpr context we get the following when it's not
engaged:

In file included from /home/jwakely/gcc/8/include/c++/8.0.0/utility:68:0,
from /home/jwakely/gcc/8/include/c++/8.0.0/optional:36,
from opt.cc:1:
/home/jwakely/gcc/8/include/c++/8.0.0/optional: In function ‘int main()’:
opt.cc:11:22:   in constexpr expansion of ‘f()’
opt.cc:6:11:   in constexpr expansion of ‘o.std::optional::operator*()’
/home/jwakely/gcc/8/include/c++/8.0.0/optional:708:29:   in constexpr expansion of 
‘((std::optional*)this)->std::optional::.std::_Optional_base::_M_get()’
/home/jwakely/gcc/8/include/c++/8.0.0/optional:390:2: error: call to 
non-constexpr function ‘void std::__replacement_assert(const char*, int, const 
char*, const char*)’
 __glibcxx_assert(_M_is_engaged());
 ^

I think that's an improvement over what we have now:

opt.cc: In function ‘int main()’:
opt.cc:11:22:   in constexpr expansion of ‘f()’
opt.cc:11:23: error: accessing ‘std::_Optional_payload_M_payload’ member instead of initialized ‘std::_Optional_payload_M_empty’ member in constant expression
  constexpr int i = f();
  ^


So the patch is OK for trunk, thanks.




Re: [PATCH] [ARC] Recognise add_n and sub_n in combine again

2017-05-15 Thread Graham Markall
Hi Claudiu,

I ran the gcc testsuite with EZsim for NPS-400:

$ ./EZsim_linux_x86_64 --version
NPS-400 EZsim  - Version 1.9a ( 35b02d7, Nov  3 2015, 20:14:04 )

both with and without the patch, and it did not introduce any new failures.


Best regards,
Graham.

On 15/05/17 17:48, Claudiu Zissulescu wrote:
> Hi Graham,
> 
> May I ask if you tested this with gcc's dejagnu?
> 
> Thanks,
> Claudiu
> 
> On Fri, May 12, 2017 at 9:14 PM, Graham Markall
>  wrote:
>> Since the combine pass canonicalises shift-add insns using plus and
>> ashift (as opposed to plus and mult which it previously used to do), it
>> no longer creates *add_n or *sub_n insns, as the patterns match plus and
>> mult only. The outcome of this is that some opportunities to generate
>> add{1,2,3} and sub{1,2,3} instructions are missed.
>>
>> This change adds additional *add_n and *sub_n insns that match the
>> plus-ashift pattern. The original *add_n and *sub_n insns are still left
>> in, as they are sometimes generated later on by constant propagation.
>> The idea of adding these insns is modelled on the changes in:
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01882.html
>>
>> which addresses a similar issue for the PA target.
>>
>> For the small test cases that are added, even if the combine pass misses
>> the opportunity to generate addN or subN, constant propagation manages
>> to do so, so the rtl of the combine pass is checked.
>>
>> gcc/ChangeLog:
>>
>> * config/arc/arc.c (arc_print_operand): Handle constant operands.
>> (arc_rtx_costs): Add costs for new patterns.
>> * config/arc/arc.md: Additional *add_n and *sub_n patterns.
>> * config/arc/predicates.md: Add _1_2_3_operand predicate.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/arc/add_n-combine.c: New.
>> * gcc.target/arc/sub_n-combine.c: New.
>> ---
>>  gcc/ChangeLog|  7 
>>  gcc/config/arc/arc.c | 20 +---
>>  gcc/config/arc/arc.md| 26 +++
>>  gcc/config/arc/predicates.md |  5 +++
>>  gcc/testsuite/ChangeLog  |  5 +++
>>  gcc/testsuite/gcc.target/arc/add_n-combine.c | 48 
>> 
>>  gcc/testsuite/gcc.target/arc/sub_n-combine.c | 21 
>>  7 files changed, 128 insertions(+), 4 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/arc/add_n-combine.c
>>  create mode 100644 gcc/testsuite/gcc.target/arc/sub_n-combine.c
>>
>> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
>> index 91c28e7..42730d5 100644
>> --- a/gcc/config/arc/arc.c
>> +++ b/gcc/config/arc/arc.c
>> @@ -3483,6 +3483,14 @@ arc_print_operand (FILE *file, rtx x, int code)
>>
>>return;
>>
>> +case 'c':
>> +  if (GET_CODE (x) == CONST_INT)
>> +fprintf (file, "%d", INTVAL (x) );
>> +  else
>> +output_operand_lossage ("invalid operands to %%c code");
>> +
>> +  return;
>> +
>>  case 'M':
>>if (GET_CODE (x) == CONST_INT)
>> fprintf (file, "%d",exact_log2(~INTVAL (x)) );
>> @@ -4895,8 +4903,10 @@ arc_rtx_costs (rtx x, machine_mode mode, int 
>> outer_code,
>> *total = COSTS_N_INSNS (2);
>>return false;
>>  case PLUS:
>> -  if (GET_CODE (XEXP (x, 0)) == MULT
>> - && _2_4_8_operand (XEXP (XEXP (x, 0), 1), VOIDmode))
>> +  if ((GET_CODE (XEXP (x, 0)) == ASHIFT
>> +  && _1_2_3_operand (XEXP (XEXP (x, 0), 1), VOIDmode))
>> +  || (GET_CODE (XEXP (x, 0)) == MULT
>> +  && _2_4_8_operand (XEXP (XEXP (x, 0), 1), VOIDmode)))
>> {
>>   *total += (rtx_cost (XEXP (x, 1), mode, PLUS, 0, speed)
>>  + rtx_cost (XEXP (XEXP (x, 0), 0), mode, PLUS, 1, 
>> speed));
>> @@ -4904,8 +4914,10 @@ arc_rtx_costs (rtx x, machine_mode mode, int 
>> outer_code,
>> }
>>return false;
>>  case MINUS:
>> -  if (GET_CODE (XEXP (x, 1)) == MULT
>> - && _2_4_8_operand (XEXP (XEXP (x, 1), 1), VOIDmode))
>> +  if ((GET_CODE (XEXP (x, 1)) == ASHIFT
>> +  && _1_2_3_operand (XEXP (XEXP (x, 1), 1), VOIDmode))
>> +  || (GET_CODE (XEXP (x, 1)) == MULT
>> +  && _2_4_8_operand (XEXP (XEXP (x, 1), 1), VOIDmode)))
>> {
>>   *total += (rtx_cost (XEXP (x, 0), mode, PLUS, 0, speed)
>>  + rtx_cost (XEXP (XEXP (x, 1), 0), mode, PLUS, 1, 
>> speed));
>> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
>> index edb983f..ec783a0 100644
>> --- a/gcc/config/arc/arc.md
>> +++ b/gcc/config/arc/arc.md
>> @@ -2995,6 +2995,19 @@
>>
>>  (define_insn "*add_n"
>>[(set (match_operand:SI 0 "dest_reg_operand" "=Rcqq,Rcw,W,W,w,w")
>> +   (plus:SI (ashift:SI (match_operand:SI 1 "register_operand" 
>> "Rcqq,c,c,c,c,c")
>> +   (match_operand:SI 2 "_1_2_3_operand" ""))
>> +(match_operand:SI 3 "nonmemory_operand" 
>> "0,0,c,?Cal,?c,??Cal")))]
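
As a side note on why both forms have to be matched: a multiplication by 2, 4 or 8 and a left shift by 1, 2 or 3 compute the same value, so combine's switch to the ashift form changes only the shape of the RTL, not the arithmetic. Below is a minimal sketch in plain C++ (not RTL). The function names are made up for illustration and merely echo the _2_4_8_operand and _1_2_3_operand predicates from the patch.

```cpp
#include <cstdint>

// Hypothetical mirrors of the two predicates named in the patch.
constexpr bool is_2_4_8(std::uint32_t factor)
{
  return factor == 2 || factor == 4 || factor == 8;
}

constexpr bool is_1_2_3(std::uint32_t shift)
{
  return shift >= 1 && shift <= 3;
}

// Older canonical form: plus (mult (b, 2/4/8), a).
constexpr std::uint32_t add_n_mult(std::uint32_t a, std::uint32_t b,
                                   std::uint32_t factor)
{
  return a + b * factor;
}

// Current canonical form: plus (ashift (b, 1/2/3), a).
constexpr std::uint32_t add_n_shift(std::uint32_t a, std::uint32_t b,
                                    std::uint32_t shift)
{
  return a + (b << shift);
}

static_assert(is_2_4_8(8) && is_1_2_3(3), "8 == 1 << 3");
static_assert(add_n_mult(5, 7, 8) == add_n_shift(5, 7, 3),
              "same value in either form");
```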

Fix minor reorg.c bug affecting MIPS targets

2017-05-15 Thread Jeff Law



There's a subtle bug in reorg.c's relax_delay_slots that my tester 
tripped this weekend.  Not sure what changed code-generation-wise, as the 
affected port built just fine last week.  But it is what it is.




Assume before this code we've set TARGET_LABEL to the code_label 
associated with DELAY_JUMP_INSN (which is what we want)...




 /* If the first insn at TARGET_LABEL is redundant with a previous
 insn, redirect the jump to the following insn and process again.
 We use next_real_insn instead of next_active_insn so we
 don't skip USE-markers, or we'll end up with incorrect
 liveness info.  */

[ ... ]

 /* Similarly, if it is an unconditional jump with one insn in its
 delay list and that insn is redundant, thread the jump.  */
  rtx_sequence *trial_seq =
trial ? dyn_cast <rtx_sequence *> (PATTERN (trial)) : NULL;
  if (trial_seq
  && trial_seq->len () == 2
  && JUMP_P (trial_seq->insn (0))
  && simplejump_or_return_p (trial_seq->insn (0))
  && redundant_insn (trial_seq->insn (1), insn, vNULL))
{
  target_label = JUMP_LABEL (trial_seq->insn (0));
  if (ANY_RETURN_P (target_label))
target_label = find_end_label (target_label);

  if (target_label
  && redirect_with_delay_slots_safe_p (delay_jump_insn,
   target_label, insn))
{
  update_block (trial_seq->insn (1), insn);
  reorg_redirect_jump (delay_jump_insn, target_label);
  next = insn;
  continue;
}
}

  /* See if we have a simple (conditional) jump that is useless.  */
  if (! INSN_ANNULLED_BRANCH_P (delay_jump_insn)
  && ! condjump_in_parallel_p (delay_jump_insn)
  && prev_active_insn (as_a <rtx_insn *> (target_label)) == insn

Now assume that we get into the TRUE arm of that first conditional which 
sets a new value for TARGET_LABEL.  Normally when this happens in 
relax_delay_slots we're going to unconditionally continue.  But as we 
can see there's an inner conditional and if we don't get into its true 
arm, then we'll pop out a nesting level and execute the second outer IF.


That second outer IF assumes that TARGET_LABEL still points to the code 
label associated with DELAY_JUMP_INSN.  Oops.  In my particular case it 
was NULL and caused an ICE.  But I could probably construct a case where 
it pointed to a real label and could result in incorrect code generation.


The fix is pretty simple.   Just creating a new variable (to avoid 
-Wshadow) inside that first outer IF is sufficient.
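
The shape of the problem can be reduced to a few lines. This is only a sketch under stated assumptions, not the reorg.c code: plain ints stand in for labels (0 playing the role of NULL), kContinued models the "continue", and every function and parameter name is hypothetical.

```cpp
// Sentinel meaning "the loop continued, so the second IF never ran".
const int kContinued = -1;

// Buggy shape: the tentative label is written into the variable that the
// second outer IF still relies on.  The return value is the label that
// the second IF would end up looking at.
inline int label_seen_by_second_if_buggy(int target_label, bool first_if_taken,
                                         int new_label, bool redirect_is_safe)
{
  if (first_if_taken)
    {
      target_label = new_label;              // may be 0 (NULL)
      if (target_label != 0 && redirect_is_safe)
        return kContinued;
    }
  return target_label;                       // possibly clobbered
}

// Fixed shape: the tentative label lives in its own local variable.
inline int label_seen_by_second_if_fixed(int target_label, bool first_if_taken,
                                         int new_label, bool redirect_is_safe)
{
  if (first_if_taken)
    {
      const int temp_label = new_label;
      if (temp_label != 0 && redirect_is_safe)
        return kContinued;
    }
  return target_label;                       // always the original label
}
```

With an original label of 10 and a NULL replacement, the buggy variant hands 0 to the second IF (the ICE case), while the fixed variant still hands it 10.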


Tested by building the affected target (mipsisa64r2el-elf) through newlib.

Installed on the trunk.

jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8cceb247a85..18b6ed59c73 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2017-05-15  Jeff Law  
 
+   * reorg.c (relax_delay_slots): Create a new variable to hold
+   the temporary target rather than clobbering TARGET_LABEL.
+
* config/tilegx/tilegx.c (tilegx_expand_unaligned_load): Add
missing argument to extract_bit_field call.
* config/tilepro/tilepro.c (tilepro_expand_unaligned_load): Likewise.
diff --git a/gcc/reorg.c b/gcc/reorg.c
index 85ef7d6880c..1a6fd86e286 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -3351,16 +3351,16 @@ relax_delay_slots (rtx_insn *first)
  && simplejump_or_return_p (trial_seq->insn (0))
  && redundant_insn (trial_seq->insn (1), insn, vNULL))
{
- target_label = JUMP_LABEL (trial_seq->insn (0));
- if (ANY_RETURN_P (target_label))
-   target_label = find_end_label (target_label);
+ rtx temp_label = JUMP_LABEL (trial_seq->insn (0));
+ if (ANY_RETURN_P (temp_label))
+   temp_label = find_end_label (temp_label);
  
- if (target_label
+ if (temp_label
  && redirect_with_delay_slots_safe_p (delay_jump_insn,
-  target_label, insn))
+  temp_label, insn))
{
  update_block (trial_seq->insn (1), insn);
- reorg_redirect_jump (delay_jump_insn, target_label);
+ reorg_redirect_jump (delay_jump_insn, temp_label);
  next = insn;
  continue;
}
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f198fc6b42b..9fb8c8598b8 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2017-05-15  Jeff Law  
+
+   * gcc.target/mips/reorgbug-1.c: New test.
+
 2017-05-15  Pierre-Marie de Rodat  
 
* gnat.dg/specs/pack13.ads: New test.
diff --git a/gcc/testsuite/gcc.target/mips/reorgbug-1.c 
b/gcc/testsuite/gcc.target/mips/reorgbug-1.c
new file mode 100644
index 000..b820a2b5df1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/reorgbug-1.c
@@ -0,0

Default std::vector default and move constructor

2017-05-15 Thread François Dumont

Hi

Following what I have started on RbTree, here is a patch that defaults the 
implementation of the default and move constructors of std::vector<bool>.


As in _Rb_tree_impl, the default allocator is not value-initialized 
anymore. We could add a small helper type around the allocator to do 
this value initialization by default. Should I do so?


I also added some optimizations, in particular the replacement of 
std::fill with calls to __builtin_memset. Has anyone ever proposed to 
optimize std::fill in such a way? It would require a test on the value 
used to fill the range, but it might be worth this additional runtime check, 
no?
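
To make the question about a runtime check concrete, here is a rough sketch of what such a test could look like for a range of words. It is an illustration only, not the patch and not a proposal for std::fill itself: fill_words and all_bytes_equal are hypothetical names, and std::memset is used in place of __builtin_memset to keep the sketch portable.

```cpp
#include <climits>
#include <cstddef>
#include <cstring>

// True when every byte of v has the same value, i.e. when a byte-wise
// memset can reproduce v exactly.
inline bool all_bytes_equal(unsigned long v)
{
  const unsigned char first = static_cast<unsigned char>(v);
  for (std::size_t i = 1; i < sizeof v; ++i)
    if (static_cast<unsigned char>(v >> (i * CHAR_BIT)) != first)
      return false;
  return true;
}

// Fill [first, last) with value, taking the memset path when possible.
inline void fill_words(unsigned long* first, unsigned long* last,
                       unsigned long value)
{
  if (all_bytes_equal(value))
    std::memset(first, static_cast<unsigned char>(value),
                static_cast<std::size_t>(last - first) * sizeof *first);
  else
    for (; first != last; ++first)
      *first = value;
}
```

In the vector<bool> case the fill word is always all-zeros or all-ones, so the check is trivially true there and the patch can call memset unconditionally.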


* include/bits/stl_bvector.h (_Bvector_impl_data): New.
(_Bvector_impl): Inherits from latter.
(_Bvector_impl(_Bit_alloc_type&&)): Delete.
(_Bvector_impl(_Bvector_impl&&)): New, default.
(_Bvector_base()): Default.
(_Bvector_base(_Bvector_base&&)): Default.
(_Bvector_base::_M_move_data(_Bvector_base&&)): New.
(vector(vector&&, const allocator_type&)): Use latter.
(vector::operator=(vector&&)): Likewise.
(vector::vector()): Default.
(vector::assign(_InputIterator, _InputIterator)): Use
_M_assign_aux.
(vector::assign(initializer_list)): Likewise.
(vector::_M_initialize_value(bool)): New.
(vector(size_type, const bool&, const allocator_type&)): Use
latter.
(vector::_M_initialize_dispatch(_Integer, _Integer, __true_type)):
Likewise.
(vector::_M_fill_assign(size_t, bool)): Likewise.

Tested under Linux x86_64 normal mode, with and without versioned 
namespace.


Ok to commit ?

François

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 37e000a..a6afced 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -399,8 +399,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   {
 if (__first._M_p != __last._M_p)
   {
-	std::fill(__first._M_p + 1, __last._M_p, __x ? ~0 : 0);
-	__fill_bvector(__first, _Bit_iterator(__first._M_p + 1, 0), __x);
+	_Bit_type *__first_p = __first._M_p;
+	if (__first._M_offset != 0)
+	  __fill_bvector(__first, _Bit_iterator(++__first_p, 0), __x);
+
+	__builtin_memset(__first_p, __x ? ~0 : 0,
+			 (__last._M_p - __first_p) * sizeof(_Bit_type));
+
+	if (__last._M_offset != 0)
 	  __fill_bvector(_Bit_iterator(__last._M_p, 0), __last, __x);
   }
 else
@@ -416,33 +422,66 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	_Bit_alloc_traits;
   typedef typename _Bit_alloc_traits::pointer _Bit_pointer;
 
-  struct _Bvector_impl
-  : public _Bit_alloc_type
+  struct _Bvector_impl_data
   {
 	_Bit_iterator 	_M_start;
 	_Bit_iterator 	_M_finish;
 	_Bit_pointer 	_M_end_of_storage;
 
+	_Bvector_impl_data() _GLIBCXX_NOEXCEPT
+	: _M_start(), _M_finish(), _M_end_of_storage()
+	{ }
+
+#if __cplusplus >= 201103L
+	_Bvector_impl_data(_Bvector_impl_data&& __x) noexcept
+	: _M_start(__x._M_start), _M_finish(__x._M_finish)
+	, _M_end_of_storage(__x._M_end_of_storage)
+	{ __x._M_reset(); }
+
+	void
+	_M_move_data(_Bvector_impl_data&& __x) noexcept
+	{
+	  this->_M_start = __x._M_start;
+	  this->_M_finish = __x._M_finish;
+	  this->_M_end_of_storage = __x._M_end_of_storage;
+	  __x._M_reset();
+	}
+
+	void
+	_M_reset() noexcept
+	{
+	  this->_M_start = _Bit_iterator();
+	  this->_M_finish = _Bit_iterator();
+	  this->_M_end_of_storage = nullptr;
+	}
+#endif
+  };
+
+  struct _Bvector_impl
+	: public _Bit_alloc_type, public _Bvector_impl_data
+  {
+  public:
+#if __cplusplus >= 201103L
+	_Bvector_impl() = default;
+#else
 	_Bvector_impl()
-	: _Bit_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage()
+	: _Bit_alloc_type()
 	{ }
+#endif
  
 	_Bvector_impl(const _Bit_alloc_type& __a)
-	: _Bit_alloc_type(__a), _M_start(), _M_finish(), _M_end_of_storage()
+	: _Bit_alloc_type(__a)
 	{ }
 
 #if __cplusplus >= 201103L
-	_Bvector_impl(_Bit_alloc_type&& __a)
-	: _Bit_alloc_type(std::move(__a)), _M_start(), _M_finish(),
-	  _M_end_of_storage()
-	{ }
+	_Bvector_impl(_Bvector_impl&&) = default;
 #endif
 
 	_Bit_type*
 	_M_end_addr() const _GLIBCXX_NOEXCEPT
 	{
-	  if (_M_end_of_storage)
-	return std::__addressof(_M_end_of_storage[-1]) + 1;
+	  if (this->_M_end_of_storage)
+	return std::__addressof(this->_M_end_of_storage[-1]) + 1;
 	  return 0;
 	}
   };
@@ -462,23 +501,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   get_allocator() const _GLIBCXX_NOEXCEPT
   { return allocator_type(_M_get_Bit_allocator()); }
 
+#if __cplusplus >= 201103L
+  _Bvector_base() = default;
+#else
   _Bvector_base()
   : _M_impl() { }
+#endif
   
   _Bvector_base(const allocator_type& __a)
   : _M_impl(__a) { }
 
 #if __cplusplus >= 201103L
-  _Bvector_base(_Bvector_base&& __x) noexcept
-  : _M_impl(std::move(__x._M_get_Bit_allocator()))
-  {
-	this->_M_impl._M_start = __x._M_impl._M_start;
-	this->_M_impl._M_finish = __x._M_impl._M_finish;
-	this->_M_impl._M_end_o

Re: [patch] build xz (instead of bz2) compressed tarballs and diffs

2017-05-15 Thread Jakub Jelinek
On Mon, May 15, 2017 at 04:13:44PM +0200, Markus Trippelsdorf wrote:
> On 2017.05.15 at 14:02 +, Joseph Myers wrote:
> > The xz manpage warns against blindly using -9 (for which --best is a 
> > deprecated alias) because of the implications for memory requirements for 
> > decompressing.  If there's a reason it's considered appropriate here, I 
> > think it needs an explanatory comment.
> 
> I think it is unacceptable, because it would increase memory usage when
> decompressing over 20x compared to bz2 (and over 100x while compressing).

The memory used during compression isn't that interesting as long as it
isn't prohibitive for sourceware or the machines RMs use.
For decompression, I guess what matters is the memory actually needed
to decompress the -9 gcc tarball, compared to the minimal memory
requirements to compile (not bootstrap) the compiler using typical system
compilers.  If compilation of gcc takes more memory than the decompression,
then it should be fine; why would anyone decompress gcc and not build
it afterwards?

Jakub


Re: [PATCH] Two DW_OP_GNU_variable_value fixes / tweaks

2017-05-15 Thread Jakub Jelinek
On Mon, May 15, 2017 at 02:43:43PM +0200, Richard Biener wrote:
> 
> While bringing early LTO debug up-to-speed I noticed the following two
> issues.  The first patch avoids useless work in 
> note_variable_value_in_expr and actually makes use of the DIE we
> eventually create in resolve_variable_value_in_expr.
> 
> The second avoids generating a DW_OP_GNU_variable_value for a decl
> we'll never have a DIE for -- this might be less obvious than the
> other patch but I think we'll also never have any other debug info
> we can resolve the decl with(?)
> 
> Bootstrapped and tested the first patch so far
> on x86_64-unknown-linux-gnu, 2nd is pending.
> 
> Ok?
> 
> Thanks,
> Richard.
> 
> 2017-05-15  Richard Biener  
> 
>   * dwarf2out.c (resolve_variable_value_in_expr): Lookup DIE
>   just generated.
>   (note_variable_value_in_expr): If we resolved the decl ref
>   do not push to the stack.

LGTM.

> 2017-05-15  Richard Biener  
> 
>   * dwarf2out.c (loc_list_from_tree_1): Do not create
>   DW_OP_GNU_variable_value for DECL_IGNORED_P decls.

Can you verify that, e.g. in a bootstrapped cc1plus, it doesn't change the
debug info (except for dwarf2out.o itself)?
It looks reasonable and it would surprise me if it did, just want to be sure.

Jakub


Re: [patch] build xz (instead of bz2) compressed tarballs and diffs

2017-05-15 Thread Markus Trippelsdorf
On 2017.05.15 at 16:24 +0200, Jakub Jelinek wrote:
> On Mon, May 15, 2017 at 04:13:44PM +0200, Markus Trippelsdorf wrote:
> > On 2017.05.15 at 14:02 +, Joseph Myers wrote:
> > > The xz manpage warns against blindly using -9 (for which --best is a 
> > > deprecated alias) because of the implications for memory requirements for 
> > > decompressing.  If there's a reason it's considered appropriate here, I 
> > > think it needs an explanatory comment.
> > 
> > I think it is unacceptable, because it would increase memory usage when
> > decompressing over 20x compared to bz2 (and over 100x while compressing).
> 
> The memory used during compression isn't that interesting as long as it
> isn't prohibitive for sourceware or the machines RMs use.
> For decompression, I guess what matters is the memory actually needed
> to decompress the -9 gcc tarball, compared to the minimal memory
> requirements to compile (not bootstrap) the compiler using typical system
> compilers.  If compilation of gcc takes more memory than the decompression,
> then it should be fine; why would anyone decompress gcc and not build
> it afterwards?

Ok, it doesn't really matter. With gcc-7.1 tarball:

size: 533084160 (uncompressed)

-9:
 xz -d gcc.tar.xz
4.71user 0.26system 0:04.97elapsed 100%CPU (0avgtext+0avgdata 67804maxresident)k
size: 60806928

-6 (default):
 xz -d gcc.tar.xz
4.88user 0.28system 0:05.17elapsed 99%CPU (0avgtext+0avgdata 10324maxresident)k
size: 65059664

So -9 is actually just fine.
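
For reference, the ratios implied by those two runs can be worked out directly. The constants below are copied from the measurements above. The helper names are made up for this sketch, and it assumes the maxresident figures are in KiB, as GNU time reports them.

```cpp
// Peak resident set size while decompressing, in KiB.
constexpr double rss_kib_level9 = 67804.0;
constexpr double rss_kib_level6 = 10324.0;

// Compressed tarball sizes in bytes.
constexpr double size_level9 = 60806928.0;
constexpr double size_level6 = 65059664.0;

// How much more memory decompressing the -9 tarball needs than the -6 one.
constexpr double memory_ratio()
{
  return rss_kib_level9 / rss_kib_level6;
}

// Fraction of the -6 tarball size saved by using -9.
constexpr double size_saving()
{
  return 1.0 - size_level9 / size_level6;
}
```

That works out to roughly 6.6 times the peak memory (about 66 MiB against about 10 MiB) for a tarball that is roughly 6.5% smaller, with essentially the same decompression time.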

-- 
Markus


[PATCH, i386]: Fix PR 80425, Extra inter-unit register move with zero-extension

2017-05-15 Thread Uros Bizjak
Hello!

Attached patch introduces peephole2 pattern to avoid intermediate
DImode register in interunit zero-extend sequence.

However, it looks like there is still a slight problem with RA. Without
-mtune=intel, we have direct GR->XMM interunit moves disabled, but
pr80425-2.c testcase compiles to:

movl    a(%rip), %eax
movq    %rax, -56(%rbp)
vmovq   -56(%rbp), %xmm1

The compiler could emit a direct mem->XMM zero-extending move, without
an intermediate stack slot.

2017-05-15  Uros Bizjak  

* config/i386/i386.md (*zero_extendsidi2): Do not penalize
non-interunit SSE move alternatives with '?'.
(zero-extendsidi peephole2): New peephole to skip intermediate
general register in SSE zero-extend sequence.

testsuite/ChangeLog:

2017-05-15  Uros Bizjak  

* gcc.target/i386/pr80425-1.c: New test.
* gcc.target/i386/pr80425-2.c: Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 248065)
+++ config/i386/i386.md (working copy)
@@ -3762,10 +3762,10 @@
 
 (define_insn "*zero_extendsidi2"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,?*x,?*x,?*v,*r")
+   "=r,?r,?o,r   ,o,?*Ym,?!*y,?r ,?r,?*Yi,*x,*x,*v,*r")
(zero_extend:DI
 (match_operand:SI 1 "x86_64_zext_operand"
-   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m  , *x, *v,*k")))]
+   "0 ,rm,r ,rmWz,0,r   ,m   ,*Yj,*x,r   ,m ,*x,*v,*k")))]
   ""
 {
   switch (get_attr_type (insn))
@@ -3885,6 +3885,15 @@
(set (match_dup 4) (const_int 0))]
   "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
 
+(define_peephole2
+  [(set (match_operand:DI 0 "general_reg_operand")
+   (zero_extend:DI (match_operand:SI 1 "nonimmediate_gr_operand")))
+   (set (match_operand:DI 2 "sse_reg_operand") (match_dup 0))]
+  "TARGET_64BIT && TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_TO_VEC
+   && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 2)
+   (zero_extend:DI (match_dup 1)))])
+
 (define_mode_attr kmov_isa
   [(QI "avx512dq") (HI "avx512f") (SI "avx512bw") (DI "avx512bw")])
 
Index: testsuite/gcc.target/i386/pr80425-1.c
===
--- testsuite/gcc.target/i386/pr80425-1.c   (nonexistent)
+++ testsuite/gcc.target/i386/pr80425-1.c   (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mtune=intel" } */
+
+#include 
+
+__m512i
+f1 (__m512i x, int a)
+{
+  return _mm512_srai_epi32 (x, a);
+}
+
+/* { dg-final { scan-assembler-times "movd\[ \\t\]+\[^\n\]*%xmm" 1 } } */
Index: testsuite/gcc.target/i386/pr80425-2.c
===
--- testsuite/gcc.target/i386/pr80425-2.c   (nonexistent)
+++ testsuite/gcc.target/i386/pr80425-2.c   (working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mtune=intel" } */
+
+#include 
+
+extern int a;
+
+__m512i
+f1 (__m512i x)
+{
+  return _mm512_srai_epi32 (x, a);
+}
+
+/* { dg-final { scan-assembler-times "movd\[ \\t\]+\[^\n\]*%xmm" 1 } } */


Re: dejagnu version update?

2017-05-15 Thread Mike Stump
On May 15, 2017, at 1:06 AM, Richard Biener  wrote:
> 
> Both SLE-11 and SLE-12 use dejagnu 1.4.4, so does openSUSE Leap 42.[12].
> Tumbleweed uses 1.6 so new SLE will inherit that.  But I still do all
> of my testing on systems with just dejagnu 1.4.4.

So dejagnu is independent of most things and downloads and installs in seconds, 
upgrading it shouldn't pose a problem for anyone that can build gcc.

That said, it's a little surprising that SLE is lagging everyone else so hard.  
Looking at the 42.2 EOL plans, that would put switching dejagnu versions at 
around 13 months from now, if we waited.

So, how much would you mind if trunk required a newer dejagnu?  If just a 
little, I'm inclined to not wait and support updating now.  If please god no, 
then I don't see the harm in waiting 13 months.  Leap 42.3 is out in 3 months, 
so the sooner update time would be just 3 months.  Could you jump to Leap 42.3 
at that time?



Re: [PATCH, rs6000] Fold vector logicals in GIMPLE

2017-05-15 Thread David Edelsohn
On Mon, May 15, 2017 at 9:09 AM, Will Schmidt  wrote:
> On Sat, 2017-05-13 at 18:03 -0700, David Edelsohn wrote:
>> On Thu, May 11, 2017 at 3:09 PM, Segher Boessenkool
>>  wrote:
>> > On Thu, May 11, 2017 at 02:36:26PM -0500, Will Schmidt wrote:
>> >> On Thu, 2017-05-11 at 14:15 -0500, Segher Boessenkool wrote:
>> >> > Hi!
>> >> >
>> >> > On Thu, May 11, 2017 at 10:53:33AM -0500, Will Schmidt wrote:
>> >> > > Add handling for early expansion of vector logical operations in
>> >> > > gimple.
>> >> > > Specifically: vec_and, vec_andc, vec_or, vec_xor, vec_orc, vec_nand.
>> >> >
>> >> > You also handle nor (except in the changelog).  But what about eqv?
>> >>
>> >> Right, in my excitement I lost my 'vec_nor', that one should be
>> >> mentioned as well.
>> >>
>> >> vec_eqv() I have as a later patch in my series, it will be showing up
>> >> once this first bunch are in.
>> >
>> > Ah cool -- fine with the changelog fix then.  Thanks!
>>
>> Will,
>>
>> All of the testcases are failing on AIX.  Most are direct fails, but
>> some are complaining about implicit declaration of a function.
>>
>> I thought that we had determined the correct gcc testsuite target
>> selectors.  Something is not correct with the new tests.
>>
>> The errors about undeclared function are:
>>
>> FAIL: gcc.target/powerpc/fold-vec-div-float.c (test for excess errors)
>>
>> Excess errors:
>>
>> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-float.c:13:10:
>> warning: implicit declaration of function 'vec_div'; did you
>> mean 'vec_dss'? [-Wimplicit-function-declaration]
>> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-float.c:13:3:
>> error: AltiVec argument passed to unprototyped function
>>
>> and
>>
>> FAIL: gcc.target/powerpc/fold-vec-div-floatdouble.c (test for excess errors)
>>
>> Excess errors:
>>
>> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-floatdouble.c:10:8:
>> error: expected '=', ',', ';', 'asm' or '__attribute__' before
>> 'double'
>>
>> Would you please look into this and fix it?
>
> Yes.
>
> I've started a checkout on gcc119.  Is that the right environment I
> should poke around in, or is there a preferred or recommended
> alternative?

That is the right environment, but it will take a long time.  You
should be able to reason about it directly by looking at the failing
line.

Thanks, David


Re: Default std::vector default and move constructor

2017-05-15 Thread Marc Glisse

On Mon, 15 May 2017, François Dumont wrote:

> I also added some optimizations. Especially replacement of std::fill with
> calls to __builtin_memset. Has anyone ever proposed to optimize std::fill in
> such a way ? It would require a test on the value used to fill the range but
> it might be worth this additional runtime check, no ?


Note that with -O3, gcc recognizes the pattern in std::fill and generates 
a call to memset (there is a bit too much extra code around the memset, 
but a couple match.pd transformations should fix that). That doesn't mean 
we can't save it the work. If you want to save the runtime check, there is 
always __builtin_constant_p...


The __fill_bvector part of the fill overload for vector<bool> could do 
with some improvements as well. Looping is unnecessary: one just needs to 
produce the right mask and AND or OR with it, which shouldn't take more 
than 4 instructions or so.
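
A minimal sketch of that mask idea, for a single word, might look like the following. It is not the libstdc++ implementation: fill_bit_range is a hypothetical helper, and it takes plain bit offsets rather than _Bit_iterator objects.

```cpp
#include <cassert>
#include <climits>

// Set (x == true) or clear (x == false) bits [lo, hi) of word without a loop.
// Precondition: lo < hi <= number of bits in unsigned long.
inline unsigned long fill_bit_range(unsigned long word, unsigned lo,
                                    unsigned hi, bool x)
{
  const unsigned bits = sizeof(unsigned long) * CHAR_BIT;
  assert(lo < hi && hi <= bits);

  // Ones from bit lo upwards, intersected with ones below bit hi.
  const unsigned long mask = (~0UL << lo) & (~0UL >> (bits - hi));

  return x ? (word | mask) : (word & ~mask);
}
```

For example, fill_bit_range(0UL, 2, 5, true) sets bits 2, 3 and 4. Both shift counts stay below the word width under the stated precondition, so neither shift is undefined.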


There was a time when I suggested overloading std::count and std::find in 
order to use __builtin_popcount, etc. But from what I've seen of committee 
discussions, I expect that there will be specialized algorithms (possibly 
member functions) eventually, making the overload less useful.
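
For illustration, counting set bits word by word with the builtin could be sketched as below. count_set_bits is a hypothetical free function operating on whole words, not an actual std::count overload, and __builtin_popcountl is a GCC/Clang builtin rather than standard C++.

```cpp
#include <climits>
#include <cstddef>

// Count the set bits in the words of [first, last).
inline std::size_t count_set_bits(const unsigned long* first,
                                  const unsigned long* last)
{
  std::size_t n = 0;
  for (; first != last; ++first)
    n += static_cast<std::size_t>(__builtin_popcountl(*first));
  return n;
}
```

A real overload for vector<bool> iterators would additionally have to mask off the partial words at both ends of the range.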


--
Marc Glisse


[PATCH, committed] Fix PR fortran/80752

2017-05-15 Thread Steve Kargl
I've committed the attached patch.

2017-05-15  Steven G. Kargl  

PR fortran/80752
* expr.c (gfc_generate_initializer):  If type conversion fails,
check for error and return NULL.

2017-05-15  Steven G. Kargl  

PR fortran/80752
* gfortran.dg/pr80752.f90: New test.

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow
Index: gcc/fortran/expr.c
===
--- gcc/fortran/expr.c	(revision 248066)
+++ gcc/fortran/expr.c	(working copy)
@@ -4395,7 +4395,12 @@ gfc_generate_initializer (gfc_typespec *
 	  if ((comp->ts.type != tmp->ts.type
 	   || comp->ts.kind != tmp->ts.kind)
 	  && !comp->attr.pointer && !comp->attr.proc_pointer)
-	gfc_convert_type_warn (ctor->expr, &comp->ts, 2, false);
+	{
+	  bool val;
+	  val = gfc_convert_type_warn (ctor->expr, &comp->ts, 1, false);
+	  if (val == false)
+		return NULL;
+	}
 	}
 
   if (comp->attr.allocatable
Index: gcc/testsuite/gfortran.dg/pr80752.f90
===
--- gcc/testsuite/gfortran.dg/pr80752.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr80752.f90	(working copy)
@@ -0,0 +1,20 @@
+! { dg-do compile }
+! PR fortran/80752
+module exchange_utils
+
+  implicit none
+
+  integer, parameter, public :: knd = 8
+
+  type, private :: a
+ logical :: add_vs98 = 0.0_knd ! { dg-error "Can't convert" }
+  end type a
+
+  type, private :: x_param_t
+ type(a) :: m05_m06
+  end type x_param_t
+
+  type(x_param_t), public, save :: x_param
+
+end module exchange_utils
+


[C++ PATCH] push_namespace cleanup

2017-05-15 Thread Nathan Sidwell
This cleanup patch from the modules branch fixes pr 79369, where we 
would accept inlining of an already existing namespace.


I changed push_namespace to return a count of the depth pushed, because 
I also have a fix for DR 2061, where we can end up pushing more than one 
namespace.  That'll be applied later.
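
For readers who have not looked at the PR, the distinction can be sketched as follows. The namespace and function names are made up for illustration, and the ill-formed redefinition is left commented out so that the sketch still compiles.

```cpp
namespace lib
{
  // 'inline' appears on the initial definition: fine.
  inline namespace v1
  {
    inline int answer() { return 1; }
  }

  // Reopening a namespace that was inline from the start: also fine.
  inline namespace v1
  {
    inline int other() { return 2; }
  }
}

namespace plain_ns { }

// Ill-formed: plain_ns was not inline when it was first defined.  As
// described above, the PR is about this kind of redefinition being accepted.
// inline namespace plain_ns { }
```

Because v1 is inline, its members are reachable both as lib::v1::other() and as lib::other().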


nathan
--
Nathan Sidwell
2017-05-15  Nathan Sidwell  

	gcc/cp/
	PR c++/79369
	* cp-tree.h (DECL_NAMESPACE_INLINE_P): New.
	* name-lookup.h (push_namespace): Return int, add make_inline arg.
	* name-lookup.c (push_namespace): Deal with inline directly.
	Return pushed count.
	* parser.c (cp_parser_namespace_definition): Adjust for
	push_namespace change.

	gcc/testsuite/
	* g++.dg/cpp0x/pr65558.C: Adjust diagnostic location.
	* g++.dg/cpp0x/pr79369.C: New.

Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 248066)
+++ cp/cp-tree.h	(working copy)
@@ -333,6 +333,7 @@ extern GTY(()) tree cp_global_trees[CPTI
   FOLD_EXPR_MODOP_P (*_FOLD_EXPR)
   IF_STMT_CONSTEXPR_P (IF_STMT)
   TEMPLATE_TYPE_PARM_FOR_CLASS (TEMPLATE_TYPE_PARM)
+  DECL_NAMESPACE_INLINE_P (in NAMESPACE_DECL)
1: IDENTIFIER_VIRTUAL_P (in IDENTIFIER_NODE)
   TI_PENDING_TEMPLATE_FLAG.
   TEMPLATE_PARMS_FOR_INLINE.
@@ -2916,6 +2917,10 @@ struct GTY(()) lang_decl {
 #define LOCAL_CLASS_P(NODE)\
   (decl_function_context (TYPE_MAIN_DECL (NODE)) != NULL_TREE)
 
+/* Whether the namespace is an inline namespace.  */
+#define DECL_NAMESPACE_INLINE_P(NODE) \
+  TREE_LANG_FLAG_0 (NAMESPACE_DECL_CHECK (NODE))
+
 /* For a NAMESPACE_DECL: the list of using namespace directives
The PURPOSE is the used namespace, the value is the namespace
that is the common ancestor.  */
Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 248066)
+++ cp/name-lookup.c	(working copy)
@@ -6441,107 +6441,112 @@ pop_from_top_level (void)
   timevar_cond_stop (TV_NAME_LOOKUP, subtime);
 }
 
-/* Push into the scope of the NAME namespace.  If NAME is NULL_TREE, then we
-   select a name that is unique to this compilation unit.  Returns FALSE if
-   pushdecl fails, TRUE otherwise.  */
+/* Push into the scope of the NAME namespace.  If NAME is NULL_TREE,
+   then we enter an anonymous namespace.  If MAKE_INLINE is true, then
+   we create an inline namespace (it is up to the caller to check upon
+   redefinition). Return the number of namespaces entered.  */
 
-bool
-push_namespace (tree name)
+int
+push_namespace (tree name, bool make_inline)
 {
-  tree d = NULL_TREE;
-  bool need_new = true;
-  bool implicit_use = false;
-  bool anon = !name;
-
   bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
+  int count = 0;
 
   /* We should not get here if the global_namespace is not yet constructed
  nor if NAME designates the global namespace:  The global scope is
  constructed elsewhere.  */
   gcc_assert (global_namespace != NULL && name != global_identifier);
 
-  if (anon)
-{
-  name = anon_identifier;
-  d = get_namespace_binding (current_namespace, name);
-  if (d)
-	/* Reopening anonymous namespace.  */
-	need_new = false;
-  implicit_use = true;
-}
-  else
+  if (!name)
+name = anon_identifier;
+
+  /* Check whether this is an extended namespace definition.  */
+  tree ns = get_namespace_binding (current_namespace, name);
+  if (ns && TREE_CODE (ns) == NAMESPACE_DECL)
 {
-  /* Check whether this is an extended namespace definition.  */
-  d = get_namespace_binding (current_namespace, name);
-  if (d != NULL_TREE && TREE_CODE (d) == NAMESPACE_DECL)
+  if (tree dna = DECL_NAMESPACE_ALIAS (ns))
 	{
-	  tree dna = DECL_NAMESPACE_ALIAS (d);
-	  if (dna)
- 	{
-	  /* We do some error recovery for, eg, the redeclaration
-		 of M here:
-
-		 namespace N {}
-		 namespace M = N;
-		 namespace M {}
-
-		 However, in nasty cases like:
-
-		 namespace N
-		 {
-		   namespace M = N;
-		   namespace M {}
-		 }
-
-		 we just error out below, in duplicate_decls.  */
-	  if (NAMESPACE_LEVEL (dna)->level_chain
-		  == current_binding_level)
-		{
-		  error ("namespace alias %qD not allowed here, "
-			 "assuming %qD", d, dna);
-		  d = dna;
-		  need_new = false;
-		}
+	  /* We do some error recovery for, eg, the redeclaration of M
+	 here:
+
+	 namespace N {}
+	 namespace M = N;
+	 namespace M {}
+
+	 However, in nasty cases like:
+
+	 namespace N
+	 {
+	   namespace M = N;
+	   namespace M {}
+	 }
+
+	 we just error out below, in duplicate_decls.  */
+	  if (NAMESPACE_LEVEL (dna)->level_chain == current_binding_level)
+	{
+	  error ("namespace alias %qD not allowed here, "
+		 "assuming %qD", ns, dna);
+	  ns = dna;
 	}
 	  else
-	need_new = false;
+	ns = NULL_TREE;
 	}
 }
+  else
+ns = NULL_TREE;
 
-  if (need_new)
+  bool new_ns = false;
+  if (!ns)
 {
-  /* Make a n

Re: [PATCH, rs6000] Fold vector logicals in GIMPLE

2017-05-15 Thread David Edelsohn
On Mon, May 15, 2017 at 12:24 PM, David Edelsohn  wrote:
> On Mon, May 15, 2017 at 9:09 AM, Will Schmidt  
> wrote:
>> On Sat, 2017-05-13 at 18:03 -0700, David Edelsohn wrote:
>>> On Thu, May 11, 2017 at 3:09 PM, Segher Boessenkool
>>>  wrote:
>>> > On Thu, May 11, 2017 at 02:36:26PM -0500, Will Schmidt wrote:
>>> >> On Thu, 2017-05-11 at 14:15 -0500, Segher Boessenkool wrote:
>>> >> > Hi!
>>> >> >
>>> >> > On Thu, May 11, 2017 at 10:53:33AM -0500, Will Schmidt wrote:
>>> >> > > Add handling for early expansion of vector logical operations in 
>>> >> > > gimple.
>>> >> > > Specifically: vec_and, vec_andc, vec_or, vec_xor, vec_orc, vec_nand.
>>> >> >
>>> >> > You also handle nor (except in the changelog).  But what about eqv?
>>> >>
>>> >> Right, in my excitement I lost my 'vec_nor', that one should be
>>> >> mentioned as well.
>>> >>
>>> >> vec_eqv() I have as a later patch in my series, it will be showing up
>>> >> once this first bunch are in.
>>> >
>>> > Ah cool -- fine with the changelog fix then.  Thanks!
>>>
>>> Will,
>>>
>>> All of the testcases are failing on AIX.  Most are direct fails, but
>>> some are complaining about implicit declaration of a function.
>>>
>>> I thought that we had determined the correct gcc testsuite target
>>> selectors.  Something is not correct with the new tests.
>>>
>>> The errors about undeclared function are:
>>>
>>> FAIL: gcc.target/powerpc/fold-vec-div-float.c (test for excess errors)
>>>
>>> Excess errors:
>>>
>>> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-float.c:
>>>
>>> 13:10: warning: implicit declaration of function 'vec_div'; did you
>>> mean 'vec_dss'? [-Wimplicit-function-declaration]
>>> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-float.c:
>>> 13:3: error: AltiVec argument passed to unprototyped function
>>>
>>> and
>>>
>>> FAIL: gcc.target/powerpc/fold-vec-div-floatdouble.c (test for excess errors)
>>>
>>> Excess errors:
>>>
>>> /nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-div-floatdouble.c:10:8:
>>> error: expected '=', ',', ';', 'asm' or '__attribute__' before
>>> 'double'
>>>
>>> Would you please look into this and fix it?
>>
>> Yes.
>>
>> I've started a checkout on gcc119.  Is that the right environment I
>> should poke around in, or is there a preferred or recommended
>> alternative?
>
> That is the right environment, but it will take a long time.  You
> should be able to reason about it directly by looking at the failing
> line.

fold-vec-add-1.c expands "vector signed char"

__attribute__((altivec(vector__))) unsigned char
test6 (__attribute__((altivec(vector__))) unsigned char x,
__attribute__((altivec(vector__))) unsigned char y)
{
  return
# 43 
"/nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c"
3 4
__builtin_vec_add
# 43 
"/nasfarm/edelsohn/src/src/gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c"
(x, y);
}

but

fold-vec-div-float.c  does not expand "vector double"

vector double
test2 (vector double x, vector double y)
{
  return vec_div (x, y);
}

The fold-vec-div-float.c testcase only requires altivec

/* { dg-require-effective-target powerpc_altivec_ok } */

but float div is not an AltiVec feature.  The required level of VSX is
not enabled on the version of AIX being tested.

The testcase selector needs to be updated for the correct feature.

Please re-check all of the recent testcases and use the correct
feature selector.
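
As a rough illustration of the AltiVec/VSX distinction described above, the sketch below reports which vector feature level a translation unit was compiled for.  It assumes GCC's predefined macros `__ALTIVEC__` and `__VSX__`; the function name `vector_feature_level` is made up for this example and is not part of the testsuite.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reports the PowerPC vector feature level this
// translation unit was compiled for, using GCC's predefined macros.
std::string vector_feature_level() {
#if defined(__VSX__)
  return "vsx";      // the level the failing vec_div tests depend on
#elif defined(__ALTIVEC__)
  return "altivec";  // AltiVec only: what powerpc_altivec_ok checks for
#else
  return "none";     // non-PowerPC target, or no vector support enabled
#endif
}
```

On a target where only AltiVec is enabled this returns "altivec", which is the situation in which a test guarded only by powerpc_altivec_ok would still be run and then fail.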

Thanks, David


[ping] PR78736: New C warning -Wenum-conversion

2017-05-15 Thread Prathamesh Kulkarni
Hi,
I would like to ping this patch for review:
https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00775.html

Thanks,
Prathamesh


Re: [PATCH] New C++ warning -Wcatch-value

2017-05-15 Thread Martin Sebor

So how about the following then? I stayed with the catch part and added
a parameter to the warning to let the user decide on the warnings she/he
wants to get: -Wcatch-value=n.
-Wcatch-value=1 only warns for polymorphic classes that are caught by
value (to avoid slicing), -Wcatch-value=2 warns for all classes that
are caught by value (to avoid copies). And finally -Wcatch-value=3
warns for everything not caught by reference to find typos (like pointer
instead of reference) and bad coding practices.
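
To make level 1 concrete, here is a minimal sketch of the slicing it is meant to catch.  The `Base`/`Derived` hierarchy and the function names are invented for illustration and do not come from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical exception hierarchy, used only for illustration.
struct Base {
  virtual ~Base() = default;
  virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
  std::string name() const override { return "Derived"; }
};

// Catching a polymorphic class by value copies only the Base subobject,
// so the dynamic type of the thrown object is lost (slicing).
std::string caught_by_value() {
  try {
    throw Derived{};
  } catch (Base b) {
    return b.name();
  }
}

// Catching by reference keeps the dynamic type of the thrown object.
std::string caught_by_reference() {
  try {
    throw Derived{};
  } catch (const Base& b) {
    return b.name();
  }
}
```

Here `caught_by_value()` yields "Base" while `caught_by_reference()` yields "Derived"; the first handler is the kind that -Wcatch-value=1 would be expected to flag.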


It seems reasonable to me.  I'm not too fond of multi-level
warnings since few users take advantage of anything but the
default, but this case is simple and innocuous enough that
I don't think it can do harm.



Bootstrapped and regtested on x86_64-pc-linux-gnu.
OK for trunk?

If so, would it make sense to add -Wcatch-value=1 to -Wextra or even -Wall?
I would do this in a separate patch, because I haven't checked what that
would mean for the testsuite.


I can't think of a use case for polymorphic slicing that's not
harmful so unless there is a common one that escapes me, I'd say
-Wall.

What are your thoughts on enhancing the warning to also handle
the rethrow case?
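
Assuming "the rethrow case" refers to `throw e;` inside a handler (as opposed to a bare `throw;`), the following sketch shows why it is worth a warning.  The type and function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical exception hierarchy, used only for illustration.
struct Base {
  virtual ~Base() = default;
  virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
  std::string name() const override { return "Derived"; }
};

// `throw e;` creates a new exception object of e's static type (Base),
// so the Derived part is sliced away even though e was caught by reference.
std::string rethrow_with_operand() {
  try {
    try {
      throw Derived{};
    } catch (const Base& e) {
      throw e;
    }
  } catch (const Base& e) {
    return e.name();
  }
}

// A bare `throw;` rethrows the original exception object unchanged.
std::string rethrow_bare() {
  try {
    try {
      throw Derived{};
    } catch (const Base&) {
      throw;
    }
  } catch (const Base& e) {
    return e.name();
  }
}
```

`rethrow_with_operand()` yields "Base" and `rethrow_bare()` yields "Derived", so the first form silently loses the dynamic type.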

Also, it seems that a similar warning would be useful even beyond
catch handlers, to help detect slicing when passing arguments to
functions by value.  Especially in code that mixes OOP with the
STL (or other template libraries).  Have you thought about tackling
that at some point as well?
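
A small sketch of the kind of slicing meant here, both for a by-value parameter and for a standard container of base objects.  All names are hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class hierarchy, used only for illustration.
struct Base {
  virtual ~Base() = default;
  virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
  std::string name() const override { return "Derived"; }
};

// The by-value parameter is a fresh Base copied from the argument.
std::string describe_by_value(Base b) { return b.name(); }

// The reference parameter refers to the caller's object as it is.
std::string describe_by_reference(const Base& b) { return b.name(); }

// A container of Base stores Base objects, so push_back slices as well.
std::string stored_in_vector() {
  std::vector<Base> v;
  v.push_back(Derived{});
  return v.front().name();
}
```

With a `Derived` argument, `describe_by_value` and `stored_in_vector` both report "Base", whereas `describe_by_reference` reports "Derived".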

Martin



Regards,
Volker


2017-05-13  Volker Reichelt  

* doc/invoke.texi (-Wcatch-value=): Document new warning option.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 248004)
+++ gcc/doc/invoke.texi (working copy)
@@ -265,7 +265,7 @@
 -Wno-builtin-declaration-mismatch @gol
 -Wno-builtin-macro-redefined  -Wc90-c99-compat  -Wc99-c11-compat @gol
 -Wc++-compat  -Wc++11-compat  -Wc++14-compat  -Wcast-align  -Wcast-qual  @gol
--Wchar-subscripts -Wchkp  -Wclobbered  -Wcomment  @gol
+-Wchar-subscripts  -Wchkp  -Wcatch-value=@var{n}  -Wclobbered  -Wcomment  @gol
 -Wconditionally-supported  @gol
 -Wconversion  -Wcoverage-mismatch  -Wno-cpp  -Wdangling-else  -Wdate-time @gol
 -Wdelete-incomplete @gol
@@ -5832,6 +5832,14 @@
 literals to @code{char *}.  This warning is enabled by default for C++
 programs.

+@item -Wcatch-value=@var{n} @r{(C++ and Objective-C++ only)}
+@opindex Wcatch-value
+Warn about catch handlers that do not catch via reference.
+With @option{-Wcatch-value=1} warn about polymorphic class types that
+are caught by value. With @option{-Wcatch-value=2} warn about all class
+types that are caught by value. With @option{-Wcatch-value=3} warn about
+all types that are not caught by reference.
+
 @item -Wclobbered
 @opindex Wclobbered
 @opindex Wno-clobbered
===

2017-05-13  Volker Reichelt  

* c.opt (Wcatch-value=): New C++ warning flag.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 248004)
+++ gcc/c-family/c.opt  (working copy)
@@ -388,6 +388,10 @@
 C ObjC C++ ObjC++ Var(warn_cast_qual) Warning
 Warn about casts which discard qualifiers.

+Wcatch-value=
+C++ ObjC++ Var(warn_catch_value) Warning Joined RejectNegative UInteger
+Warn about catch handlers of non-reference type.
+
 Wchar-subscripts
 C ObjC C++ ObjC++ Var(warn_char_subscripts) Warning LangEnabledBy(C ObjC C++ 
ObjC++,Wall)
 Warn about subscripts whose type is \"char\".
===

2017-05-13  Volker Reichelt  

* semantics.c (finish_handler_parms): Warn about non-reference type
catch handlers.

Index: gcc/cp/semantics.c
===
--- gcc/cp/semantics.c  (revision 248004)
+++ gcc/cp/semantics.c  (working copy)
@@ -1321,7 +1321,28 @@
}
 }
   else
-type = expand_start_catch_block (decl);
+{
+  type = expand_start_catch_block (decl);
+  if (warn_catch_value
+ && type != NULL_TREE
+ && type != error_mark_node
+ && TREE_CODE (TREE_TYPE (decl)) != REFERENCE_TYPE)
+   {
+ tree orig_type = TREE_TYPE (decl);
+ if (CLASS_TYPE_P (orig_type))
+   {
+ if (TYPE_POLYMORPHIC_P (orig_type))
+   warning (OPT_Wcatch_value_,
+"catching polymorphic type %q#T by value", orig_type);
+ else if (warn_catch_value > 1)
+   warning (OPT_Wcatch_value_,
+"catching type %q#T by value", orig_type);
+   }
+ else if (warn_catch_value > 2)
+   warning (OPT_Wcatch_value_,
+"catching non-reference type %q#T", orig_type);
+   }
+}
   HANDLER_TYPE (handler) = type;
 }

===


Re: [Patch, fortran] PR80554 [f08] variable redefinition in submodule

2017-05-15 Thread Steve Kargl
On Mon, May 15, 2017 at 12:10:42PM +0100, Paul Richard Thomas wrote:
> The attached bootstraps and regtests on FC23/x86_64 - OK for trunk and
> later for 7-branch?
> 
> The comment in the patch and the ChangeLog are sufficiently clear that
> no further explanation is needed here.
> 
> Cheers
> 
> Paul
> 
> 2017-05-15  Paul Thomas  
> 
> PR fortran/80554
> * decl.c (build_sym): In a submodule allow overriding of host
> associated symbols from the ancestor module with a new
> declaration.
> 
> 2017-05-15  Paul Thomas  
> 
> PR fortran/80554
> * gfortran.dg/submodule_29.f08: New test.

OK.

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


[PATCH, rs6000] gcc mainline, add builtin support for vec_bperm(), vec_mule() and vec_mulo and vec_sldw() builtins

2017-05-15 Thread Carl E. Love
GCC Maintainers:

This patch adds support for the various vec_bperm(), vec_mule(),
vec_mulo() and vec_sldw() builtins.

The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE)
with no regressions.

Is the patch OK for gcc mainline?

  Carl Love
--


gcc/ChangeLog:

2017-05-15  Carl Love  

   * config/rs6000/rs6000-c: Add support for built-in functions
   vector unsigned long long vec_bperm (vector unsigned long long,
vector unsigned char)
   vector signed long long vec_mule (vector signed int,
 vector signed int)
   vector unsigned long long vec_mule (vector unsigned int,
   vector unsigned int)
   vector signed long long vec_mulo (vector signed int,
 vector signed int)
   vector unsigned long long vec_mulo (vector unsigned int,
   vector unsigned int)
   vector signed char vec_sldw (vector signed char,
vector signed char,
const int)
   vector unsigned char vec_sldw (vector unsigned char,
  vector unsigned char,
  const int)
   vector signed short vec_sldw (vector signed short,
 vector signed short,
 const int)
   vector unsigned short vec_sldw (vector unsigned short,
   vector unsigned short,
   const int)
   vector signed int vec_sldw (vector signed int,
   vector signed int,
   const int)
   vector unsigned int vec_sldw (vector unsigned int,
 vector unsigned int,
 const int)
   vector signed long long vec_sldw (vector signed long long,
 vector signed long long,
 const int)
   vector unsigned long long vec_sldw (vector unsigned long long,
   vector unsigned long long,
   const int)
   * config/rs6000/rs6000-c: Add support for built-in functions
   * config/rs6000/rs6000-builtin.def: Add definition for SLDW.
   * config/rs6000/altivec.h: Add definition for vec_sldw.
   * doc/extend.texi: Update the built-in documentation for the
 new built-in functions.

gcc/testsuite/ChangeLog:

2017-05-15  Carl Love  

   * gcc.target/powerpc/builtins-3.c: New vec_mule, vec_mulo test cases.
   * gcc.target/powerpc/builtins-3-p8.c: Add tests for the new Power 8
 built-ins to the test suite file.  Note, support for mradds exists
 but no test case exists.
   * gcc.target/powerpc/builtins-3-p9.c: Add tests for the new Power  9
 built-ins to the test suite file.
---
 gcc/config/rs6000/altivec.h  |  1 +
 gcc/config/rs6000/rs6000-builtin.def |  1 +
 gcc/config/rs6000/rs6000-c.c | 37 ++
 gcc/doc/extend.texi  | 32 
 gcc/testsuite/gcc.target/powerpc/builtins-3-p8.c | 37 +++---
 gcc/testsuite/gcc.target/powerpc/builtins-3-p9.c | 31 +++-
 gcc/testsuite/gcc.target/powerpc/builtins-3.c| 93 +++-
 7 files changed, 204 insertions(+), 28 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index c334d9f..c92bcce 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -247,6 +247,7 @@
 #define vec_sel __builtin_vec_sel
 #define vec_sl __builtin_vec_sl
 #define vec_sld __builtin_vec_sld
+#define vec_sldw __builtin_vsx_xxsldwi
 #define vec_sll __builtin_vec_sll
 #define vec_slo __builtin_vec_slo
 #define vec_splat __builtin_vec_splat
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 41186b1..ebe005a 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1502,6 +1502,7 @@ BU_ALTIVEC_OVERLOAD_X (LVSR, "lvsr")
 BU_ALTIVEC_OVERLOAD_X (MUL,   "mul")
 BU_ALTIVEC_OVERLOAD_X (PROMOTE,   "promote")
 BU_ALTIVEC_OVERLOAD_X (SLD,   "sld")
+BU_ALTIVEC_OVERLOAD_X (SLDW,  "sldw")
 BU_ALTIVEC_OVERLOAD_X (SPLAT, "splat")
 BU_ALTIVEC_OVERLOAD_X (SPLATS,"splats")
 BU_ALTIVEC_OVERLOAD_X (ST,"st")
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index a0536d6..8039814 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -2182,6 +2182,11 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, 0 },
   { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESH,
 RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 

[patch, libgfortran] [7/8 Regression] Crash of runtime gfortran library during integer transformation

2017-05-15 Thread Jerry DeLisle

Hi all,

Crash is a misnomer on this PR [aside: People see the backtrace and assume]

This patch fixes the problem by correctly detecting the EOR condition for 
internal units. The previous check in read_sf_internal was wrong, relying 
probably on uninitialized memory as can be seen by the still open PR78881. 
Removing the bad hunk fixes the regression here and the new code lets 
dtio_26.f90 pass as expected.
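
The decision that the new read_block_form code makes can be modelled as a small pure function.  This is only a sketch of the branch visible in the hunk below, not libgfortran code: `UnitState`, `ReadStatus` and `classify_short_read` are invented names, and the surrounding conditions in transfer.c are left out.

```cpp
#include <cassert>

// Hypothetical model of the outcome of a short read; not libgfortran types.
enum class ReadStatus { Continue, EndOfRecord, EndOfFile };

// Hypothetical snapshot of the unit fields that the check looks at.
struct UnitState {
  bool internal;    // stands in for is_internal_unit (dtp)
  long bytes_left;  // stands in for current_unit->bytes_left
  bool advance_no;  // stands in for advance_status == ADVANCE_NO
};

// Mirrors the patched logic: an internal unit with nothing left raises
// EOR only for a non-advancing read that actually wants bytes; any other
// unit with nothing left hits EOF.
ReadStatus classify_short_read(const UnitState& u, int nbytes) {
  if (u.internal) {
    if (nbytes > 0 && u.bytes_left == 0 && u.advance_no)
      return ReadStatus::EndOfRecord;
    return ReadStatus::Continue;
  }
  if (u.bytes_left == 0)
    return ReadStatus::EndOfFile;
  return ReadStatus::Continue;
}
```

For example, an exhausted internal unit read with ADVANCE='NO' maps to EndOfRecord, while the same unit with advancing I/O just continues.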


Regression tested on x86_64. New test case will be added.

OK for trunk? Will back port in a few days to 7.

I will also have Rainer verify it fixes the problem on sparc (78881)

Regards,

Jerry

2017-05-15  Jerry DeLisle  

PR libgfortran/80727
* transfer.c (read_sf_internal): Remove bogus code to detect EOR.
(read_block_form): For internal units, generate EOR if no more
bytes left in unit and we are trying to read with ADVANCE='NO'.






diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index f16d8c55..928a448f 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -272,12 +272,6 @@ read_sf_internal (st_parameter_dt *dtp, int *length)
   return NULL;
 }
 
-  if (base && *base == 0)
-{
-  generate_error (&dtp->common, LIBERROR_EOR, NULL);
-  return NULL;
-}
-
   dtp->u.p.current_unit->bytes_left -= *length;
 
   if (((dtp->common.flags & IOPARM_DT_HAS_SIZE) != 0) ||
@@ -470,11 +464,24 @@ read_block_form (st_parameter_dt *dtp, int *nbytes)
 		}
 	}
 
-	  if (unlikely (dtp->u.p.current_unit->bytes_left == 0
-	  && !is_internal_unit(dtp)))
+	  if (is_internal_unit(dtp))
 	{
-	  hit_eof (dtp);
-	  return NULL;
+	  if (*nbytes > 0 && dtp->u.p.current_unit->bytes_left == 0)
+	{
+		  if (dtp->u.p.advance_status == ADVANCE_NO)
+		{
+		  generate_error (&dtp->common, LIBERROR_EOR, NULL);
+		  return NULL;
+		}
+		}
+	}
+	  else
+	{
+	  if (unlikely (dtp->u.p.current_unit->bytes_left == 0))
+		{
+		  hit_eof (dtp);
+		  return NULL;
+		}
 	}
 
 	  *nbytes = dtp->u.p.current_unit->bytes_left;
! { dg-do run }
! PR80727 Crash of runtime gfortran library during integer transformation
! Note: before the patch this was giving an incorrect EOR error on READ.
program  gfortran_710_io_bug
  character  str*4
  integer*4  i4
  str =''
  i = 256
  write(str,fmt='(a)') i
  i = 0
  read ( unit=str(1:4), fmt='(a)' ) i4
  if (i4.ne.256) call abort
end  program  gfortran_710_io_bug 


Re: [C++ PATCH] push_namespace cleanup

2017-05-15 Thread Nathan Sidwell

On 05/15/2017 03:38 PM, Nathan Sidwell wrote:
This cleanup patch from the modules branch fixes pr 79369, where we 
would accept inlining of an already existing namespace.


missed this testcase tweak.  now committed.

--
Nathan Sidwell
2017-05-15  Nathan Sidwell  

	PR c++/79369
	* g++.dg/cpp1z/nested-namespace-def1.C: Adjust.

Index: testsuite/g++.dg/cpp1z/nested-namespace-def1.C
===
--- testsuite/g++.dg/cpp1z/nested-namespace-def1.C	(revision 248066)
+++ testsuite/g++.dg/cpp1z/nested-namespace-def1.C	(working copy)
@@ -11,7 +11,7 @@ A::B::C::T::U::V::Y y;
 
 inline namespace D::E {} // { dg-error "cannot be inline" }
 
-namespace F::G:: {} // { dg-error "nested identifier required" }
+namespace F::G:: {} // { dg-error "namespace name required" }
 
 namespace G __attribute ((visibility ("default"))) ::H {} // { dg-error "cannot have attributes" }
 


Re: [Patch, fortran] PR80554 [f08] variable redefinition in submodule

2017-05-15 Thread Jerry DeLisle

On 05/15/2017 04:10 AM, Paul Richard Thomas wrote:

The attached bootstraps and regtests on FC23/x86_64 - OK for trunk and
later for 7-branch?

The comment in the patch and the ChangeLog are sufficiently clear that
no further explanation is needed here.



Looks OK Paul, thanks,

Jerry


Re: [PATCH,AIX] Enable Stack Unwinding on AIX

2017-05-15 Thread David Edelsohn
Please do not email my IBM Notes address with patches.  Please copy
this Gmail address for patch submissions.

>   * libgcc/config/rs6000/aix-unwind.h : Implements stack unwinding on AIX.

This ChangeLog entry clearly is wrong because aix-unwind.h already
implements ppc_aix_fallback_frame_state.  The ChangeLog entry should
reference the exact function being modified and a useful comment about
how it is modified, e.g.,

* config/rs6000/aix-unwind.h (ppc_aix_fallback_frame_state): Add 64
bit support.  Add 32 bit support for AIX 6.1 and 7.2.

The ChangeLog file is in libgcc, so the file reference is wrong
because it should not use libgcc in the path.

How was this tested?

Thanks, David


Re: [PATCH][X86] Add missing xgetbv xsetbv intrinsics

2017-05-15 Thread Andi Kleen
"Koval, Julia"  writes:

> Hi,
>
> This patch adds these missing intrinsics:
> _xsetbv
> _xgetbv

-march=native driver support for the CPUID bit seems to be missing.

-Andi


Re: [patch, libgfortran] [7/8 Regression] Crash of runtime gfortran library during integer transformation

2017-05-15 Thread Steve Kargl
On Mon, May 15, 2017 at 01:10:43PM -0700, Jerry DeLisle wrote:
> Hi all,
> 
> Crash is a misnomer on this PR [aside: People see the backtrace and assume]
> 
> This patch fixes the problem by correctly detecting the EOR condition for 
> internal units. The previous check in read_sf_internal was wrong, relying 
> probably on uninitialized memory as can be seen by the still open PR78881. 
> Removing the bad hunk fixes the regression here and the new code lets 
> dtio_26.f90 pass as expected.
> 
> Regression tested on x86_64. New test case will be added.
> 
> OK for trunk? Will back port in a few days to 7.
> 

No.  There are a number of other failures with your patch applied.

Running /home/sgk/gcc/gccx/gcc/testsuite/gfortran.dg/dg.exp ...
FAIL: gfortran.dg/read_3.f90   -O0  (test for excess errors)
FAIL: gfortran.dg/read_3.f90   -O1  (test for excess errors)
FAIL: gfortran.dg/read_3.f90   -O2  (test for excess errors)
FAIL: gfortran.dg/read_3.f90   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gfortran.dg/read_3.f90   -O3 -g  (test for excess errors)
FAIL: gfortran.dg/read_3.f90   -Os  (test for excess errors)

=== gfortran Summary ===

# of expected passes            6
# of unexpected failures        6
/mnt/sgk/objx/gcc/gfortran  version 8.0.0 20170515 (experimental) (GCC) 

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [PATCH,AIX] Enable FFI Go Closure on AIX

2017-05-15 Thread David Edelsohn
This patch needs to be submitted to the libffi project.

Also, the ChangeLog needs to specify exactly what is being changed, not
just "Implement Go Closures".  The patch clearly touches existing parts of
the files that affect more than simply Go closures.

How was this tested?  libffi is used in many more places than Go, so
any changes need to be tested very carefully and thoroughly.  What are
the results for the libffi testsuite?  Have you tried building Python
with a version of libffi built with this patch?

Thanks, David


Re: [PATCH] [i386] Recompute the frame layout less often

2017-05-15 Thread Bernd Edlinger
On 05/15/17 03:39, Daniel Santos wrote:
> On 05/14/2017 11:31 AM, Bernd Edlinger wrote:
>> Hi Daniel,
>>
>> there is one thing I don't understand in your patch:
>> That is, it introduces a static value:
>>
>> /* Registers who's save & restore will be managed by stubs called from
>>  pro/epilogue.  */
>> static HARD_REG_SET GTY(()) stub_managed_regs;
>>
>> This seems to be set as a side effect of ix86_compute_frame_layout,
>> and depends on the register usage of the current function.
>> But values that depend on the current function need usually be
>> attached to cfun->machine, because the passes can run in parallel
>> unless I am completely mistaken, and the stub_managed_regs may
>> therefore be computed from a different function.
>>
>>
>> Bernd.
>
> I should add that if you want to run faster tests just on the ms to sysv
> abi code, you can use make RUNTESTFLAGS="ms-sysv.exp" check and then if
> that succeeds run the full testsuite.
>
> Daniel

Unfortunately I encounter a serious problem when my patch is used
on top of your patch.  Yes, the test suite ran without error, but then
I tried to trigger the warning and that tripped an ICE.
The reason is that cfun->machine->call_ms2sysv can be set to true
*after* reload_completed, which can be seen using the following
patch:

Index: i386.c
===
--- i386.c  (revision 248031)
+++ i386.c  (working copy)
@@ -29320,7 +29320,10 @@

/* Set here, but it may get cleared later.  */
if (TARGET_CALL_MS2SYSV_XLOGUES)
+  {
+   gcc_assert(!reload_completed);
cfun->machine->call_ms2sysv = true;
+  }
  }

if (vec_len > 1)


That assertion is triggered in this test case:

cat test.c
int test()
{
   __builtin_printf("test\n");
   return 0;
}

gcc -mabi=ms -mcall-ms2sysv-xlogues -fsplit-stack -c test.c
test.c: In function 'test':
test.c:5:1: internal compiler error: in ix86_expand_call, at 
config/i386/i386.c:29324
  }
  ^
0x13390a4 ix86_expand_call(rtx_def*, rtx_def*, rtx_def*, rtx_def*, 
rtx_def*, bool)
../../gcc-trunk/gcc/config/i386/i386.c:29324
0x1317494 ix86_expand_split_stack_prologue()
../../gcc-trunk/gcc/config/i386/i386.c:15920
0x162ba21 gen_split_stack_prologue()
../../gcc-trunk/gcc/config/i386/i386.md:12556
0x12f3f30 target_gen_split_stack_prologue
../../gcc-trunk/gcc/config/i386/i386.md:12325
0xb237b3 make_split_prologue_seq
../../gcc-trunk/gcc/function.c:5822
0xb23a08 thread_prologue_and_epilogue_insns()
../../gcc-trunk/gcc/function.c:5958
0xb24840 rest_of_handle_thread_prologue_and_epilogue
../../gcc-trunk/gcc/function.c:6428
0xb248c0 execute
../../gcc-trunk/gcc/function.c:6470
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.


so, in ix86_expand_split_stack_prologue
we first call:
   ix86_finalize_stack_realign_flags ();
   ix86_compute_frame_layout (&frame);

and later:
   call_insn = ix86_expand_call (NULL_RTX, gen_rtx_MEM (QImode, fn),
 GEN_INT (UNITS_PER_WORD), constm1_rtx,
 pop, false);

which changes a flag with a huge impact on the frame layout, but there
is absolutely no way the frame layout can change once it is
finalized.
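
The ordering problem can be shown with a toy model in which a layout-affecting flag refuses to change after the layout has been finalized, much like the gcc_assert in the debugging patch above.  `FrameState` and its members are hypothetical stand-ins, not GCC's data structures.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for per-function machine state; not GCC's types.
class FrameState {
 public:
  // Changing a layout-affecting flag is only legal before finalization.
  void set_call_ms2sysv(bool value) {
    if (finalized_)
      throw std::logic_error("frame layout already finalized");
    call_ms2sysv_ = value;
  }

  // Plays the role of computing the final frame layout.
  void finalize_layout() { finalized_ = true; }

  bool call_ms2sysv() const { return call_ms2sysv_; }

 private:
  bool finalized_ = false;
  bool call_ms2sysv_ = false;
};

// Reproduces the problematic order: finalize first, then try to set the flag.
bool late_change_is_rejected() {
  FrameState fs;
  fs.finalize_layout();
  try {
    fs.set_call_ms2sysv(true);
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

In this model `late_change_is_rejected()` returns true, which corresponds to the assertion firing when ix86_expand_call runs after the layout was computed.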


Any Thoughts?


Bernd.


Re: dejagnu version update?

2017-05-15 Thread Andreas Schwab
On Mai 15 2017, Mike Stump  wrote:

> That said, a little surprising that SLE is lagging everyone else so
> hard.

DejaGnu doesn't exactly have frequent releases.  Missing just one
release can easily put you more than 5 years behind.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[PATCH 0/3] Split off powerpcspe from rs6000 port

2017-05-15 Thread Segher Boessenkool
Hi!

As discussed before, here is a series to split powerpcspe from the
rs6000 port.  This series does not yet make any real changes to either
port: it is a copy of rs6000/ to powerpcspe/, with some renames and
some necessary changes to the port file, and slightly more involved
changes to config.gcc .

This was tested on powerpc64-linux {-m32,-m64}, and it was build-tested
on powerpc-linux-gnuspe (and the resulting compiler was tested to be
functional: it can build various Linux defconfigs for SPE systems).

I have tried to see how much the powerpcspe port can be simplified
after this, and found it can lose 80% of the code without big problems.
You may however not want all that, for example, I removed all 64-bit
support in that test.  Getting rid of all VMX/VSX support is a big part
of it already, as is removing "classic" floating point (and paired
single, and xilinx fpu, and all newer ISA features, etc.)

For the rs6000 port the low-hanging fruits are much more modest, only
5% or a bit more; but in pretty gnarly code.  For example, some current
pain points are the SPE ABI (for separate shrink-wrapping), and how
isel is handled.

This won't be the final series...  I have a few questions:

-- This uses powerpc-*-rtems*spe*; do we want powerpcspe-*-rtems*
   instead?  Or both?
-- This uses powerpc-wrs-vxworksspe; do we want powerpcspe-wrs-vxworks
   instead?  Both?  What about the ae and mils variants?
-- Does powerpc*-*-freebsd*spe* exist?
-- Does powerpc-*-netbsd*spe* exist?
-- Does powerpc-*-eabisim*spe* exist?
-- Does powerpcle-*-*spe* exist?

Also, testing is needed :-)  You can get better testing by removing
the rs6000/ directories completely, btw.; otherwise files from rs6000/
can accidentally be picked up instead of the corresponding file from
powerpcspe/, which will currently work because there are no big
differences yet, but things will diverge later (and then break).


Segher


[PATCH 1/3] Copy rs6000/ to powerpcspe/

2017-05-15 Thread Segher Boessenkool
This is not a real patch, it is the result of this sh script:

  cp -R gcc/common/config/{rs6000,powerpcspe}
  cp -R gcc/config/{rs6000,powerpcspe}


Segher


[PATCH 2/3] Rename various powerpcspe port files

2017-05-15 Thread Segher Boessenkool
This is also not a real patch, it is the result of this sh script:

  mv gcc/common/config/powerpcspe/{rs6000,powerpcspe}-common.c
  mv gcc/config/powerpcspe/driver-{rs6000,powerpcspe}.c
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-builtin.def
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-c.c
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-cpus.def
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-linux.c
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-modes.def
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-opts.h
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-passes.def
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-protos.h
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}-tables.opt
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}.c
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}.h
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}.md
  mv gcc/config/powerpcspe/{rs6000,powerpcspe}.opt
  mv gcc/config/powerpcspe/t-{rs6000,powerpcspe}
  mv gcc/config/powerpcspe/x-{rs6000,powerpcspe}


Segher


[PATCH 3/3] Actual adjustments to make powerpcspe work

2017-05-15 Thread Segher Boessenkool
This is tested with powerpc-linux-gnuspe.  The other subtargets are
meant to work as well, but no doubt something is still broken.  I also
do not know if the port _actually_ works.

The main changes are to config.gcc; the rest is mostly filename
renames.  See the cover letter for what testing is still needed.
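
Since config.gcc selects cpu_type with a shell `case`, the first matching pattern wins, which is why the new `powerpc*-*-*spe*` entry in the diff below sits ahead of `powerpc*-*-*`.  The sketch here imitates that ordering with POSIX `fnmatch`; the function `cpu_type_for` and the two-entry table are illustrative assumptions, not a copy of config.gcc.

```cpp
#include <cassert>
#include <fnmatch.h>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-entry model of the cpu_type selection: the patterns are
// tried in order and the first match wins, as in a shell `case` statement.
std::string cpu_type_for(const std::string& target) {
  static const std::vector<std::pair<const char*, const char*>> table = {
    {"powerpc*-*-*spe*", "powerpcspe"},  // must come before the generic entry
    {"powerpc*-*-*", "rs6000"},
  };
  for (const auto& entry : table) {
    // fnmatch returns 0 when the target matches the glob pattern.
    if (fnmatch(entry.first, target.c_str(), 0) == 0)
      return entry.second;
  }
  return "unknown";
}
```

With this order, "powerpc-unknown-linux-gnuspe" maps to powerpcspe and "powerpc64-unknown-linux-gnu" to rs6000; swapping the two entries would send both to rs6000.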

Cheers,


Segher

---
 gcc/config.gcc  | 49 ++---
 gcc/config.host |  4 ++
 gcc/config/powerpcspe/genopt.sh |  6 +--
 gcc/config/powerpcspe/powerpcspe-tables.opt |  2 +-
 gcc/config/powerpcspe/powerpcspe.c  | 36 ++---
 gcc/config/powerpcspe/powerpcspe.h  | 12 ++---
 gcc/config/powerpcspe/powerpcspe.opt|  2 +-
 gcc/config/powerpcspe/t-linux   |  2 +-
 gcc/config/powerpcspe/t-linux64 |  2 +-
 gcc/config/powerpcspe/t-powerpcspe  | 82 ++---
 gcc/config/powerpcspe/x-darwin  |  2 +-
 gcc/config/powerpcspe/x-darwin64|  2 +-
 gcc/config/powerpcspe/x-powerpcspe  |  2 +-
 13 files changed, 120 insertions(+), 83 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index e8aaf2d..95d01d1 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -442,6 +442,16 @@ nios2-*-*)
 nvptx-*-*)
cpu_type=nvptx
;;
+powerpc*-*-*spe*)
+   cpu_type=powerpcspe
+   extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h 
spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h"
+   case x$with_cpu in
+   
xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500)
+   cpu_is_64bit=yes
+   ;;
+   esac
+   extra_options="${extra_options} g.opt fused-madd.opt 
powerpcspe/powerpcspe-tables.opt"
+   ;;
 powerpc*-*-*)
cpu_type=rs6000
extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h 
spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h"
@@ -2369,9 +2379,9 @@ powerpc-*-netbsd*)
extra_options="${extra_options} rs6000/sysv4.opt"
;;
 powerpc-*-eabispe*)
-   tm_file="${tm_file} dbxelf.h elfos.h freebsd-spec.h newlib-stdint.h 
rs6000/sysv4.h rs6000/eabi.h rs6000/e500.h rs6000/eabispe.h"
-   extra_options="${extra_options} rs6000/sysv4.opt"
-   tmake_file="rs6000/t-spe rs6000/t-ppccomm"
+   tm_file="${tm_file} dbxelf.h elfos.h freebsd-spec.h newlib-stdint.h 
${cpu_type}/sysv4.h ${cpu_type}/eabi.h ${cpu_type}/e500.h ${cpu_type}/eabispe.h"
+   extra_options="${extra_options} ${cpu_type}/sysv4.opt"
+   tmake_file="${cpu_type}/t-spe ${cpu_type}/t-ppccomm"
use_gcc_stdint=wrap
;;
 powerpc-*-eabisimaltivec*)
@@ -2409,11 +2419,27 @@ powerpc-*-eabi*)
tmake_file="rs6000/t-fprules rs6000/t-ppcgas rs6000/t-ppccomm"
use_gcc_stdint=wrap
;;
+powerpc-*-rtems*spe*)
+   tm_file="${tm_file} dbxelf.h elfos.h freebsd-spec.h newlib-stdint.h 
powerpcspe/sysv4.h powerpcspe/eabi.h powerpcspe/e500.h powerpcspe/rtems.h 
rtems.h"
+   extra_options="${extra_options} powerpcspe/sysv4.opt"
+   tmake_file="${tmake_file} powerpcspe/t-fprules powerpcspe/t-rtems 
powerpcspe/t-ppccomm"
+   ;;
 powerpc-*-rtems*)
tm_file="${tm_file} dbxelf.h elfos.h freebsd-spec.h newlib-stdint.h 
rs6000/sysv4.h rs6000/eabi.h rs6000/e500.h rs6000/rtems.h rtems.h"
extra_options="${extra_options} rs6000/sysv4.opt"
tmake_file="${tmake_file} rs6000/t-fprules rs6000/t-rtems 
rs6000/t-ppccomm"
;;
+powerpc*-*-linux*spe*)
+   tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h freebsd-spec.h 
powerpcspe/sysv4.h"
+   extra_options="${extra_options} powerpcspe/sysv4.opt"
+   tmake_file="${tmake_file} powerpcspe/t-fprules powerpcspe/t-ppccomm"
+   extra_objs="$extra_objs powerpcspe-linux.o"
+   maybe_biarch=
+   tm_file="${tm_file} powerpcspe/linux.h glibc-stdint.h"
+   tmake_file="${tmake_file} powerpcspe/t-ppcos powerpcspe/t-linux"
+   tm_file="${tm_file} powerpcspe/linuxspe.h powerpcspe/e500.h"
+   default_gnu_indirect_function=yes
+   ;;
 powerpc*-*-linux*)
tm_file="${tm_file} dbxelf.h elfos.h gnu-user.h freebsd-spec.h 
rs6000/sysv4.h"
extra_options="${extra_options} rs6000/sysv4.opt"
@@ -2501,6 +2527,13 @@ powerpc*-*-linux*)
;;
esac
;;
+powerpc-wrs-vxworksspe)
+   tm_file="${tm_file} elfos.h freebsd-spec.h powerpcspe/sysv4.h"
+   tmake_file="${tmake_file} powerpcspe/t-fprules powerpcspe/t-ppccomm 
powerpcspe/t-vxworks"
+   extra_options="${extra_options} powerpcspe/sysv4.opt"
+   extra_headers=ppc-asm.h
+   tm_file="${tm_file} vx-common.h vxworks.h powerpcspe/vxworks.h 
powerpcspe/e500.h"
+   ;;
 powerpc-wrs-vxworks|powerpc-wrs-vxworksae|powerpc-wrs-vxworksmils)
tm_file="${tm_file} elfos.h freebsd-spec.h rs6000/sysv4.h"
tmake_file="${tmake_file} rs6000/t-fprules rs6000/t-ppccomm 
rs6000/t-vx

Re: [patch, libgfortran] [7/8 Regression] Crash of runtime gfortran library during integer transformation

2017-05-15 Thread Steve Kargl
On Mon, May 15, 2017 at 01:33:55PM -0700, Steve Kargl wrote:
> On Mon, May 15, 2017 at 01:10:43PM -0700, Jerry DeLisle wrote:
> > Hi all,
> > 
> > Crash is a misnomer on this PR [aside: People see the backtrace and assume]
> > 
> > This patch fixes the problem by correctly detecting the EOR condition for 
> > internal units. The previous check in read_sf_internal was wrong, relying 
> > probably on uninitialized memory as can be seen by the still open PR78881. 
> > Removing the bad hunk fixes the regression here and the new code lets 
> > dtio_26.f90 pass as expected.
> > 
> > Regression tested on x86_64. New test case will be added.
> > 
> > OK for trunk? Will back port in a few days to 7.
> > 
> 
> No.  There are a number of other failures with your patch applied.
> 
> Running /home/sgk/gcc/gccx/gcc/testsuite/gfortran.dg/dg.exp ...
> FAIL: gfortran.dg/read_3.f90   -O0  (test for excess errors)
> FAIL: gfortran.dg/read_3.f90   -O1  (test for excess errors)
> FAIL: gfortran.dg/read_3.f90   -O2  (test for excess errors)
> FAIL: gfortran.dg/read_3.f90   -O3 -fomit-frame-pointer -funroll-loops 
> -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> FAIL: gfortran.dg/read_3.f90   -O3 -g  (test for excess errors)
> FAIL: gfortran.dg/read_3.f90   -Os  (test for excess errors)
> 

The failures are from an unexpected warning.

Executing on host: /mnt/sgk/objx/gcc/testsuite/gfortran2/../../gfortran 
-B/mnt/sgk/objx/gcc/testsuite/gfortran2/../../ 
-B/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/ 
/home/sgk/gcc/gccx/gcc/testsuite/gfortran.dg/read_3.f90
-fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -pedantic-errors 
 -B/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libatomic/.libs 
-B/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libquadmath/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libquadmath/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libquadmath/.libs  -lm  -o 
./read_3.exe    (timeout = 300)
spawn -ignore SIGHUP /mnt/sgk/objx/gcc/testsuite/gfortran2/../../gfortran 
-B/mnt/sgk/objx/gcc/testsuite/gfortran2/../../ 
-B/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/ 
/home/sgk/gcc/gccx/gcc/testsuite/gfortran.dg/read_3.f90 
-fno-diagnostics-show-caret -fdiagnostics-color=never -O0 -pedantic-errors 
-B/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libgfortran/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libatomic/.libs 
-B/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libquadmath/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libquadmath/.libs 
-L/mnt/sgk/objx/x86_64-unknown-freebsd12.0/./libquadmath/.libs -lm -o 
./read_3.exe

/home/sgk/gcc/gccx/gcc/testsuite/gfortran.dg/read_3.f90:6:11: Warning: GNU 
Extension: Nonstandard type declaration INTEGER*4 at (1)

FAIL: gfortran.dg/read_3.f90   -O0  (test for excess errors)
Excess errors:
/home/sgk/gcc/gccx/gcc/testsuite/gfortran.dg/read_3.f90:6:11: Warning: GNU 
Extension: Nonstandard type declaration INTEGER*4 at (1)

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [PATCH], PR target/80510, Optimize offsettable memory references on power7/power8

2017-05-15 Thread Segher Boessenkool
Hi,

On Fri, May 12, 2017 at 05:33:50PM -0400, Michael Meissner wrote:
> The problem is if the DImode, DFmode, and SFmode are allowed in Altivec
> registers before ISA 3.0, and the compiler wants to do an offsettable store.
> The compiler generates a move from an Altivec register to a traditional
> floating point register, and then the compiler generates the STFD or STFS
> instruction.
> 
> This code adds peephole2's that notice when there is a move from an Altivec
> register to an FPR followed by a store, and change this to load the offset
> into a GPR and do the indexed store from the Altivec register.  I also added
> code to do the reverse (notice if there is a load to an FPR and a copy to an
> Altivec register) and use an indexed load.

Ok.

> I ran the Spec 2006 floating point suite with this patch, and the LBM
> benchmark shows a nearly 3% gain with this patch, and there were no
> significant regressions.

Nice :-)

> Note, using peepholes is a quick way to fix the particular problem.  However,
> it would be nice long term to arrange things so the back end can tell the
> register allocator to load up the offset into a register, instead of doing the
> move/store.  I tried various modifications to secondary reload, but I wasn't
> able to get it to change behavior.

These peepholes are simple and look perfectly safe.  It would of course
be great if we wouldn't need them, if LRA was a bit smarter.

> +;; Optimize cases where we want to do a D-form load (register+offset) on
> +;; ISA 2.06/2.07 to an Altivec register, and the register allocator
> +;; has generated:
> +;;   load fpr
> +;;   move fpr->altivec

Maybe show here the actual machine instructions before and after the
peephole has been applied?

> +/* { dg-final { scan-assembler {\xsadddp\M} } } */
> +/* { dg-final { scan-assembler {\stxsdx\M}  } } */
> +/* { dg-final { scan-assembler-not {\mmfvsrd\M} } } */

You forgot the "m" (in "\m") in the first two of these.  (I wonder
how this worked, esp. the "\s" one?)
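
For reference, in Tcl's advanced regular expressions "\m" matches only at
the beginning of a word and "\M" only at the end of one, whereas "\s"
matches a whitespace character.  Assuming the intent is the same as in the
scan-assembler-not line, the first two directives with the missing "m"
restored would presumably read:

```c
/* { dg-final { scan-assembler {\mxsadddp\M} } } */
/* { dg-final { scan-assembler {\mstxsdx\M}  } } */
```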

> +/* { dg-final { scan-assembler {\xsaddsp\M}  } } */
> +/* { dg-final { scan-assembler {\stxsspx\M}  } } */
> +/* { dg-final { scan-assembler-not {\mmfvsrd\M}  } } */
> +/* { dg-final { scan-assembler-not {\mmfvsrwz\M} } } */

And again.

Okay for trunk with that fixed.  Also okay for 7 (after a delay).
Thanks!


Segher


Re: [PATCH, rs6000] gcc mainline, add builtin support for vec_bperm(), vec_mule() and vec_mulo and vec_sldw() builtins

2017-05-15 Thread Segher Boessenkool
Hi Carl,

On Mon, May 15, 2017 at 01:08:03PM -0700, Carl E. Love wrote:
>* config/rs6000/rs6000-c: Add support for built-in functions
>* config/rs6000/rs6000-builtin.def: Add definition for SLDW.
>* config/rs6000/altivec.h: Add definition for vec_sldw.
>* doc/extend.texi: Update the built-in documentation for the
>  new built-in functions.

That last line should not be indented.

>* gcc.target/powerpc/builtins-3.c: New vec_mule, vec_mulo test cases.
>* gcc.target/powerpc/builtins-3-p8.c: Add tests for the new Power 8
>  built-ins to the test suite file.  Note, support for mradds exists
>  but no test case exists.
>* gcc.target/powerpc/builtins-3-p9.c: Add tests for the new Power  9
>  built-ins to the test suite file.

Same for all lines without * here.  And you have two spaces before 9.

Patch itself seems fine, thanks!


Segher


Re: [PATCH] lto-wrapper.c (copy_file): Fix resource leaks

2017-05-15 Thread Jeff Law

On 05/14/2017 04:00 AM, Sylvestre Ledru wrote:

Add missing fclose
CID 1407987, 1407986

S



0005-2017-05-14-Sylvestre-Ledru-sylvestre-debian.org.patch


 From d255827a64012fb81937d6baa8534eabecf9b735 Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru
Date: Sun, 14 May 2017 11:37:37 +0200
Subject: [PATCH 5/5] 2017-05-14  Sylvestre Ledru

* lto-wrapper.c (copy_file): Fix resource leaks
   CID 1407987, 1407986

Doesn't this still leak in the cases where we call fatal_error?
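
One common way to avoid that class of leak is to route every exit through
a single cleanup label.  The following is only a minimal sketch under that
assumption: copy_file_checked is a hypothetical helper, not the code in
lto-wrapper.c, and it returns an error code where the real function calls
fatal_error.

```c
#include <stdio.h>

/* Hypothetical helper, not the lto-wrapper.c implementation: copy SRC to
   DEST and close both streams on every return path.  Returns 0 on
   success and -1 on any error.  */
static int
copy_file_checked (const char *dest, const char *src)
{
  FILE *in = NULL, *out = NULL;
  char buf[512];
  size_t n;
  int ret = -1;

  in = fopen (src, "rb");
  if (in == NULL)
    goto done;
  out = fopen (dest, "wb");
  if (out == NULL)
    goto done;

  /* Copy in fixed-size chunks; a short write is treated as an error.  */
  while ((n = fread (buf, 1, sizeof buf, in)) > 0)
    if (fwrite (buf, 1, n, out) != n)
      goto done;

  /* fread returns 0 both at end of file and on error; tell them apart.  */
  if (!ferror (in))
    ret = 0;

done:
  /* fclose on the output stream can report a deferred write error.  */
  if (out != NULL && fclose (out) != 0)
    ret = -1;
  if (in != NULL)
    fclose (in);
  return ret;
}
```

A caller that wants the fatal_error behaviour could test the return value
and report the failure only after both streams have been closed.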

Jeff


Re: [PATCH] plugin.c (try_init_one_plugin): Fix ressource leaks (CID 726637)

2017-05-15 Thread Jeff Law

On 05/14/2017 04:40 AM, Trevor Saunders wrote:

On Sun, May 14, 2017 at 11:59:40AM +0200, Sylvestre Ledru wrote:

Add missing dlclose()

S





 From d0926b84047f281a29dc51bbd0a4bdda01a5c63f Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru 
Date: Sun, 14 May 2017 11:28:38 +0200
Subject: [PATCH 4/5] 2017-05-14  Sylvestre Ledru  

* plugin.c (try_init_one_plugin): Fix ressource leaks (CID 726637)
---
  gcc/plugin.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/gcc/plugin.c b/gcc/plugin.c
index cfd6ef25036..903a197b78b 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -617,6 +617,7 @@ try_init_one_plugin (struct plugin_name_args *plugin)
  
if ((err = dlerror ()) != NULL)

  {
+  dlclose(dl_handle);
error ("cannot find %s in plugin %s\n%s", str_plugin_init_func_name,
   plugin->full_name, err);
return false;
@@ -625,10 +626,12 @@ try_init_one_plugin (struct plugin_name_args *plugin)
/* Call the plugin-provided initialization routine with the arguments.  */
if ((*plugin_init) (plugin, &gcc_version))
  {
+  dlclose(dl_handle);


These seem like unimportant, but real leaks so they seem correct.


error ("fail to initialize plugin %s", plugin->full_name);
return false;
  }
  
+  dlclose(dl_handle);


Does this part pass the plugin tests? because it seems suspicious, if
the plugin's init function registered any callbacks which it almost
certainly did, then we'd be holding function pointers into the plugin
after we dlclosed our only reference to it.  We don't need to call any
more functions with the handle, but I think we want to morally leak it
here to ensure the plugin is loaded for the entire run of the compiler.
One might argue this deserves a comment.  Whether or not to add markup 
to silence Coverity is another issue :-)
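
The distinction drawn above (close the handle on the failure paths, keep
it on success) can be sketched as follows.  This is an illustration only:
load_and_init and init_fn are hypothetical names, the init signature is
simplified, and none of it is the actual try_init_one_plugin code.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical, simplified init signature: returns 0 on success.  */
typedef int (*init_fn) (void);

/* Hypothetical loader, not GCC's try_init_one_plugin.  Returns the
   handle on success and NULL on failure.  */
static void *
load_and_init (const char *path, const char *init_name)
{
  void *handle = dlopen (path, RTLD_NOW | RTLD_GLOBAL);
  if (handle == NULL)
    {
      fprintf (stderr, "cannot load plugin: %s\n", dlerror ());
      return NULL;
    }

  dlerror ();  /* Clear any stale error state before dlsym.  */
  init_fn init = (init_fn) dlsym (handle, init_name);
  const char *err = dlerror ();
  if (err != NULL)
    {
      /* Report first, then close: nothing references the library yet.  */
      fprintf (stderr, "cannot find %s: %s\n", init_name, err);
      dlclose (handle);
      return NULL;
    }

  if (init () != 0)
    {
      /* Init failed, so no callbacks are assumed to be registered.  */
      dlclose (handle);
      return NULL;
    }

  /* Success: deliberately keep the handle open.  The init function may
     have registered callbacks that point into the library, so it must
     stay mapped for the rest of the process lifetime.  */
  return handle;
}
```

On older glibc versions this needs to be linked with -ldl.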


jeff


Re: [PATCH] plugin.c (try_init_one_plugin): Fix ressource leaks (CID 726637)

2017-05-15 Thread Jeff Law

On 05/14/2017 09:30 AM, Sylvestre Ledru wrote:


On 14/05/2017 at 12:40, Trevor Saunders wrote:

On Sun, May 14, 2017 at 11:59:40AM +0200, Sylvestre Ledru wrote:

Add missing dlclose()

S


 From d0926b84047f281a29dc51bbd0a4bdda01a5c63f Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru
Date: Sun, 14 May 2017 11:28:38 +0200
Subject: [PATCH 4/5] 2017-05-14  Sylvestre Ledru

* plugin.c (try_init_one_plugin): Fix ressource leaks (CID 726637)
---
  gcc/plugin.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/gcc/plugin.c b/gcc/plugin.c
index cfd6ef25036..903a197b78b 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -617,6 +617,7 @@ try_init_one_plugin (struct plugin_name_args *plugin)
  
if ((err = dlerror ()) != NULL)

  {
+  dlclose(dl_handle);
error ("cannot find %s in plugin %s\n%s", str_plugin_init_func_name,
   plugin->full_name, err);
return false;
@@ -625,10 +626,12 @@ try_init_one_plugin (struct plugin_name_args *plugin)
/* Call the plugin-provided initialization routine with the arguments.  */
if ((*plugin_init) (plugin, &gcc_version))
  {
+  dlclose(dl_handle);

These seem like unimportant, but real leaks so they seem correct.


error ("fail to initialize plugin %s", plugin->full_name);
return false;
  }
  
+  dlclose(dl_handle);

Does this part pass the plugin tests? because it seems suspicious, if
the plugin's init function registered any callbacks which it almost
certainly did, then we'd be holding function pointers into the plugin
after we dlclosed our only reference to it.  We don't need to call any
more functions with the handle, but I think we want to morally leak it
here to ensure the plugin is loaded for the entire run of the compiler.


Indeed, it is a false positive; I have marked it as such in the Coverity interface.
New patch attached

S


0001-2017-05-14-Sylvestre-Ledru-sylvestre-debian.org.patch


 From 08f3fb989f6b6ee56e1d4d9674e743dd563a0904 Mon Sep 17 00:00:00 2001
From: Sylvestre Ledru
Date: Sun, 14 May 2017 11:28:38 +0200
Subject: [PATCH 1/2] 2017-05-14  Sylvestre Ledru

* plugin.c (try_init_one_plugin): Fix ressource leaks (CID 726637)

The second version is fine.

jeff

