Re: [PATCH][v4] GIMPLE store merging pass

2016-09-30 Thread Richard Biener
On Thu, 29 Sep 2016, Kyrill Tkachov wrote:

> Hi Richard,
> 
> Thanks for the detailed comments, I'll be working on addressing them.
> Below are answers to some of your questions:
> 
> On 29/09/16 11:45, Richard Biener wrote:
> > 
> > +
> > +  /* If we're inserting a non-bytesized width or not at a byte boundary
> > + use an intermediate wide_int to perform the bit-insertion correctly.
> > */
> > +  if (sub_byte_op_p
> > +  || (TREE_CODE (expr) == INTEGER_CST
> > + && mode_for_size (bitlen, MODE_INT, 0) == BLKmode))
> > I wonder when we have BLKmode here ...
> > 
> > > +{
> > > +  unsigned int byte_size = last_byte - first_byte + 1;
> > > +
> > > +  /* The functions native_encode_expr/native_interpret_expr uses the
> > > +  TYPE_MODE of the type to determine the number of bytes to write/read
> > > +  so if we want to process a number of bytes that does not have a
> > > +  TYPE_MODE of equal size we need to use a type that has a valid mode
> > > +  for it.  */
> > > +
> > > +  machine_mode mode
> > > + = smallest_mode_for_size (byte_size * BITS_PER_UNIT, MODE_INT);
> > > +  tree dest_int_type
> > > + = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), UNSIGNED);
> > > +  byte_size = GET_MODE_SIZE (mode);
> > ... how we ever get non-BLKmode here.
> 
> smallest_mode_for_size is guaranteed to never return BLKmode.
> It returns a mode that contains at least the requested number of bits.
> mode_for_size returns BLKmode if no mode fits the exact number of bits.

But then it's really about GET_MODE_SIZE (TYPE_MODE (expr)) vs.
TYPE_PRECISION (expr).  I see no need to invoke [smallest_]mode_for_size.
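
The difference between the two lookups discussed above can be modelled with a small toy in C. This is only a sketch under stated assumptions: the mode table, the `toy_` names and the use of 0 as a stand-in for BLKmode are all hypothetical and are not GCC's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical integer mode widths, in bits, for a toy target.  */
static const unsigned toy_int_modes[] = { 8, 16, 32, 64, 128 };
#define TOY_NUM_MODES (sizeof toy_int_modes / sizeof toy_int_modes[0])

/* Toy stand-in for BLKmode, meaning "no integer mode".  */
#define TOY_BLKMODE 0u

/* Models the behaviour described for mode_for_size: an exact match
   or TOY_BLKMODE.  */
static unsigned
toy_mode_for_size (unsigned bits)
{
  for (size_t i = 0; i < TOY_NUM_MODES; i++)
    if (toy_int_modes[i] == bits)
      return toy_int_modes[i];
  return TOY_BLKMODE;
}

/* Models the behaviour described for smallest_mode_for_size: the
   narrowest mode holding at least BITS bits.  The caller must not
   ask for more than the widest toy mode.  */
static unsigned
toy_smallest_mode_for_size (unsigned bits)
{
  assert (bits <= toy_int_modes[TOY_NUM_MODES - 1]);
  for (size_t i = 0; i < TOY_NUM_MODES; i++)
    if (toy_int_modes[i] >= bits)
      return toy_int_modes[i];
  return TOY_BLKMODE; /* Not reached while the precondition holds.  */
}
```

For a 24-bit request the exact lookup yields the BLKmode stand-in, while the "smallest" lookup rounds up to 32 bits.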

> > 
> > > +  /* The region from the byte array that we're inserting into.  */
> > > +  tree ptr_wide_int
> > > + = native_interpret_expr (dest_int_type, ptr + first_byte,
> > > +  total_bytes);
> > > +
> > > +  gcc_assert (ptr_wide_int);
> > > +  wide_int dest_wide_int
> > > + = wi::to_wide (ptr_wide_int, TYPE_PRECISION (dest_int_type));
> > > +  wide_int expr_wide_int
> > > + = wi::to_wide (tmp_int, byte_size * BITS_PER_UNIT);
> > > +  if (BYTES_BIG_ENDIAN)
> > > + {
> > > +   unsigned int insert_pos
> > > + = byte_size * BITS_PER_UNIT - bitlen - (bitpos % BITS_PER_UNIT);
> > > +   dest_wide_int
> > > + = wi::insert (dest_wide_int, expr_wide_int, insert_pos, bitlen);
> > > + }
> > > +  else
> > > + dest_wide_int = wi::insert (dest_wide_int, expr_wide_int,
> > > + bitpos % BITS_PER_UNIT, bitlen);
> > > +
> > > +  tree res = wide_int_to_tree (dest_int_type, dest_wide_int);
> > > +  native_encode_expr (res, ptr + first_byte, total_bytes, 0);
> > > +
> > OTOH this whole dance looks as complicated and way more expensive than
> > using native_encode_expr into a temporary buffer and then a
> > manually implemented "bit-merging" of it at ptr + first_byte + bitpos.
> > AFAICS that operation is even endianness agnostic.
> 
> I did try implementing that, but it kept blowing up in big-endian.

I'm really curious how ;)  And I'm willing to help with debugging in
case you have something basic working.  I suggest working at byte
granularity here (bytes have no endianness!).
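
A byte-granular merge of the kind suggested here could look roughly like the following C sketch. It is not the code from the patch. The function name `toy_merge_bits` is hypothetical, and the bit numbering (bit 0 is the least significant bit of byte 0) is an assumption of this example; a target with a different bit-field layout would need a different mapping.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Copy BITLEN bits from SRC (starting at its bit 0) into DST starting
   at bit BITPOS, leaving all other bits of DST untouched.  Bit I of a
   buffer is bit (I % CHAR_BIT) of byte (I / CHAR_BIT); that numbering
   is an assumption of this sketch.  */
static void
toy_merge_bits (unsigned char *dst, size_t bitpos,
                const unsigned char *src, size_t bitlen)
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < bitlen; i++)
    {
      unsigned bit = (src[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u;
      size_t d = bitpos + i;
      unsigned char mask = (unsigned char) (1u << (d % CHAR_BIT));
      if (bit)
        dst[d / CHAR_BIT] |= mask;
      else
        dst[d / CHAR_BIT] &= (unsigned char) ~mask;
    }
}
```

The loop works one bit at a time for clarity; a real implementation would more likely mask and shift whole bytes.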

> I found it to be very fiddly. I can try again, but we'll see how it goes...
> 
> 
> 
> > > +
> > > +bool
> > > +pass_store_merging::terminate_and_process_all_chains (basic_block bb)
> > > +{
> > > +  hash_map::iterator iter
> > > +    = m_stores.begin ();
> > > +  bool ret = false;
> > > +  for (; iter != m_stores.end (); ++iter)
> > > +ret |= terminate_and_release_chain ((*iter).first);
> > > +
> > > +  if (ret)
> > > +gimple_purge_dead_eh_edges (bb);
> > Why do you need this?  I don't see how merging stores should affect EH
> > edges at all -- unless you are removing EH side-effects (ISTR you
> > are not merging cross-BB).
> 
> I was seeing a testsuite failure in the C++ testsuite,
> an ICE about EH edges. However, when I rerun the testsuite today
> without the gimple_purge_dead_eh_edges call I don't see any fallout...
> I have been rebasing the patch against trunk regularly so maybe something
> changed in the underlying trunk since then...

I'd doubt that.  I don't see any check for whether a store may throw
internally so that missing check certainly would explain things.
You'd need -fnon-call-exceptions, but you'd also need a setup that is
tricky to construct, where one store of a group is marked as
not throwing and another as throwing.

I'd say throw some stmt_can_throw_internal in and mark such stores as
invalid and remove the edge purging.  That should do the trick.

> 
> 
> > +
> > +bool
> > +imm_store_chain_info::coalesce_immediate_stores ()
> > +{
> > +  /* Anything less can't be processed.  */
> > +  if (m_store_info.length () < 2)
> > +return false;
> > +
> > +  if (dump_file && (dump_flags & TDF_DETAILS))
> > +fprintf (dump_file, "Attempting to coalesce %u stores in chain.\n",
> > +m_store

Re: [PATCH] Fix (part of) PR77399

2016-09-30 Thread Richard Biener
On Wed, 28 Sep 2016, Richard Biener wrote:

> 
> This fixes the original request in PR77399, better handling of
> 
> typedef int   v4si __attribute__((vector_size(16)));
> typedef float v4sf __attribute__((vector_size(16)));
> v4sf vec_cast(v4si f)
> {
>   return (v4sf){f[0], f[1], f[2], f[3]};
> }
> 
> which nicely fits into the existing simplify_vector_constructor code.
> 
> Bootstrap / regtest pending on x86_64-unknown-linux-gnu.

This is the variant I applied (with some fixed issues).

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2016-09-30  Richard Biener  

PR tree-optimization/77399
* tree-ssa-forwprop.c (simplify_vector_constructor): Handle
float <-> int conversions.

* gcc.dg/tree-ssa/forwprop-35.c: New testcase.

Index: gcc/tree-ssa-forwprop.c
===
*** gcc/tree-ssa-forwprop.c (revision 240612)
--- gcc/tree-ssa-forwprop.c (working copy)
*** simplify_vector_constructor (gimple_stmt
*** 1953,1959 
gimple *def_stmt;
tree op, op2, orig, type, elem_type;
unsigned elem_size, nelts, i;
!   enum tree_code code;
constructor_elt *elt;
unsigned char *sel;
bool maybe_ident;
--- 1953,1959 
gimple *def_stmt;
tree op, op2, orig, type, elem_type;
unsigned elem_size, nelts, i;
!   enum tree_code code, conv_code;
constructor_elt *elt;
unsigned char *sel;
bool maybe_ident;
*** simplify_vector_constructor (gimple_stmt
*** 1970,1975 
--- 1970,1976 
  
sel = XALLOCAVEC (unsigned char, nelts);
orig = NULL;
+   conv_code = ERROR_MARK;
maybe_ident = true;
FOR_EACH_VEC_SAFE_ELT (CONSTRUCTOR_ELTS (op), i, elt)
  {
*** simplify_vector_constructor (gimple_stmt
*** 1984,1989 
--- 1985,2010 
if (!def_stmt)
return false;
code = gimple_assign_rhs_code (def_stmt);
+   if (code == FLOAT_EXPR
+ || code == FIX_TRUNC_EXPR)
+   {
+ op1 = gimple_assign_rhs1 (def_stmt);
+ if (conv_code == ERROR_MARK)
+   {
+ if (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (elt->value)))
+ != GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op1))))
+   return false;
+ conv_code = code;
+   }
+ else if (conv_code != code)
+   return false;
+ if (TREE_CODE (op1) != SSA_NAME)
+   return false;
+ def_stmt = SSA_NAME_DEF_STMT (op1);
+ if (! is_gimple_assign (def_stmt))
+   return false;
+ code = gimple_assign_rhs_code (def_stmt);
+   }
if (code != BIT_FIELD_REF)
return false;
op1 = gimple_assign_rhs1 (def_stmt);
*** simplify_vector_constructor (gimple_stmt
*** 1997,2003 
{
  if (TREE_CODE (ref) != SSA_NAME)
return false;
! if (!useless_type_conversion_p (type, TREE_TYPE (ref)))
return false;
  orig = ref;
}
--- 2018,2026 
{
  if (TREE_CODE (ref) != SSA_NAME)
return false;
! if (! VECTOR_TYPE_P (TREE_TYPE (ref))
! || ! useless_type_conversion_p (TREE_TYPE (op1),
! TREE_TYPE (TREE_TYPE (ref))))
return false;
  orig = ref;
}
*** simplify_vector_constructor (gimple_stmt
*** 2009,2016 
if (i < nelts)
  return false;
  
if (maybe_ident)
! gimple_assign_set_rhs_from_tree (gsi, orig);
else
  {
tree mask_type, *mask_elts;
--- 2032,2050 
if (i < nelts)
  return false;
  
+   if (! VECTOR_TYPE_P (TREE_TYPE (orig))
+   || (TYPE_VECTOR_SUBPARTS (type)
+ != TYPE_VECTOR_SUBPARTS (TREE_TYPE (orig))))
+ return false;
+ 
if (maybe_ident)
! {
!   if (conv_code == ERROR_MARK)
!   gimple_assign_set_rhs_from_tree (gsi, orig);
!   else
!   gimple_assign_set_rhs_with_ops (gsi, conv_code, orig,
!   NULL_TREE, NULL_TREE);
! }
else
  {
tree mask_type, *mask_elts;
*** simplify_vector_constructor (gimple_stmt
*** 2028,2034 
for (i = 0; i < nelts; i++)
mask_elts[i] = build_int_cst (TREE_TYPE (mask_type), sel[i]);
op2 = build_vector (mask_type, mask_elts);
!   gimple_assign_set_rhs_with_ops (gsi, VEC_PERM_EXPR, orig, orig, op2);
  }
update_stmt (gsi_stmt (*gsi));
return true;
--- 2062,2079 
for (i = 0; i < nelts; i++)
mask_elts[i] = build_int_cst (TREE_TYPE (mask_type), sel[i]);
op2 = build_vector (mask_type, mask_elts);
!   if (conv_code == ERROR_MARK)
!   gimple_assign_set_rhs_with_ops (gsi, VEC_PERM_EXPR, orig, orig, op2);
!   else
!   {
! gimple *perm
!   = gimple_build_assign (make_ssa_name (TREE_TYPE (orig)),
!  VEC_PERM_EXPR, orig, orig, op2);
!   

[PATCH] Improve VRP range intersection

2016-09-30 Thread Richard Biener

I noticed we intersect ~[a_1, a_1] and [2, 2] to ~[a_1, a_1].  While
we don't generally want to choose an integral range, a singleton integral
range is always preferable.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-09-30  Richard Biener  

* tree-vrp.c (intersect_ranges): If we failed to handle
the intersection choose a constant singleton range if available.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 240645)
+++ gcc/tree-vrp.c  (working copy)
@@ -8555,7 +8555,16 @@ intersect_ranges (enum value_range_type
 
   /* As a fallback simply use { *VRTYPE, *VR0MIN, *VR0MAX } as
  result for the intersection.  That's always a conservative
- correct estimate.  */
+ correct estimate unless VR1 is a constant singleton range
+ in which case we choose that.  */
+  if (vr1type == VR_RANGE
+  && is_gimple_min_invariant (vr1min)
+  && vrp_operand_equal_p (vr1min, vr1max))
+{
+  *vr0type = vr1type;
+  *vr0min = vr1min;
+  *vr0max = vr1max;
+}
 
   return;
 }


Re: Explicitly list all tree codes in gcc/tree-streamer.c:record_common_node (was: [PR lto/77458] Avoid ICE in offloading with differing _FloatN, _FloatNx types)

2016-09-30 Thread Richard Biener
On Thu, Sep 29, 2016 at 4:48 PM, Thomas Schwinge
 wrote:
> Hi Richard!
>
> On Mon, 19 Sep 2016 13:25:01 +0200, Richard Biener 
>  wrote:
>> On Mon, Sep 19, 2016 at 1:19 PM, Thomas Schwinge
>>  wrote:
>> > On Mon, 19 Sep 2016 10:18:35 +0200, Richard Biener 
>> >  wrote:
>> >> On Fri, Sep 16, 2016 at 3:32 PM, Thomas Schwinge
>> >>  wrote:
>> >> > --- gcc/tree-streamer.c
>> >> > +++ gcc/tree-streamer.c
>> >> > @@ -278,9 +278,23 @@ record_common_node (struct streamer_tree_cache_d 
>> >> > *cache, tree node)
>> >> >streamer_tree_cache_append (cache, node, cache->nodes.length ());
>> >> >
>> >> >if (POINTER_TYPE_P (node)
>> >> > -  || TREE_CODE (node) == COMPLEX_TYPE
>> >> >|| TREE_CODE (node) == ARRAY_TYPE)
>> >> >  record_common_node (cache, TREE_TYPE (node));
>> >> > +  else if (TREE_CODE (node) == COMPLEX_TYPE)
>> >> > [...]
>> >> >else if (TREE_CODE (node) == RECORD_TYPE)
>
>> [looks to me we miss handling of vector type components altogether,
>> maybe there are no global vector type trees ...]
>
> Looks like it, yes.  Would a patch like the following be reasonable,
> which explicitly lists/handles all expected tree codes, or is something
> like that not feasible?  (That's a subset of tree codes I gathered by a
> partial run of the GCC testsuite, and libgomp testsuite; not claiming
> this is complete.)

I think it would be a nice thing to have indeed.

So -- I'm inclined to approve this patch ;)
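
The shape of the quoted patch, an explicit switch over every expected code with a hard failure for anything else, can be sketched in plain C as follows. The `toy_` node type and codes are hypothetical and much simpler than GCC trees; `abort ()` stands in for gcc_unreachable ().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum toy_code
{
  TOY_INTEGER_TYPE, TOY_POINTER_TYPE, TOY_ARRAY_TYPE,
  TOY_RECORD_TYPE, TOY_FIELD_DECL,
  TOY_VECTOR_TYPE /* Deliberately not handled below.  */
};

struct toy_node
{
  enum toy_code code;
  struct toy_node *type;   /* Pointed-to or element type.  */
  struct toy_node *fields; /* First field of a record.  */
  struct toy_node *chain;  /* Next field.  */
};

/* Record NODE and, depending on its code, the nodes it refers to.
   *RECORDED counts how many nodes were recorded.  */
static void
toy_record_common_node (const struct toy_node *node, unsigned *recorded)
{
  assert (node != NULL && recorded != NULL);
  ++*recorded;
  switch (node->code)
    {
    case TOY_INTEGER_TYPE:
    case TOY_FIELD_DECL:
      /* No recursion.  */
      break;
    case TOY_POINTER_TYPE:
    case TOY_ARRAY_TYPE:
      toy_record_common_node (node->type, recorded);
      break;
    case TOY_RECORD_TYPE:
      for (const struct toy_node *f = node->fields; f; f = f->chain)
        toy_record_common_node (f, recorded);
      break;
    default:
      /* An unexpected code is a bug in the caller, not a silent no-op.  */
      abort ();
    }
}
```

The design point is that a new, unlisted code (such as the vector case mentioned above) fails loudly instead of being skipped.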

Thanks,
Richard.

> commit f28dd9618be8a26c6a75ee089f1755e4e0281106
> Author: Thomas Schwinge 
> Date:   Thu Sep 29 16:35:19 2016 +0200
>
> Explicitly list all tree codes in gcc/tree-streamer.c:record_common_node
> ---
>  gcc/tree-streamer.c | 32 +---
>  1 file changed, 25 insertions(+), 7 deletions(-)
>
> diff --git gcc/tree-streamer.c gcc/tree-streamer.c
> index 6ada89a..8567a81 100644
> --- gcc/tree-streamer.c
> +++ gcc/tree-streamer.c
> @@ -303,17 +303,32 @@ record_common_node (struct streamer_tree_cache_d 
> *cache, tree node)
>   in the cache as hash value.  */
>streamer_tree_cache_append (cache, node, cache->nodes.length ());
>
> -  if (POINTER_TYPE_P (node)
> -  || TREE_CODE (node) == ARRAY_TYPE)
> -record_common_node (cache, TREE_TYPE (node));
> -  else if (TREE_CODE (node) == COMPLEX_TYPE)
> +  switch (TREE_CODE (node))
>  {
> +case ERROR_MARK:
> +case FIELD_DECL:
> +case FIXED_POINT_TYPE:
> +case IDENTIFIER_NODE:
> +case INTEGER_CST:
> +case INTEGER_TYPE:
> +case POINTER_BOUNDS_TYPE:
> +case REAL_TYPE:
> +case TREE_LIST:
> +case VOID_CST:
> +case VOID_TYPE:
> +  /* No recursion.  */
> +  break;
> +case POINTER_TYPE:
> +case REFERENCE_TYPE:
> +case ARRAY_TYPE:
> +  record_common_node (cache, TREE_TYPE (node));
> +  break;
> +case COMPLEX_TYPE:
>/* Verify that a complex type's component type (node_type) has been
>  handled already (and we thus don't need to recurse here).  */
>verify_common_node_recorded (cache, TREE_TYPE (node));
> -}
> -  else if (TREE_CODE (node) == RECORD_TYPE)
> -{
> +  break;
> +case RECORD_TYPE:
>/* The FIELD_DECLs of structures should be shared, so that every
>  COMPONENT_REF uses the same tree node when referencing a field.
>  Pointer equality between FIELD_DECLs is used by the alias
> @@ -322,6 +337,9 @@ record_common_node (struct streamer_tree_cache_d *cache, 
> tree node)
>  nonoverlapping_component_refs_of_decl_p).  */
>for (tree f = TYPE_FIELDS (node); f; f = TREE_CHAIN (f))
> record_common_node (cache, f);
> +  break;
> +default:
> +  gcc_unreachable ();
>  }
>  }
>
>
>
> Grüße
>  Thomas


[PATCH] Reduce stack usage in sha512 (PR target/77308)

2016-09-30 Thread Bernd Edlinger
Hi,

this patch mitigates the excessive stack usage on arm in code
that does lots of int64 shift ops like sha512.

It reduces the stack usage in that example from 4K to 2K while
less than 0.5K would be expected.

In all cases the additional set instructions are optimized later
so that this caused no code size increase, but just made
LRA's job a bit easier.

It certainly does not solve the problem completely, but it at least
improves the stability in an area that I'd call security relevant.
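
For readers unfamiliar with what a 64-bit shift turns into on a 32-bit core, here is a C analogue of the two constant-shift cases (amount below 32, amount above 31) for a logical right shift. It is a sketch only: `toy_pair` and `toy_lshiftrt64` are hypothetical names, and the code models the arithmetic, not the RTL that arm_emit_coreregs_64bit_shift emits or its effect on register allocation.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on a 32-bit core.  */
struct toy_pair { uint32_t up; uint32_t down; };

/* Logical right shift of IN by a constant AMOUNT in the range 1..63.  */
static struct toy_pair
toy_lshiftrt64 (struct toy_pair in, unsigned amount)
{
  struct toy_pair out;
  assert (amount >= 1 && amount <= 63);
  if (amount < 32)
    {
      /* Low half: its own shifted bits plus the bits that fall out
         of the high half.  */
      out.down = (in.down >> amount) | (in.up << (32 - amount));
      out.up = in.up >> amount;
    }
  else
    {
      /* Only bits from the high half survive; the high half is zero.  */
      out.down = in.up >> (amount - 32);
      out.up = 0;
    }
  return out;
}
```

Every such shift touches both halves of the value, which is why code with many int64 shifts puts pressure on the register allocator.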


Boot-strapped and reg-tested on arm-linux-gnueabihf.
Is it OK for trunk?


Thanks
Bernd.

2016-09-29  Bernd Edlinger

	PR target/77308
	* config/arm/arm.c (arm_emit_coreregs_64bit_shift): Clear the result
	register explicitly.

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	(revision 239624)
+++ gcc/config/arm/arm.c	(working copy)
@@ -29159,6 +29159,7 @@ arm_emit_coreregs_64bit_shift (enum rtx_code code,
 	  /* Shifts by a constant less than 32.  */
 	  rtx reverse_amount = GEN_INT (32 - INTVAL (amount));
 
+	  emit_insn (SET (out, const0_rtx));
 	  emit_insn (SET (out_down, LSHIFT (code, in_down, amount)));
 	  emit_insn (SET (out_down,
 			  ORR (REV_LSHIFT (code, in_up, reverse_amount),
@@ -29170,12 +29171,11 @@ arm_emit_coreregs_64bit_shift (enum rtx_code code,
 	  /* Shifts by a constant greater than 31.  */
 	  rtx adj_amount = GEN_INT (INTVAL (amount) - 32);
 
+	  emit_insn (SET (out, const0_rtx));
 	  emit_insn (SET (out_down, SHIFT (code, in_up, adj_amount)));
 	  if (code == ASHIFTRT)
 	emit_insn (gen_ashrsi3 (out_up, in_up,
 GEN_INT (31)));
-	  else
-	emit_insn (SET (out_up, const0_rtx));
 	}
 }
   else


Re: Implement P0001R1 - C++17 removal of register storage class specifier

2016-09-30 Thread Jakub Jelinek
Hi!

On Thu, Sep 29, 2016 at 10:57:07PM +, Joseph Myers wrote:
> This is missing documentation of the new -Wregister option in invoke.texi.

While I had it in my head when working on the patch, I forgot to do that in
the end.
Fixed thusly, ok for trunk?

2016-09-30  Jakub Jelinek  

* doc/invoke.texi (-Wregister): Document.

--- gcc/doc/invoke.texi.jj  2016-09-29 22:53:11.0 +0200
+++ gcc/doc/invoke.texi 2016-09-30 09:55:28.819581224 +0200
@@ -213,7 +213,7 @@ in the following sections.
 -Wabi=@var{n}  -Wabi-tag  -Wconversion-null  -Wctor-dtor-privacy @gol
 -Wdelete-non-virtual-dtor -Wliteral-suffix -Wmultiple-inheritance @gol
 -Wnamespaces -Wnarrowing @gol
--Wnoexcept -Wnon-virtual-dtor  -Wreorder @gol
+-Wnoexcept -Wnon-virtual-dtor  -Wreorder -Wregister @gol
 -Weffc++  -Wstrict-null-sentinel -Wtemplates @gol
 -Wno-non-template-friend  -Wold-style-cast @gol
 -Woverloaded-virtual  -Wno-pmf-conversions @gol
@@ -2840,6 +2840,15 @@ case it is possible but unsafe to delete
 class through a pointer to the class itself or base class.  This
 warning is automatically enabled if @option{-Weffc++} is specified.
 
+@item -Wregister @r{(C++ and Objective-C++ only)}
+@opindex Wregister
+@opindex Wno-register
+Warn on uses of the @code{register} storage class specifier, except
+when it is part of the GNU @ref{Explicit Register Variables} extension.
+The use of the @code{register} keyword as storage class specifier has
+been deprecated in C++11 and removed in C++17.
+Enabled by default with @option{-std=c++1z}.
+
 @item -Wreorder @r{(C++ and Objective-C++ only)}
 @opindex Wreorder
 @opindex Wno-reorder


Jakub


[PATCH, DOC] Enhance document of asan-use-after-return param.

2016-09-30 Thread Martin Liška
Hello.

Even though we enable the asan-use-after-return parameter by default (when
-fsanitize=address is selected), the runtime does not check for use after
return by default. I think this is worth documenting.

It's quite similar to the -fsanitize=recover and halt_on_error=0 situation:
one has to enable both to really get the requested behavior.

Ready to be installed?
Thanks,
Martin
From 165a90fb7a8a91e9196f641ff644b38c4e1b7f94 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 30 Sep 2016 10:07:17 +0200
Subject: [PATCH] Enhance document of asan-use-after-return param.

gcc/ChangeLog:

2016-09-30  Martin Liska  

	* doc/invoke.texi: Document asan-use-after-return that
	it's disabled by default in runtime.
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8a84e4f..0121560 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10034,6 +10034,10 @@ is enabled by default when using @option{-fsanitize=address} option.
 To disable use-after-return detection use 
 @option{--param asan-use-after-return=0}.
 
+Note: The check is disabled by default at runtime.  To enable the check,
+you should set environment variable @env{ASAN_OPTIONS} to
+@code{detect_stack_use_after_return=1}.
+
 @item asan-instrumentation-with-call-threshold
 If number of memory accesses in function being instrumented
 is greater or equal to this number, use callbacks instead of inline checks.
-- 
2.9.2



Re: [PATCH, DOC] Enhance document of asan-use-after-return param.

2016-09-30 Thread Jakub Jelinek
On Fri, Sep 30, 2016 at 10:17:45AM +0200, Martin Liška wrote:
> Even though we enable by default asan-use-after-return parameter (when 
> -fsanitize=address is selected),
> the runtime does not check use after return by default. I would consider it 
> useful to document.
> 
> It's quite similar to -fsanitize=recover and halt_on_error=0 situation: one 
> has to enable both
> to really receive requested behavior.
> 
> Ready to be installed?
> Thanks,
> Martin

> From 165a90fb7a8a91e9196f641ff644b38c4e1b7f94 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Fri, 30 Sep 2016 10:07:17 +0200
> Subject: [PATCH] Enhance document of asan-use-after-return param.
> 
> gcc/ChangeLog:
> 
> 2016-09-30  Martin Liska  
> 
>   * doc/invoke.texi: Document asan-use-after-return that
>   it's disabled by default in runtime.

Ok, thanks.

Jakub


RE: [PATCH] Disable compact casesi patterns for arcv2

2016-09-30 Thread Claudiu Zissulescu
 
> I wonder if we should warn for the TARGET_V2 case?  Currently if the
> option is supplied on an ARCv2 (-mcompact-casesi) then the option is
> silently ignored.  This might confuse some users.

Good idea, I will update the docs accordingly.

> 
> In the non TARGET_V2 case I notice that the option is _always_
> enabled, with no option of disabling the option.  If we add a check of
> global_options_set then we can make this smarter, default on, but can
> still be turned off if a user ever wants to.  The alternative would be
> to entirely remove the TARGET_COMPACT_CASESI flag altogether?

I would prefer to remove the compact_casesi feature entirely, as it is more of
a troublemaker than a helper.

> 
> While I was thinking about this I wrote the code below, it probably
> needs polishing, but gives an idea of what I have in mind.  What do
> you think?

It looks good; I will add the docs update and get back to you ASAP.
Claudiu


Re: [PATCH] Reduce stack usage in sha512 (PR target/77308)

2016-09-30 Thread Richard Biener
On Fri, Sep 30, 2016 at 9:48 AM, Bernd Edlinger
 wrote:
> Hi,
>
> this patch mitigates the excessive stack usage on arm in code
> that does lots of int64 shift ops like sha512.
>
> It reduces the stack usage in that example from 4K to 2K while
> less than 0.5K would be expected.
>
> In all cases the additional set instructions are optimized later
> so that this caused no code size increase, but just made
> LRA's job a bit easier.
>
> It certainly does not solve the problem completely, but it at least
> improves the stability in an area that I'd call security relevant.
>
>
> Boot-strapped and reg-tested on arm-linux-gnueabihf.
> Is it OK for trunk?

A comment before the SETs and a testcase would be nice.  IIRC
we do have stack size testcases via using -fstack-usage.

Richard.

>
> Thanks
> Bernd.


Re: [PATCH, Fortran] PR fortran/77782 - ICE in gfc_get_union_type

2016-09-30 Thread Andre Vehreschild
Hi Fritz,

looks OK to me.

- Andre

On Thu, 29 Sep 2016 10:03:58 -0400
Fritz Reese  wrote:

> ICE in [1] is due to an incomplete fix for PR fortran/77327 (r239819,
> see [2],[3]). Specifically in interface.c (gfc_compare_derived_types)
> I overlooked the case where FL_UNION type symbols could be compared as
> equal to FL_STRUCTURE type symbols, which is _never_ correct. The
> faulty logic causes a UNION to be considered equal to a STRUCTURE,
> thus the union receives the structure's backend declaration. Obviously
> everything goes haywire from there.
> 
> Attached is the [obvious] fix. Will commit to trunk soon, barring any
> concerns from others.
> 
> ---
> Fritz Reese
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77782
> [2] https://gcc.gnu.org/ml/fortran/2016-08/msg00144.html
> [3] https://gcc.gnu.org/ml/fortran/2016-08/msg00145.html
> 
> 2016-09-29  Fritz Reese  
> 
> Fix ICE caused by union types comparing equal to structure types.
> 
> PR fortran/77782
> * gcc/fortran/interface.c (gfc_compare_derived_types): Use
> gfc_compare_union_types to compare union types.
> 
> PR fortran/77782
> * gcc/testsuite/gfortran.dg/dec_structure_16.f90: New testcase.


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH, Fortran] PR fortran/77764 - ICE in is_anonymous_component

2016-09-30 Thread Andre Vehreschild
Hi Fritz,

just out of curiosity: can't a structure type be used for a class object? Like:

structure /T/
  integer :: something
end structure

class(T), allocatable :: foo

end

If the above *is* allowed, then I am missing some CLASS_DATA (...) in the code.
If not, everything is fine.

- Andre

On Thu, 29 Sep 2016 10:30:13 -0400
Fritz Reese  wrote:

> ICE in [1] is due to failure to null-guard map components in
> gfc_compare_union_types. Attached is [obvious] fix - will commit soon
> without complaints.
> 
> ---
> Fritz Reese
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77764
> 
> 2016-09-29  Fritz Reese  
> 
> Fix ICE for maps with zero components.
> 
> PR fortran/77764
> * gcc/fortran/interface.c (gfc_compare_union_types): Null-guard map
> components.
> 
> PR fortran/77764
> * gcc/testsuite/gfortran.dg/dec_union_8.f90: New testcase.


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


RE: [PATCH 2/2] [ARC] [libgcc] Fix defines

2016-09-30 Thread Claudiu Zissulescu
> > +   MPYHU   DBL0H,r12,DBL0H
> 
> Is there a reason that instruction should be uppercase?
> 

Yes, MPYHU is a macro which selects the right mnemonic depending on the CPU
you are compiling for (i.e., mpyhu for ARCv1 and mpymu for ARCv2); see
arc-ieee-754.h.

Thanks,
Claudiu


RE: [PATCH 1/2] [ARC] [libgcc] Add support for QuarkSE processor.

2016-09-30 Thread Claudiu Zissulescu

> There's significant whitespace changes in lib1funcs.S that's not
> mentioned in the ChangeLog, and is in parts of the file not touched by
> the actual functional changes.

Yes, there is a lot of trailing whitespace that does not comply with the GNU
coding standards. I'm trying to clean up those sloppy files as I review/change
them.

> I'd personally prefer to see the whitespace changes pushed as their
> own commit as they make (for me) the diff harder to read.

I'll make them a separate commit and push it as obvious, if you have no
objection.

> 
> Otherwise this all looks fine.
> 

This patch needs to be pushed after the main compiler QuarkSE functionality is
added. I will keep you updated on when that happens.

Thanks,
Claudiu


Re: [accaf, Fortran, patch, v1] Generate caf-reference chains only from the first coarray reference on, and more.

2016-09-30 Thread Paul Richard Thomas
Dear Andre,

Looks good to me - OK for trunk.

Thanks

Paul

On 29 September 2016 at 14:03, Andre Vehreschild  wrote:
> Hi all,
>
> attached patch fixes an addressing issue for coarrays *in* derived types.
> Before the patch the caf runtime reference chain was generated from the start
> of the symbol to the last reference *and* the reference chain up to the coarray
> in the derived type was used to call the caf_*_by_ref () functions. The patch
> fixes this by skipping the generation of unnecessary caf runtime references.
>
> The second part fixes finding the token for coarrayed arrays. The new semantic
> is, that each allocatable array has the coarray token in its .token member,
> which the allocate_array now makes use of.
>
> Bootstrapped and regtested ok on x86_64-linux/F23. Ok for trunk?
>
> Regards,
> Andre
> --
> Andre Vehreschild * Email: vehre ad gmx dot de



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein


Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-09-30 Thread Rainer Orth
Hi Martin,

>> the testcase FAILs on Solaris 12 (both SPARC and x86):
>> 
>> +FAIL: g++.dg/gcov/pr16855.C -std=gnu++11 gcov: 1 failures in line counts, 0 in branch percentages, 0 in return percentages, 0 in intermediate format
>> +FAIL: g++.dg/gcov/pr16855.C  -std=gnu++11  line 21: is #:should be 1
>> +FAIL: g++.dg/gcov/pr16855.C -std=gnu++14 gcov: 1 failures in line counts, 0 in branch percentages, 0 in return percentages, 0 in intermediate format
>> +FAIL: g++.dg/gcov/pr16855.C  -std=gnu++14  line 21: is #:should be 1
>> +FAIL: g++.dg/gcov/pr16855.C -std=gnu++98 gcov: 1 failures in line counts, 0 in branch percentages, 0 in return percentages, 0 in intermediate format
>> +FAIL: g++.dg/gcov/pr16855.C  -std=gnu++98  line 21: is #:should be 1
>> 
>> I haven't looked closer yet, but notice that you require constructor
>> priority support which isn't available everywhere (it is on Solaris 12,
>> but not before).
>> 
>>  Rainer
>> 
>
> Hello.
>
> Sorry for the test-breakage. The issue is really connected to the fact that
> current trunk relies on support of dtor priority. The former implementation
> called the function __gcov_exit via atexit.
> If I understand correctly, full support of static ctors/dtors, C++
> ctors/dtors, in combination with atexit cannot be done on a target w/o
> ctor/dtor priorities.

Understood.  However, Solaris 12 *does* have support for constructor
priorities and the testcase still fails, so there's more going on here.

> Ideally we should have a macro for each target telling whether it supports
> priorities or not.
> However, we probably don't have one? I would suggest making the test
> conditional, running it just on the main targets which support priorities?
>
> Thoughts?

While this would take care of the testsuite failures, this creates a
terrible user experience outside of the testsuite: if we know that
-fprofile-arcs -ftest-coverage cannot work on targets without
constructor priority support, the compiler should error out with an
appropriate message instead of just creating confusing non-working
executables.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Marek Polacek
On Thu, Sep 29, 2016 at 10:16:33PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> The following patch does a few things:
> 1) fixes -Wimplicit-fallthrough -C
>(with -C the PREV_FALLTHROUGH flag is on the CPP_COMMENT token, we need
> to propagate it to the C/C++ token's flags in the FEs)
> 2) it accepts a comment in between /* FALLTHRU */ comment and the
>case/default keyword or user label, people often write:
>  ...
>  /* FALLTHRU */
> 
>  /* Rationale or description of what the following code does.  */
>  case ...:
>and forcing users to move their comments after the labels or after the
>first label might be too annoying
> 3) it adds support for some common FALLTHRU comment styles that appeared
>in GCC sources, or in Linux kernel etc., e.g.:
> 
>/*lint -fallthrough */
> 
>/* ... falls through ... */
> 
>/* else fall-through */
> 
>/* Intentional fall through.  */
> 
>/* FALLTHRU - some explanation why.  */

I haven't gone over the patch in detail yet, but I wonder if we should
also accept /* Else, fall through.  */ (to be found e.g. in aarch64-simd.md).

Marek


Re: [PATCH] Reduce stack usage in sha512 (PR target/77308)

2016-09-30 Thread Eric Botcazou
> A comment before the SETs and a testcase would be nice.  IIRC
> we do have stack size testcases via using -fstack-usage.

Or -Wstack-usage, which might be more appropriate here.

-- 
Eric Botcazou


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Jakub Jelinek
On Fri, Sep 30, 2016 at 11:26:27AM +0200, Marek Polacek wrote:
> On Thu, Sep 29, 2016 at 10:16:33PM +0200, Jakub Jelinek wrote:
> > The following patch does a few things:
> > 1) fixes -Wimplicit-fallthrough -C
> >(with -C the PREV_FALLTHROUGH flag is on the CPP_COMMENT token, we need
> > to propagate it to the C/C++ token's flags in the FEs)
> > 2) it accepts a comment in between /* FALLTHRU */ comment and the
> >case/default keyword or user label, people often write:
> >  ...
> >  /* FALLTHRU */
> > 
> >  /* Rationale or description of what the following code does.  */
> >  case ...:
> >and forcing users to move their comments after the labels or after the
> >first label might be too annoying
> > 3) it adds support for some common FALLTHRU comment styles that appeared
> >in GCC sources, or in Linux kernel etc., e.g.:
> > 
> >/*lint -fallthrough */
> > 
> >/* ... falls through ... */
> > 
> >/* else fall-through */
> > 
> >/* Intentional fall through.  */
> > 
> >/* FALLTHRU - some explanation why.  */
> 
> I haven't gone over the patch in detail yet, but I wonder if we should
> also accept /* Else, fall through.  */ (to be found e.g. in aarch64-simd.md).

Clearly people are extremely creative with these comments, maybe it would be
better to just remove the new additions from the patch I've posted (drop the
else/intentional/intentionally/... around/!!! around etc., to force
people to standardize on something), and just apply the fixes and support
for comments in between.
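
As a concrete illustration of item 2 in the quoted list (a rationale comment between the FALLTHRU comment and the following label), consider this small C function. The function `toy_classify` and its values are hypothetical and exist only to show the comment placement.

```c
/* Returns 11 for 0, 10 for 1 and -1 for anything else.  */
static int
toy_classify (int n)
{
  int score = 0;
  switch (n)
    {
    case 0:
      score += 1;
      /* FALLTHRU */

      /* Rationale: zero is also counted as a small value.  */
    case 1:
      score += 10;
      break;
    default:
      score = -1;
      break;
    }
  return score;
}
```

The intent of the change is that the warning stays quiet for the fall through from `case 0` into `case 1`, even though a second comment sits between the marker and the label.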

Jakub


[Testsuite] Use correct effective-target settings for ARM fp16-aapcs tests.

2016-09-30 Thread Matthew Wahab

The recently added tests gcc.target/arm/fp16-aapcs-{3,4}.c are intended
to check the behaviour of the ARM Alternative FP16 format. They both
check for compiler support of FP16 using dg-require-effective-target
arm_fp16_ok. This is too weak since the directive is true when
fp16-format=ieee is set, as it is when the +fp16 extension is
enabled.

This patch changes the directives for both tests to
  dg-require-effective-target arm_fp16_alternative_ok
which is only enabled when fp16-format=alternative is set.

For fp16-aapcs-4.c, it was also necessary to add
-mfp16-format=alternative to the dg-options, rather than use the
arm_fp16_alternative add-options. There seems to be some interaction
between the different directives and the dg-skip-if, but I can't track
it down.

Tested for cross-compiled arm-none-eabi by running the
gcc.target/arm/arm.exp testsuite on an ARMv8.2-A emulator and on an
ARMv8-A emulator.

Ok for trunk?
Matthew

testsuite/
2016-09-28  Matthew Wahab  

* gcc.target/arm/fp16-aapcs-3.c: Replace the arm_fp16_ok with
arm_fp16_alternative_ok as the required effective target.
* gcc.target/arm/fp16-aapcs-4.c: Likewise.  Also add
-mfp16-format=alternative to the dg-options directive and remove
the dg-add-options directive.
From 5ca74bbfdf2b87904ca21fcaa54952cbd1d3916c Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Wed, 28 Sep 2016 10:54:43 +0100
Subject: [PATCH] [Testsuite] Use correct effective-target settings for ARM 
 fp16-aapcs tests.

The recently added tests gcc.target/arm/fp16-aapcs-{3,4}.c are intended
to check the behaviour of the ARM Alternative FP16 format. They both
check for compiler support of FP16 using dg-require-effective-target
arm_fp16_ok. This is too weak since the directive is true when
fp16-format=ieee is set, as it is when the +fp16 extension is
enabled.

This patch changes the directives for both tests to
  dg-require-effective-target arm_fp16_alternative_ok
which is only enabled when fp16-format=alternative is set.

For fp16-aapcs-4.c, it was also necessary to add
-mfp16-format=alternative to the dg-options, rather than use the
arm_fp16_alternative add-options. There seems to be some interaction
between the different directives and the dg-skip-if, but I can't track
it down.

Tested for cross-compiled arm-none-eabi by running the
gcc.target/arm/arm.exp testsuite on an ARMv8.2-A emulator and on an
ARMv8-A emulator.

testsuite/
2016-09-28  Matthew Wahab  

	* gcc.target/arm/fp16-aapcs-3.c: Replace the arm_fp16_ok with
	arm_fp16_alternative_ok as the required effective target.
	* gcc.target/arm/fp16-aapcs-4.c: Likewise.  Also add
	-mfp16-format=alternative to the dg-options directive and remove
	the dg-add-options directive.
---
 gcc/testsuite/gcc.target/arm/fp16-aapcs-3.c | 2 +-
 gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-3.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-3.c
index b7d7e58..84fc0a0 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-3.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-3.c
@@ -1,6 +1,6 @@
 /* { dg-do compile }  */
 /* { dg-require-effective-target arm_hard_vfp_ok }  */
-/* { dg-require-effective-target arm_fp16_ok } */
+/* { dg-require-effective-target arm_fp16_alternative_ok } */
 /* { dg-options "-O2" }  */
 /* { dg-add-options arm_fp16_alternative } */
 
diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
index 4c90a56..41c7ab7 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
@@ -1,7 +1,6 @@
 /* { dg-do compile }  */
-/* { dg-require-effective-target arm_fp16_ok } */
-/* { dg-options "-mfloat-abi=softfp -O2" }  */
-/* { dg-add-options arm_fp16_alternative } */
+/* { dg-require-effective-target arm_fp16_alternative_ok } */
+/* { dg-options "-mfloat-abi=softfp -O2 -mfp16-format=alternative" }  */
 /* { dg-skip-if "incompatible float-abi" { arm*-*-* } { "-mfloat-abi=hard" } } */
 
 /* Test __fp16 arguments and return value in registers (softfp).  */
-- 
2.1.4



Re: [PATCH] Define 3-argument overloads of std::hypot for C++17 (P0030R1)

2016-09-30 Thread Szabolcs Nagy
On 29/09/16 14:37, Andre Vieira (lists) wrote:
> 
> On arm-none-eabi I'm seeing a failure for the long double type and inputs:
> { 1e-2l, 1e-4l, 1e-4l, 0.0150004999375l }
> 
> The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
> reasonable difference? Should we raise toler3 to 1e-15?
> 
> The last line is also too high:
>   { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
> Yields a frac of: 1.28198e-16
> 
> Those are the only ones that exceed the 1e-16 threshold.
> 

i think the tolerance should be defined in
terms of LDBL_EPSILON (or numeric_limits).

n*LDBL_EPSILON tolerance would accept hypot
with about n ulp error.
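
A minimal sketch of such a check, assuming the test compares a relative error against the tolerance. The helper name `within_ulps` is hypothetical and is not part of the testsuite.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: true if `actual` is within about `n_ulp` units in the
// last place of `expected`, measured as a relative error.
bool within_ulps(long double actual, long double expected, int n_ulp)
{
  assert(expected != 0.0L && n_ulp > 0);
  const long double eps = std::numeric_limits<long double>::epsilon();
  const long double frac = std::fabs((actual - expected) / expected);
  return frac <= n_ulp * eps;
}
```

If long double is a 64-bit type on the failing target, epsilon is about 2.2e-16, so the reported fractions of 1.73e-16 and 1.28e-16 would both fall under a 1 ulp tolerance, whereas a fixed 1e-16 is tighter than the type can deliver.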



Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Eric Botcazou
> Clearly people are extremely creative with these comments, maybe it would be
> better to just remove the new additions from the patch I've posted (drop
> the else/intentional/intentionally/... around/!!! around etc., to force
> people to standardize on something), and just apply the fixes and support
> for comments in between.

Maybe just match "fall" and "through/thru/etc" positively and "not/n't/etc" 
negatively on the same line.

-- 
Eric Botcazou


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Jakub Jelinek
On Fri, Sep 30, 2016 at 11:42:12AM +0200, Eric Botcazou wrote:
> > Clearly people are extremely creative with these comments, maybe it would be
> > better to just remove the new additions from the patch I've posted (drop
> > the else/intentional/intentionally/... around/!!! around etc., to force
> > people to standardize on something), and just apply the fixes and support
> > for comments in between.
> 
> Maybe just match "fall" and "through/thru/etc" positively and "not/n't/etc" 
> negatively on the same line.

See Tom Tromey's explanation why accepting too much is bad (at least unless
we want multiple levels).

Jakub


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Marek Polacek
On Fri, Sep 30, 2016 at 11:31:43AM +0200, Jakub Jelinek wrote:
> On Fri, Sep 30, 2016 at 11:26:27AM +0200, Marek Polacek wrote:
> > On Thu, Sep 29, 2016 at 10:16:33PM +0200, Jakub Jelinek wrote:
> > > The following patch does a few things:
> > > 1) fixes -Wimplicit-fallthrough -C
> > >(with -C the PREV_FALLTHROUGH flag is on the CPP_COMMENT token, we need
> > > to propagate it to the C/C++ token's flags in the FEs)
> > > 2) it accepts a comment in between /* FALLTHRU */ comment and the
> > >case/default keyword or user label, people often write:
> > >  ...
> > >  /* FALLTHRU */
> > > 
> > >  /* Rationale or description of what the following code does.  */
> > >  case ...:
> > >and forcing users to move their comments after the labels or after the
> > >first label might be too annoying
> > > 3) it adds support for some common FALLTHRU comment styles that appeared
> > >in GCC sources, or in Linux kernel etc., e.g.:
> > > 
> > >/*lint -fallthrough */
> > > 
> > >/* ... falls through ... */
> > > 
> > >/* else fall-through */
> > > 
> > >/* Intentional fall through.  */
> > > 
> > >/* FALLTHRU - some explanation why.  */
> > 
> > I haven't gone over the patch in detail yet, but I wonder if we should
> > also accept /* Else, fall through.  */ (to be found e.g. in 
> > aarch64-simd.md).
> 
> Clearly people are extremely creative with these comments, maybe it would be
> better to just remove the new additions from the patch I've posted (drop the
> else/intentional/intentionally/... around/!!! around etc., to force
> people to standardize on something), and just apply the fixes and support
> for comments in between.

Obviously you can get a very wide range of opinions here.  I like the patch;
while I don't think we should allow complete free form, accepting stuff like
/* ... falls through ... */
or
/* else fall-through */
is a good thing.

Marek


Re: [PATCH v2] add -fprolog-pad=N option to c-family

2016-09-30 Thread Jose E. Marchesi

In case anybody missed it, the Linux kernel side to make use
of this has also been finished meanwhile. Of course it can not
be accepted without compiler support; and this feature patch
is much more versatile than just Linux kernel live patching
on a single architecture.

How is this supposed to be exploited atomically in RISC arches such as
sparc?  In such architectures you usually need to patch several
instructions to load an absolute address into a register.

If a general mechanism is what is intended, I would suggest offering the
possibility of extending the nops _before_ the function entry point,
like in:

(a) nop   ! Load address
    nop   ! Load address
    nop   ! Load address
    nop   ! Load address
    nop   ! Jump to loaded address.
entry:
(b) nop   ! PC-relative jump to (a)
    save %sp, bleh, %sp
    ...

So after the live-patcher patches the loading of the destination address
and the jump, it can atomically patch (b) to effectively replace the
implementation of `entry'.

Wdyt?



Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components

2016-09-30 Thread Segher Boessenkool
On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:
> With transposition issue addressed, the only blocker I see are some 
> simple testcases we can add to the suite.  They don't have to be real 
> extensive.  And one motivating example for the list archives, ideally 
> the glibc malloc case.

Here are some testcases.


2016-09-30  Segher Boessenkool  

gcc/testsuite/
* gcc.target/powerpc/shrink-wrap-separate-0.c: New.
* gcc.target/powerpc/shrink-wrap-separate-1.c: New.
* gcc.target/powerpc/shrink-wrap-separate-2.c: New.


diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c 
b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
new file mode 100644
index 000..dea0611
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-0.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {#before\M.*\mmflr\M} } } */
+
+/* This tests if shrink-wrapping for separate components works.
+
+   r20 (a callee-saved register) is forced live at the start, so that we
+   get it saved in a prologue at the start of the function.
+   The link register only needs to be saved if x is non-zero; without
+   separate shrink-wrapping it would however be saved in the one prologue.
+   The test tests if the mflr insn ends up behind the prologue.  */
+
+void g(void);
+
+void f(int x)
+{
+   register int r20 asm("20") = x;
+   asm("#before" : : "r"(r20));
+   if (x)
+   g();
+   asm(""); // no tailcall of g
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c 
b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
new file mode 100644
index 000..735b606
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {\mmflr\M.*\mbl\M.*\mmflr\M.*\mbl\M} } } */
+
+/* This tests if shrink-wrapping for separate components creates more
+   than one prologue when that is useful.  In this case, it saves the
+   link register before both the call to g and the call to h.  */
+
+void g(void) __attribute__((noreturn));
+void h(void) __attribute__((noreturn));
+
+void f(int x)
+{
+   if (x == 42)
+   g();
+   if (x == 31)
+   h();
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c 
b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
new file mode 100644
index 000..b22564a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/shrink-wrap-separate-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler {\mmflr\M.*\mbl\M.*\mmflr\M.*\mbl\M} } } */
+
+/* This tests if shrink-wrapping for separate components puts a prologue
+   inside a loop when that is useful.  In this case, it saves the link
+   register before each call: both calls happen with probability .10,
+   so saving the link register happens with .80 per execution of f on
+   average, which is smaller than 1 which you would get if you saved
+   it outside the loop.  */
+
+int *a;
+void g(void);
+
+void f(int x)
+{
+   int j;
+   for (j = 0; j < 4; j++) {
+   if (__builtin_expect(a[j], 0))
+   g();
+   asm("#" : : : "memory");
+   if (__builtin_expect(a[j], 0))
+   g();
+   a[j]++;
+   }
+}


Re: [PATCH v2] add -fprolog-pad=N option to c-family

2016-09-30 Thread Torsten Duwe
On Fri, Sep 30, 2016 at 12:01:47PM +0200, Jose E. Marchesi wrote:
> 
> How is this supposed to be exploited atomically in RISC arches such as
> sparc?  In such architectures you usually need to patch several
> instructions to load an absolute address into a register.

http://mpe.github.io/posts/2016/05/23/kernel-live-patching-for-ppc64le/

Haven't looked at sparc yet.

HTH,
Torsten



Re: [PATCH] Reduce stack usage in sha512 (PR target/77308)

2016-09-30 Thread Bernd Edlinger
Eric Botcazou wrote:
>> A comment before the SETs and a testcase would be nice.  IIRC
>> we do have stack size testcases via using -fstack-usage.
>
>Or -Wstack-usage, which might be more appropriate here.

Yes, good idea.  I was not aware that we already have that kind of test.

When trying to write this test, I noticed that I had not tried -Os so far.
But for -Os the stack usage is still the unchanged 3500 bytes.

However for embedded targets I am often inclined to use -Os, and
would certainly not expect the stack to explode...

I see in arm.md there are places like

  /* If we're optimizing for size, we prefer the libgcc calls.  */
  if (optimize_function_for_size_p (cfun))
FAIL;

  /* Expand operation using core-registers.
 'FAIL' would achieve the same thing, but this is a bit smarter.  */
  scratch1 = gen_reg_rtx (SImode);
  scratch2 = gen_reg_rtx (SImode);
  arm_emit_coreregs_64bit_shift (LSHIFTRT, operands[0], operands[1],
 operands[2], scratch1, scratch2);


.. that explains why this happens.  I think it would be better to
use the emit_coreregs for shift count >= 32, because these are
effectively 32-bit shifts.
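
To illustrate why a shift count of 32 or more reduces to a 32-bit shift, here is a minimal sketch in plain C++. The helper name is hypothetical, and this is not the arm.md expander or arm_emit_coreregs_64bit_shift itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: logical right shift of a 64-bit value held as two
// 32-bit halves, for shift counts in [32, 63].
std::uint64_t lshr64_by_32_or_more(std::uint32_t lo, std::uint32_t hi,
                                   unsigned count)
{
  assert(count >= 32 && count < 64);
  (void) lo;  // every bit of the low word is shifted out
  const std::uint32_t result_lo = hi >> (count - 32);  // one 32-bit shift
  const std::uint32_t result_hi = 0;                   // high word is cleared
  return (static_cast<std::uint64_t>(result_hi) << 32) | result_lo;
}
```

Only the high input word contributes to the result, so no 64-bit temporaries are needed for these counts.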

I will see whether that can be improved, and come back with the
results.


Thanks
Bernd.

Re: [accaf, Fortran, patch, v1] Generate caf-reference chains only from the first coarray reference on, and more.

2016-09-30 Thread Andre Vehreschild
Hi Paul,

thanks for the fast review. Committed as r240650.

Thanks again,

Andre

On Fri, 30 Sep 2016 11:16:48 +0200
Paul Richard Thomas  wrote:

> Dear Andre,
> 
> Looks good to me - OK for trunk.
> 
> Thanks
> 
> Paul
> 
> On 29 September 2016 at 14:03, Andre Vehreschild  wrote:
> > Hi all,
> >
> > attached patch fixes an addressing issue for coarrays *in* derived types.
> > Before the patch the caf runtime reference chain was generated from the
> > start of the symbol to the last reference *and* the reference chain up to
> > the coarray in the derived type was used to call the caf_*_by_ref ()
> > functions. The patch fixes this by skipping the generation of unnecessary
> > caf runtime references.
> >
> > The second part fixes finding the token for coarrayed arrays. The new
> > semantics are that each allocatable array has the coarray token in
> > its .token member, which the allocate_array now makes use of.
> >
> > Bootstrapped and regtested ok on x86_64-linux/F23. Ok for trunk?
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 240649)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,12 @@
+2016-09-30  Andre Vehreschild  
+
+	* trans-array.c (gfc_array_allocate): Use the token from coarray's
+	.token member.
+	* trans-intrinsic.c (conv_expr_ref_to_caf_ref): Only generate
+	caf-reference chains from the first coarray references on.
+	* trans-types.c (gfc_get_derived_type): Switch on mandatory .token
+	member generation for allocatable arrays in coarrays in derived types.
+
 2016-09-29  James Greenhalgh  
 
 	* options.c (gfc_post_options): Remove special case for
Index: gcc/fortran/trans-array.c
===
--- gcc/fortran/trans-array.c	(Revision 240649)
+++ gcc/fortran/trans-array.c	(Arbeitskopie)
@@ -5406,7 +5406,6 @@
   gfc_expr **lower;
   gfc_expr **upper;
   gfc_ref *ref, *prev_ref = NULL, *coref;
-  gfc_se caf_se;
   bool allocatable, coarray, dimension, alloc_w_e3_arr_spec = false;
 
   ref = expr->ref;
@@ -5531,7 +5530,6 @@
 	}
 }
 
-  gfc_init_se (&caf_se, NULL);
   gfc_start_block (&elseblock);
 
   /* Allocate memory to store the data.  */
@@ -5543,9 +5541,7 @@
 
   if (coarray && flag_coarray == GFC_FCOARRAY_LIB)
 {
-  tmp = gfc_get_tree_for_caf_expr (expr);
-  gfc_get_caf_token_offset (&caf_se, &token, NULL, tmp, NULL_TREE, expr);
-  gfc_add_block_to_block (&elseblock, &caf_se.pre);
+  token = gfc_conv_descriptor_token (se->expr);
   token = gfc_build_addr_expr (NULL_TREE, token);
 }
 
@@ -5557,7 +5553,6 @@
   else
 gfc_allocate_using_malloc (&elseblock, pointer, size, status);
 
-  gfc_add_block_to_block (&elseblock, &caf_se.post);
   if (dimension)
 {
   cond = gfc_unlikely (fold_build2_loc (input_location, NE_EXPR,
Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c	(Revision 240649)
+++ gcc/fortran/trans-intrinsic.c	(Arbeitskopie)
@@ -1110,7 +1110,7 @@
 static tree
 conv_expr_ref_to_caf_ref (stmtblock_t *block, gfc_expr *expr)
 {
-  gfc_ref *ref = expr->ref;
+  gfc_ref *ref = expr->ref, *last_comp_ref;
   tree caf_ref = NULL_TREE, prev_caf_ref = NULL_TREE, reference_type, tmp, tmp2,
   field, last_type, inner_struct, mode, mode_rhs, dim_array, dim, dim_type,
   start, end, stride, vector, nvec;
@@ -1127,8 +1127,29 @@
 
   /* Prevent uninit-warning.  */
   reference_type = NULL_TREE;
-  last_type = gfc_typenode_for_spec (&expr->symtree->n.sym->ts);
-  last_type_n = expr->symtree->n.sym->ts.type;
+
+  /* Skip refs upto the first coarray-ref.  */
+  last_comp_ref = NULL;
+  while (ref && (ref->type != REF_ARRAY || ref->u.ar.codimen == 0))
+{
+  /* Remember the type of components skipped.  */
+  if (ref->type == REF_COMPONENT)
+	last_comp_ref = ref;
+  ref = ref->next;
+}
+  /* When a component was skipped, get the type information of the last
+ component ref, else get the type from the symbol.  */
+  if (last_comp_ref)
+{
+  last_type = gfc_typenode_for_spec (&last_comp_ref->u.c.component->ts);
+  last_type_n = last_comp_ref->u.c.component->ts.type;
+}
+  else
+{
+  last_type = gfc_typenode_for_spec (&expr->symtree->n.sym->ts);
+  last_type_n = expr->symtree->n.sym->ts.type;
+}
+
   while (ref)
 {
   if (ref->type == REF_ARRAY && ref->u.ar.codimen > 0
Index: gcc/fortran/trans-types.c
===
--- gcc/fortran/trans-types.c	(Revision 240649)
+++ gcc/fortran/trans-types.c	(Arbeitskopie)
@@ -2565,7 +2565,8 @@
   if ((!c->attr.pointer && !c->attr.proc_pointer)
 	  || c->ts.u.derived->backend_decl == NULL)
 	c->ts.u.derived->backend_

Re: [PATCH] Delete GCJ

2016-09-30 Thread Marek Polacek
On Sun, Sep 11, 2016 at 01:08:56PM +0100, Andrew Haley wrote:
> On 10/09/16 12:59, NightStrike wrote:
> > Could we at least reach out and see if there's someone else who could
> > be the maintainer?  I noticed gcj patches recently, so there's still
> > interest.
> 
> 1.  It's too late.  We have been discussing this for a long time, and
> we're now doing what we decided.
> 
> 2.  Maintaining GCJ requires a lot of knowledge of both Java and GCC
> internals.  There are very few people in the world with that
> knowledge, and I'm fairly sure I know them by name.
> 
> 3.  The Classpath library is very old and is unmaintained.  The only
> practical way to update GCJ would be to use the OpenJDK class
> libraries instead, but updating GCJ to use those class libraries is a
> very substantial job.
> 
> So, I cannot prevent anyone from coming along to maintain GCJ, and
> neither would I want to.  However, such a proposal would have to be
> credible.  It is a multi-engineer-year commitment, and not just any
> ordinary engineers.

Can we move forward with this patch, then?

Marek


Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components

2016-09-30 Thread Segher Boessenkool
[ whoops, message too big, resending with the attachment compressed ]

On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:
> With transposition issue addressed, the only blocker I see are some 
> simple testcases we can add to the suite.  They don't have to be real 
> extensive.  And one motivating example for the list archives, ideally 
> the glibc malloc case.

And here is the malloc testcase.

A very important (for performance) function is _int_malloc, which starts
with


static void *
_int_malloc (mstate av, size_t bytes)
{
// [ variable declarations culled ]

  if (((unsigned long) (bytes) >= (unsigned long) (size_t) (-2 * (unsigned long)
 __builtin_offsetof (
 struct malloc_chunk
 ,
 fd_nextsize
 )
 )+((2 *(sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 *(sizeof(size_t))) - 1)) { (__libc_errno = (
 12
 )); return 0; }

  if (__builtin_expect ((av ==
 ((
 void
 *)0)
 ), 0))
{
  void *p = sysmalloc (nb, av);
  if (p !=
  ((
  void
  *)0)
  )
 alloc_perturb (p, bytes);
  return p;
}


which without separate shrink-wrapping ends up as (reordered the blocks):


.L._int_malloc:
mflr 0
li 9,-65
std 14,-144(1)
std 15,-136(1)
cmpld 7,4,9
std 16,-128(1)
std 17,-120(1)
std 18,-112(1)
std 19,-104(1)
std 20,-96(1)
std 21,-88(1)
std 22,-80(1)
std 23,-72(1)
std 0,16(1)
std 24,-64(1)
std 25,-56(1)
std 26,-48(1)
std 27,-40(1)
std 28,-32(1)
std 29,-24(1)
std 30,-16(1)
std 31,-8(1)
stdu 1,-288(1)
bgt 7,.L768
addi 14,4,23
mr 15,3
cmpldi 0,14,31
ble 0,.L769

# ...

.L768:
addis 27,2,__libc_errno@got@tprel@ha
li 19,12
ld 28,__libc_errno@got@tprel@l(27)
li 3,0
add 17,28,__libc_errno@tls
stw 19,0(17)
b .L631

# ...

.L631:
addi 1,1,288
ld 29,16(1)
ld 14,-144(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
mtlr 29
ld 25,-56(1)
ld 26,-48(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
blr
# ...

.L769:
cmpdi 1,3,0
beq 1,.L715

# ...

.L715:
li 14,32
.L635:
li 4,0
.L762:
addi 1,1,288
mr 3,14
ld 14,16(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
ld 25,-56(1)
mtlr 14
ld 26,-48(1)
ld 14,-144(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
b sysmalloc


[ I see I have regrename on by default still; doesn't matter much for
this test, it's just less readable ]


With separate shrink-wrapping we get instead


.L._int_malloc:
li 9,-65
stdu 1,-288(1)
cmpld 7,4,9
bgt 7,.L811
std 14,144(1)
addi 14,4,23
std 15,152(1)
mr 15,3
cmpldi 0,14,31
std 25,232(1)
std 30,272(1)
ble 0,.L812

# ...

.L811:
addis 6,2,__libc_errno@got@tprel@ha
li 11,12
ld 10,__libc_errno@got@tprel@l(6)
li 3,0
add 12,10,__libc_errno@tls
stw 11,0(12)
b .L673

# ...

.L673:
addi 1,1,288
blr

# ...

.L812:
cmpdi 1,3,0
beq 1,.L757

# ...

.L757:
li 14,32
.L677:
mr 3,14
ld 15,152(1)
ld 14,144(1)
ld 25,232(1)
ld 30,272(1)
li 4,0
addi 1,1,288
b sysmalloc


I'm attaching the full testcase (pre-processed for powerpc64-linux).


Segher


malloc.i.gz
Description: GNU Zip compressed data


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Bernd Schmidt

On 09/30/2016 11:45 AM, Jakub Jelinek wrote:
> See Tom Tromey's explanation why accepting too much is bad (at least unless
> we want multiple levels).


And I still don't buy it. The case where someone writes "Don't fall 
through" is artificial to begin with, and also forgetting to put the 
break; there really isn't something for us to be concerned about.


On the other hand, we end up having to change massive amounts of 
perfectly fine code and even disabling -Werror in some places. That's 
the textbook case of an awful warning.



Bernd


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Jakub Jelinek
On Fri, Sep 30, 2016 at 12:42:20PM +0200, Bernd Schmidt wrote:
> On 09/30/2016 11:45 AM, Jakub Jelinek wrote:
> 
> >See Tom Tromey's explanation why accepting too much is bad (at least unless
> >we want multiple levels).
> 
> And I still don't buy it. The case where someone writes "Don't fall through"
> is artificial to begin with, and also forgetting to put the break; there
> really isn't something for us to be concerned about.

It doesn't have to be exactly that: fall is a substring of 200+ English words, thr
is a substring of 1000+ English words, and if you allow arbitrary stuff in
between or before it or after it, the comment can say anything, completely unrelated
to falling through across case labels.
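
The objection can be made concrete with a minimal sketch of a loose substring matcher. The function `loose_fallthrough_match` is hypothetical; it is neither the matcher in the patch nor anything in libcpp, only an assumed reading of the "match fall and thr, reject not/n't" idea.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical loose matcher: accept a comment if it contains "fall" and
// "thr" and neither "not" nor "n't", ignoring case.
bool loose_fallthrough_match(std::string comment)
{
  std::transform(comment.begin(), comment.end(), comment.begin(),
                 [](unsigned char ch)
                 { return static_cast<char>(std::tolower(ch)); });
  const bool positive = comment.find("fall") != std::string::npos
                        && comment.find("thr") != std::string::npos;
  const bool negative = comment.find("not") != std::string::npos
                        || comment.find("n't") != std::string::npos;
  return positive && !negative;
}
```

Such a matcher accepts "Else, fall through." and rejects "Don't fall through.", but it also accepts a comment like "Rainfall threshold table.", which has nothing to do with case labels.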

> On the other hand, we end up having to change massive amounts of perfectly
> fine code and even disabling -Werror in some places. That's the textbook
> case of an awful warning.

First of all, the warning is only in -Wextra, so it will affect only the minority of
programs that use it. And, like many others, it is a style warning: it is a good
thing to enforce a consistent style of these comments, or eventually the
attributes, for projects that care about -Wextra, and as a bonus they will be
able to fix real bugs in their projects.  Note that lint is considerably
more strict in what it accepts.

Jakub


[PATCH] libstdc++/77795 Only declare ::gets for C++98 and C++11

2016-09-30 Thread Jonathan Wakely

As noted in Bugzilla (and pointed out in LLVM's bugzilla by Richard
Smith) we check for a ::gets() declaration with the default
-std=gnu++14 mode, so for glibc we don't find it, and then we declare
it ourselves in <cstdio> even though it's not meant to exist in C++14.

This adjusts the check to use C++11, and doesn't declare it for C++14
and later. I think this fixes the regression, without introducing any
new problems.

Please take a look and double-check I haven't missed something.

PR libstdc++/77795
* acinclude.m4 (GLIBCXX_CHECK_STDIO_PROTO): Use -std=gnu++11 to check
for gets.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/c_global/cstdio [!_GLIBCXX_HAVE_GETS] (gets): Only declare
for C++98 and C++11.
* include/c_std/cstdio [!_GLIBCXX_HAVE_GETS] (gets): Likewise.
* testsuite/27_io/headers/cstdio/functions_neg.cc: New test.

I think we could also get rid of the hack in
config/os/gnu-linux/os_defines.h because it doesn't do anything:

// Provide a declaration for the possibly deprecated gets function, as
// glibc 2.15 and later does not declare gets for ISO C11 when
// __GNU_SOURCE is defined.
#if __GLIBC_PREREQ(2,15) && defined(_GNU_SOURCE)
# undef _GLIBCXX_HAVE_GETS
#endif

Firstly, neither 2.15 nor 2.16.0 has this issue (only some 2.15.xxx
versions built from version control rather than from official
releases). So we only need to check for exactly 2.15, rather than
anything >= 2.15.

Secondly, that file is included *before* c++config.h defines
_GLIBCXX_HAVE_GETS so the #undef doesn't work anyway. So I think we
might as well just get rid of it. If someone builds libstdc++ against
a 2.15.xxx without the gets declaration then we don't define
_GLIBCXX_HAVE_GETS and so we declare ::gets in <cstdio>.
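
The ordering problem can be shown with a minimal sketch. The macro `DEMO_HAVE_GETS` and the constant `demo_have_gets` are hypothetical stand-ins, and the three sections play the role of headers included in that order.

```cpp
// Hypothetical macro names; each section stands in for one header.

// --- first header (plays the role of os_defines.h) ---
#undef DEMO_HAVE_GETS          // no effect: the macro is not defined yet

// --- second header (plays the role of the generated config header) ---
#define DEMO_HAVE_GETS 1

// --- later code still sees the macro as defined ---
#ifdef DEMO_HAVE_GETS
constexpr bool demo_have_gets = true;
#else
constexpr bool demo_have_gets = false;
#endif
```

Because the #undef runs before the #define, the later #ifdef still takes the "defined" branch.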

If someone builds libstdc++ against 2.15 and then upgrades to a
glibc 2.15.xxx without gets they'll have problems, because
_GLIBCXX_HAVE_GETS will be defined to 1, but they won't have a gets
declaration in the glibc <stdio.h>. So don't do that. If we really
needed to we could support that by changing os_defines.h to:

#if __GLIBC__ == 2 && __GLIBC_MINOR__ == 15 && defined(_GNU_SOURCE)
# define _GLIBCXX_NEED_GETS_DECL
#endif

And then in <cstdio>:

#if __cplusplus <= 201103L
#if !defined(_GLIBCXX_HAVE_GETS) || defined(_GLIBCXX_NEED_GETS_DECL)
extern "C" char* gets (char* __s) __attribute__((__deprecated__));
#endif
#endif

But we haven't needed that until now, so I'd prefer to just drop the
hack in os_defines.h

commit fb65b330f57bf6b6e64fa3a7728e185ff4e45918
Author: Jonathan Wakely 
Date:   Fri Sep 30 10:26:35 2016 +0100

libstdc++/77795 Only declare ::gets for C++98 and C++11

	PR libstdc++/77795
	* acinclude.m4 (GLIBCXX_CHECK_STDIO_PROTO): Use -std=gnu++11 to check
	for gets.
	* config.h.in: Regenerate.
	* configure: Regenerate.
	* include/c_global/cstdio [!_GLIBCXX_HAVE_GETS] (gets): Only declare
	for C++98 and C++11.
	* include/c_std/cstdio [!_GLIBCXX_HAVE_GETS] (gets): Likewise.
	* testsuite/27_io/headers/cstdio/functions_neg.cc: New test.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index ffead7d..d0ee45f 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -2153,6 +2153,10 @@ AC_DEFUN([GLIBCXX_CHECK_STDIO_PROTO], [
 
   AC_LANG_SAVE
   AC_LANG_CPLUSPLUS
+  # Use C++11 because a conforming <stdio.h> won't define gets for C++14,
+  # and we don't need a declaration for C++14 anyway.
+  ac_save_CXXFLAGS="$CXXFLAGS"
+  CXXFLAGS="$CXXFLAGS -std=gnu++11"
 
   AC_MSG_CHECKING([for gets declaration])
   AC_CACHE_VAL(glibcxx_cv_gets, [
@@ -2168,10 +2172,11 @@ AC_DEFUN([GLIBCXX_CHECK_STDIO_PROTO], [
   )])
 
   if test $glibcxx_cv_gets = yes; then
-AC_DEFINE(HAVE_GETS, 1, [Define if gets is available in <stdio.h>.])
+AC_DEFINE(HAVE_GETS, 1, [Define if gets is available in <stdio.h> before C++14.])
   fi
   AC_MSG_RESULT($glibcxx_cv_gets)
 
+  CXXFLAGS="$ac_save_CXXFLAGS"
   AC_LANG_RESTORE
 ])
 
diff --git a/libstdc++-v3/include/c_global/cstdio b/libstdc++-v3/include/c_global/cstdio
index 920d109..86d524f 100644
--- a/libstdc++-v3/include/c_global/cstdio
+++ b/libstdc++-v3/include/c_global/cstdio
@@ -44,7 +44,7 @@
 #ifndef _GLIBCXX_CSTDIO
 #define _GLIBCXX_CSTDIO 1
 
-#ifndef _GLIBCXX_HAVE_GETS
+#if __cplusplus <= 201103L && !defined(_GLIBCXX_HAVE_GETS)
 extern "C" char* gets (char* __s) __attribute__((__deprecated__));
 #endif
 
diff --git a/libstdc++-v3/include/c_std/cstdio b/libstdc++-v3/include/c_std/cstdio
index a4119ba..549004c 100644
--- a/libstdc++-v3/include/c_std/cstdio
+++ b/libstdc++-v3/include/c_std/cstdio
@@ -44,7 +44,7 @@
 #include 
 #include 
 
-#ifndef _GLIBCXX_HAVE_GETS
+#if __cplusplus <= 201103L && !defined(_GLIBCXX_HAVE_GETS)
 extern "C" char* gets (char* __s) __attribute__((__deprecated__));
 #endif
 
diff --git a/libstdc++-v3/testsuite/27_io/headers/cstdio/functions_neg.cc b/libstdc++-v3/testsuite/27_io/headers/cstdio/functions_neg.cc
new file mode 100644
index 00

Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Bernd Schmidt



On 09/30/2016 12:51 PM, Jakub Jelinek wrote:
> On Fri, Sep 30, 2016 at 12:42:20PM +0200, Bernd Schmidt wrote:
> > On 09/30/2016 11:45 AM, Jakub Jelinek wrote:
> > > See Tom Tromey's explanation why accepting too much is bad (at least unless
> > > we want multiple levels).
> >
> > And I still don't buy it. The case where someone writes "Don't fall through"
> > is artificial to begin with, and also forgetting to put the break; there
> > really isn't something for us to be concerned about.
>
> It doesn't have to be exactly that: fall is a substring of 200+ English words, thr
> is a substring of 1000+ English words, and if you allow arbitrary stuff in
> between or before it or after it, the comment can say anything, completely unrelated
> to falling through across case labels.


True, but IMO irrelevant. We have to consider what is likely in real 
code. We're trying to catch likely problems without adding a prohibitive 
cost to enabling -Wextra. It's not helpful to punish people for writing 
code that is valid and documents intentional fallthrough, but does it in 
a style or a language we didn't expect.


A complete solution would require working AI; maybe someone from Google 
can contribute one. Failing that, we need heuristics, and I still like 
Michael's suggestion of not printing a warning if we see _any_ comment, 
on the grounds that this would still catch the vast majority of actual 
errors, without the huge false positive rate we currently have.



Bernd


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Marek Polacek
On Fri, Sep 30, 2016 at 12:42:20PM +0200, Bernd Schmidt wrote:
> On 09/30/2016 11:45 AM, Jakub Jelinek wrote:
> 
> > See Tom Tromey's explanation why accepting too much is bad (at least unless
> > we want multiple levels).
> 
> And I still don't buy it. The case where someone writes "Don't fall through"
> is artificial to begin with, and also forgetting to put the break; there
> really isn't something for us to be concerned about.
> 
> On the other hand, we end up having to change massive amounts of perfectly
> fine code and even disabling -Werror in some places. That's the textbook
> case of an awful warning.

On the flip side, those patches (for GCC) were quite trivial and I've done
most of the work, and it's something you only have to do once.

Disabling -Werror was only necessary because I didn't investigate the problem
deep enough, but this has been fixed now (the genattr.c part in
) and I'm about to
post a patch that removes those added -Wno-error (just waiting for aarch64
bootstrap to finish).

The warning repeatedly finds bugs (I've already fixed a bunch of them), and
prevents more from creeping in.

The comments parsing part is contentious and tricky (and GCC might be the first
compiler that attempts to do this, so we're breaking new ground), yet the
warning *is* useful, and only more so with upcoming C++17 which standardizes
[[fallthrough]];.
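
For illustration, a minimal sketch of the two ways of marking an intentional
fall through that this thread is about: the C++17 [[fallthrough]]; attribute
and a plain comment. The function name `weight` and the case values are made
up for this example, and whether a particular comment wording silences
-Wimplicit-fallthrough depends on the heuristics still under discussion here.

```cpp
// Hypothetical example; try it with -std=c++17 -Wextra.
int weight (int kind)
{
  int w = 0;
  switch (kind)
    {
    case 2:
      w += 10;
      [[fallthrough]];  // C++17 attribute: explicit, needs no comment parsing
    case 1:
      w += 1;
      /* Fall through.  */
    case 0:
      w += 100;
      break;
    default:
      w = -1;
      break;
    }
  return w;
}
```

The attribute form does not depend on how the comment is spelled, which is
why it sidesteps the comment-matching question entirely.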

Marek


Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-09-30 Thread Martin Liška
On 09/30/2016 11:22 AM, Rainer Orth wrote:
> Hi Martin,
> 
>>> the testcase FAILs on Solaris 12 (both SPARC and x86):
>>>
>>> +FAIL: g++.dg/gcov/pr16855.C -std=gnu++11 gcov: 1 failures in line
>>> counts, 0 in branch percentages, 0 in return percentages, 0 in intermediate format
>>> +FAIL: g++.dg/gcov/pr16855.C  -std=gnu++11  line 21: is #:should be 1
>>> +FAIL: g++.dg/gcov/pr16855.C -std=gnu++14 gcov: 1 failures in line
>>> counts, 0 in branch percentages, 0 in return percentages, 0 in intermediate format
>>> +FAIL: g++.dg/gcov/pr16855.C  -std=gnu++14  line 21: is #:should be 1
>>> +FAIL: g++.dg/gcov/pr16855.C -std=gnu++98 gcov: 1 failures in line
>>> counts, 0 in branch percentages, 0 in return percentages, 0 in intermediate format
>>> +FAIL: g++.dg/gcov/pr16855.C  -std=gnu++98  line 21: is #:should be 1
>>>
>>> I haven't looked closer yet, but notice that you require constructor
>>> priority support which isn't available everywhere (it is on Solaris 12,
>>> but not before).
>>>
>>> Rainer
>>>
>>
>> Hello.
>>
>> Sorry for the test breakage. The issue is really connected to the fact that
>> current trunk relies
>> on support for dtor priorities. The former implementation called the function
>> __gcov_exit via atexit.
>> If I understand correctly, full support of static ctors/dtors, C++
>> ctors/dtors, in combination
>> with atexit cannot be provided on a target w/o ctor/dtor priorities.
> 
> understood.  However, Solaris 12 *does* have support for constructor
> priorities and the testcase still fails, so there's more going on here.

I see; however, I don't have access to such a machine. I would appreciate
it if you could help me debug what's going on. Can you please send me the
--target=x value, so that I can at least check the generated assembly?

> 
>> Ideally we should have a macro for each target telling whether it supports
>> priorities or not.
>> However, we probably don't have one? I would suggest making the test
>> conditional, enabled just for the main
>> targets which support priorities?
>>
>> Thoughts?
> 
> While this would take care of the testsuite failures, this creates a
> terrible user experience outside of the testsuite: if we know that
> -fprofile-arcs -ftest-coverage cannot work on targets without
> constructor priority support, the compiler should error out with an
> appropriate message instead of just creating confusing non-working
> executables.

More precisely, it does not work reliably for constructors and destructors, as
we depend on the order in which ctors/dtors are executed. We had the same
behavior even before my patch, but documenting that is definitely worth doing.
I'll prepare a patch.

Martin

> 
>   Rainer
> 



Re: [PATCH 2/2] Extend -falign-FOO=N to N[,M[,N2[,M2]]]

2016-09-30 Thread Bernd Schmidt

On 09/29/2016 07:32 PM, Denys Vlasenko wrote:

On 09/29/2016 04:45 PM, Bernd Schmidt wrote:

On 09/28/2016 02:57 PM, Denys Vlasenko wrote:

-  /* Comes from final.c -- no real reason to change it.  */
-#define MAX_CODE_ALIGN 16
-
 case OPT_malign_loops_:
   warning_at (loc, 0, "-malign-loops is obsolete, use -falign-loops");
-  if (value > MAX_CODE_ALIGN)
-error_at (loc, "-malign-loops=%d is not between 0 and %d",
-  value, MAX_CODE_ALIGN);
-  else
-opts->x_align_loops = 1 << value;
   return true;


That does seem to be a functional change. I'll defer to Uros.


It would be awkward to translate -malign-loops=%d et al
to comma-separated string format.
Since this warning has been there for some 15 years already,
anyone who actually cares should have converted to the new options
long ago.


Hmm, if it's been 15 years, maybe it's time to remove these. Could you 
submit a patch separately?



-  if (opts->x_align_functions <= 0)
+  if (opts->x_flag_align_functions && !opts->x_str_align_functions)


Are these conditions really equivalent? It looks like zero was
the default even when no -falign-functions was specified.
Or is that overridden by init_alignments?


 {
-  if (opts->x_align_loops == 0)
+  /* -falign-foo without argument: supply one */
+  if (opts->x_flag_align_loops && !opts->x_str_align_loops)


Same here.


The execution flow for option parsing is somewhat convoluted, no doubt.

I found experimentally that these are the locations where the
default alignment parameters are set when -falign-functions
is given with no arguments (or when it is implied by -O2).


I applied your latest two patches to experiment with them, and I see 
different behaviour before and after on x86_64-linux. There seems to be 
a difference in function alignment and label alignment at -O2.


I think it would be good to add testcases first to document and verify 
existing behaviour, they would then serve to show whether it is 
unchanged afterwards.


Need to look into the x86 hook thing in a bit more detail.


Bernd


Adjust fall through comments in aarch64-simd.md

2016-09-30 Thread Marek Polacek
It's unclear whether we'll handle /* Else, fall through.  */ in the near
future, so I'll just change it to a recognizable format instead.

When this is in, I can remove the recently added -Wno-error lines in 
Makefile.in.

Bootstrapped on aarch64-linux.  I'm checking this in.

2016-09-30  Marek Polacek  

* config/aarch64/aarch64-simd.md: Adjust fall through comments.

diff --git gcc/config/aarch64/aarch64-simd.md gcc/config/aarch64/aarch64-simd.md
index f942a54..9ce7f00 100644
--- gcc/config/aarch64/aarch64-simd.md
+++ gcc/config/aarch64/aarch64-simd.md
@@ -2443,7 +2443,7 @@
  comparison = gen_aarch64_cmlt;
  break;
}
-  /* Else, fall through.  */
+  /* Fall through.  */
 case UNGE:
   std::swap (operands[2], operands[3]);
   /* Fall through.  */
@@ -2457,7 +2457,7 @@
  comparison = gen_aarch64_cmle;
  break;
}
-  /* Else, fall through.  */
+  /* Fall through.  */
 case UNGT:
   std::swap (operands[2], operands[3]);
   /* Fall through.  */

Marek


Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-09-30 Thread Nathan Sidwell

On 09/30/16 05:22, Rainer Orth wrote:


While this would take care of the testsuite failures, this creates a
terrible user experience outside of the testsuite: if we know that
-fprofile-arcs -ftest-coverage cannot work on targets without
constructor priority support, the compiler should error out with an
appropriate message instead of just creating confusing non-working
executables.


It should either
1) emit a non-prioritized static ctor
2) or use the older atexit mechanism.



[gomp4] improve collapse user var calculation

2016-09-30 Thread Nathan Sidwell
In working on tile I noticed an unnecessary modulo operation in the calculation 
of the outermost loop's user iteration variable.  By construction the modulo 
operator is useless here.  I took the opportunity to move the modulo and 
division operations next to each other so it'd be easier to make use of any 
divmod apparatus the target may have.
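
As a rough sketch of the arithmetic only (plain C++ rather than the GIMPLE
that omp-low.c builds, and leaving out the step and base adjustments): the
collapsed iteration number is peeled apart innermost loop first with an
adjacent div/mod pair, and the outermost variable simply takes what is left,
since by construction that value is already below the outermost extent. The
names `split_collapsed` and `extents` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Recover per-loop indices from one collapsed iteration number.
// extents[0] is the iteration count of the outermost loop.
std::vector<long> split_collapsed (long ivar, const std::vector<long> &extents)
{
  std::vector<long> idx (extents.size ());
  for (std::size_t ix = extents.size (); ix-- > 0;)
    {
      long expr = ivar;
      if (ix)
        {
          // Inner loops: div and mod sit next to each other so a target
          // divmod operation could serve both.
          long mod = extents[ix];
          assert (mod > 0);
          ivar = expr / mod;
          expr = expr % mod;
        }
      // Outermost loop (ix == 0): no modulo, ivar < extents[0] already.
      idx[ix] = expr;
    }
  return idx;
}
```

With extents {2, 3, 4}, iteration 23 splits into indices {1, 2, 3}, because
23 = 1*12 + 2*4 + 3.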


committed to gomp4.

nathan
2016-09-30  Nathan Sidwell  

	* omp-low.c (expand_oacc_collapse_vars): Avoid DIV for outermost
	collapse var.

Index: omp-low.c
===
--- omp-low.c	(revision 240653)
+++ omp-low.c	(working copy)
@@ -7591,8 +7591,16 @@ expand_oacc_collapse_vars (const struct
 	  plus_type = sizetype;
 	}
 
-  expr = fold_build2 (TRUNC_MOD_EXPR, ivar_type, ivar,
-			  fold_convert (ivar_type, collapse->iters));
+  expr = ivar;
+  if (ix)
+	{
+	  tree mod = fold_convert (ivar_type, collapse->iters);
+	  ivar = fold_build2 (TRUNC_DIV_EXPR, ivar_type, expr, mod);
+	  expr = fold_build2 (TRUNC_MOD_EXPR, ivar_type, expr, mod);
+	  ivar = force_gimple_operand_gsi (gsi, ivar, true, NULL_TREE,
+	   true, GSI_SAME_STMT);
+	}
+  
   expr = fold_build2 (MULT_EXPR, diff_type, fold_convert (diff_type, expr),
 			  collapse->step);
   expr = fold_build2 (plus_code, iter_type, collapse->base,
@@ -7601,14 +7609,6 @@ expand_oacc_collapse_vars (const struct
    true, GSI_SAME_STMT);
   gassign *ass = gimple_build_assign (loop->v, expr);
   gsi_insert_before (gsi, ass, GSI_SAME_STMT);
-
-  if (ix)
-	{
-	  expr = fold_build2 (TRUNC_DIV_EXPR, ivar_type, ivar,
-			  fold_convert (ivar_type, collapse->iters));
-	  ivar = force_gimple_operand_gsi (gsi, expr, true, NULL_TREE,
-	   true, GSI_SAME_STMT);
-	}
 }
 }
 


[wwwdocs] Make it more obvious which releases are still supported

2016-09-30 Thread Jonathan Wakely

This adjusts the front page and the news page for each release to
clarify that past releases are no longer maintained.

OK for wwwdocs?


Index: htdocs/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.1026
diff -u -r1.1026 index.html
--- htdocs/index.html	25 Aug 2016 10:55:41 -	1.1026
+++ htdocs/index.html	30 Sep 2016 12:25:30 -
@@ -121,7 +121,7 @@
 
 
 
-Releases
+Supported Releases
 
 
 GCC 6.2
Index: htdocs/gcc-2.95/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-2.95/index.html,v
retrieving revision 1.5
diff -u -r1.5 index.html
--- htdocs/gcc-2.95/index.html	28 Jun 2014 11:59:43 -	1.5
+++ htdocs/gcc-2.95/index.html	30 Sep 2016 12:25:30 -
@@ -10,6 +10,8 @@
 March 16, 2001: The GNU project and the GCC developers are
 pleased to announce the release of GCC version 2.95.3.
 
+This release is no longer maintained.
+
 Release History
 
 
Index: htdocs/gcc-3.0/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.0/index.html,v
retrieving revision 1.17
diff -u -r1.17 index.html
--- htdocs/gcc-3.0/index.html	28 Jun 2014 07:45:10 -	1.17
+++ htdocs/gcc-3.0/index.html	30 Sep 2016 12:25:30 -
@@ -15,6 +15,8 @@
 a bug-fix release for the GCC 3.0 series.
 
 
+This release series is no longer maintained.
+
 GCC used to stand for the GNU C Compiler, but since the compiler
 supports several other languages aside from C, it now stands for the
 GNU Compiler Collection.
Index: htdocs/gcc-3.1/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.1/index.html,v
retrieving revision 1.6
diff -u -r1.6 index.html
--- htdocs/gcc-3.1/index.html	28 Jun 2014 07:45:11 -	1.6
+++ htdocs/gcc-3.1/index.html	30 Sep 2016 12:25:30 -
@@ -15,6 +15,8 @@
 
 The links below still apply to GCC 3.1.1.
 
+This release series is no longer maintained.
+
 May 15, 2002
 
 The http://www.gnu.org/";>GNU project and the GCC
Index: htdocs/gcc-3.2/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.2/index.html,v
retrieving revision 1.13
diff -u -r1.13 index.html
--- htdocs/gcc-3.2/index.html	28 Jun 2014 07:45:11 -	1.13
+++ htdocs/gcc-3.2/index.html	30 Sep 2016 12:25:30 -
@@ -25,6 +25,8 @@
 Please refer to our detailed list of news,
 caveats, and bug-fixes for further information.
 
+This release series is no longer maintained.
+
 Release History
 
 
Index: htdocs/gcc-3.3/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.3/index.html,v
retrieving revision 1.21
diff -u -r1.21 index.html
--- htdocs/gcc-3.3/index.html	28 Jun 2014 07:45:11 -	1.21
+++ htdocs/gcc-3.3/index.html	30 Sep 2016 12:25:30 -
@@ -23,6 +23,8 @@
 href="https://gcc.gnu.org/onlinedocs/gcc/Contributors.html";>amazing
 group of volunteers.
 
+This release series is no longer maintained.
+
 

Re: [wwwdocs] Make it more obvious which releases are still supported

2016-09-30 Thread Richard Biener
On Fri, Sep 30, 2016 at 2:26 PM, Jonathan Wakely  wrote:
> This adjusts the front page and the news page for each release to
> clarify that past releases are no longer maintained.
>
> OK for wwwdocs?

Ok.

Thanks,
Richard.

>


Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-09-30 Thread Rainer Orth
  .byte   0x4
.long   .LCFI26-.LCFI25
.byte   0xc
.byte   0x7
.byte   0x8
.align 8
.LEFDE17:
.LSFDE19:
.long   .LEFDE19-.LASFDE19
.LASFDE19:
.long   .LASFDE19-.Lframe1
.long   .LFB1054
.long   .LFE1054-.LFB1054
.byte   0
.byte   0x4
.long   .LCFI27-.LFB1054
.byte   0xe
.byte   0x10
.byte   0x86
.byte   0x2
.byte   0x4
.long   .LCFI28-.LCFI27
.byte   0xd
.byte   0x6
.byte   0x4
.long   .LCFI29-.LCFI28
.byte   0xc
.byte   0x7
.byte   0x8
.align 8
.LEFDE19:
.LSFDE21:
.long   .LEFDE21-.LASFDE21
.LASFDE21:
.long   .LASFDE21-.Lframe1
.long   .LFB1055
.long   .LFE1055-.LFB1055
.byte   0
.byte   0x4
.long   .LCFI30-.LFB1055
.byte   0xe
.byte   0x10
.byte   0x86
.byte   0x2
.byte   0x4
.long   .LCFI31-.LCFI30
.byte   0xd
.byte   0x6
.byte   0x4
.long   .LCFI32-.LCFI31
.byte   0xc
.byte   0x7
.byte   0x8
.align 8
.LEFDE21:
.LSFDE23:
.long   .LEFDE23-.LASFDE23
.LASFDE23:
.long   .LASFDE23-.Lframe1
.long   .LFB1056
.long   .LFE1056-.LFB1056
.byte   0
.byte   0x4
.long   .LCFI33-.LFB1056
.byte   0xe
.byte   0x10
.byte   0x86
.byte   0x2
.byte   0x4
.long   .LCFI34-.LCFI33
.byte   0xd
.byte   0x6
.byte   0x4
.long   .LCFI35-.LCFI34
.byte   0xc
.byte   0x7
.byte   0x8
.align 8
.LEFDE23:
.hidden __dso_handle
.ident  "GCC: (GNU) 7.0.0 20160930 (experimental) [trunk revision 240649]"
.section .text._ZN4TestC2Ev%_ZN4TestC5Ev,"ax",@progbits
_ZN4TestC5Ev:
.section .text._ZN4TestD2Ev%_ZN4TestD5Ev,"ax",@progbits
_ZN4TestD5Ev:


Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-09-30 Thread Martin Liška
On 09/30/2016 02:31 PM, Rainer Orth wrote:
> this would be i386-pc-solaris2.12.  I'm not sure if the constructor
> priority detection works in a cross scenario.
> 
> I'm attaching the resulting assembly (although for Solaris as, the gas
> build is still running).

Hi. Sorry, I made a stupid mistake in the dtor priority
(I used 65534 instead of the desired 99). Please try to test it on Solaris 12
with the attached patch. I'll send the patch to the ML soon.

Can you please test whether it makes any difference on a Solaris target w/o
prioritized ctors/dtors?

Thanks,
Martin
diff --git a/gcc/coverage.c b/gcc/coverage.c
index 0b8c0b3..a759831 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1078,7 +1078,7 @@ build_gcov_exit_decl (void)
   append_to_statement_list (stmt, &dtor);
 
   /* Generate a destructor to run it (with priority 99).  */
-  cgraph_build_static_cdtor ('D', dtor, DEFAULT_INIT_PRIORITY - 1);
+  cgraph_build_static_cdtor ('D', dtor, MAX_RESERVED_INIT_PRIORITY - 1);
 }
 
 /* Create the gcov_info types and object.  Generate the constructor


Re: [PATCH] Delete GCJ

2016-09-30 Thread Andrew Haley
On 30/09/16 11:27, Marek Polacek wrote:
> Can we move forward with this patch, then?

I've been travelling for several weeks.  However, I'm back at my desk
now, so I can move this forward.  I have all the approvals and
everybody has had time to respond.  However, I'll need to pull some
more recent changes into my tree and merge again.

Andrew.



Re: [PATCH, RFC] gcov: dump in a static dtor instead of in an atexit handler

2016-09-30 Thread Rainer Orth
Hi Martin,

> On 09/30/2016 02:31 PM, Rainer Orth wrote:
>> this would be i386-pc-solaris2.12.  I'm not sure if the constructor
>> priority detection works in a cross scenario.
>> 
>> I'm attaching the resulting assembly (although for Solaris as, the gas
>> build is still running).
>
> Hi. Sorry, I made a stupid mistake in the dtor priority
> (I used 65534 instead of the desired 99). Please try to test it on Solaris 12
> with the attached patch. I'll send the patch to the ML soon.
>
> Can you please test whether it makes any difference on a Solaris target w/o
> prioritized ctors/dtors?

sure: I've added your patch to my source tree for the running bootstraps
and will have builds on Solaris 12 (with constructor priority) and S10/11
(without) available in a few hours.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [Testsuite] Use correct effective-target settings for ARM fp16-aapcs tests.

2016-09-30 Thread Kyrill Tkachov


On 30/09/16 10:31, Matthew Wahab wrote:

The recently added tests gcc.target/arm/fp16-aapcs-{3,4}.c are intended
to check the behaviour of the ARM Alternative FP16 format. They both
check for compiler support of FP16 using dg-require-effective-target
arm_fp16_ok. This is too weak since the directive is true when
fp16-format=ieee is set, as it is when the +fp16 extension is
enabled.

This patch changes the directives for both tests to
  dg-require-effective-target arm_fp16_alternative_ok
which is only enabled when fp16-format=alternative is set.

For fp16-aapcs-4.c, it was also necessary to add the
-mfp16-format=alternative to the dg-options, rather than use the
arm_fp16-alternative add-options. There seems to be some interaction
between the different directives and the dg-skip-if, but I can't track
it down.

Tested for cross-compiled arm-none-eabi by running the
gcc.target/arm/arm.exp testsuite on an ARMv8.2-A emulator and on an
ARMv8-A emulator.

Ok for trunk?


Ok.
Thanks,
Kyrill


Matthew

testsuite/
2016-09-28  Matthew Wahab

* gcc.target/arm/fp16-aapcs-3.c: Replace the arm_fp16_ok with
arm_fp16_alternative_ok as the required effective target.
* gcc.target/arm/fp16-aapcs-4.c: Likewise.  Also add
-mfp16-format=alternative to the dg-options directive and remove
the dg-add-options directive.




Re: [PATCH] Define 3-argument overloads of std::hypot for C++17 (P0030R1)

2016-09-30 Thread Szabolcs Nagy
On 30/09/16 10:35, Szabolcs Nagy wrote:
> On 29/09/16 14:37, Andre Vieira (lists) wrote:
>>
>> On arm-none-eabi I'm seeing a failure for the long double type and inputs:
>> { 1e-2l, 1e-4l, 1e-4l, 0.0150004999375l }
>>
>> The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
>> reasonable difference? Should we raise toler3 to 1e-15?
>>
>> The last line is also too high:
>>   { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
>> Yields a frac of: 1.28198e-16
>>
>> Those are the only ones that pass the 1e-16 threshold.
>>
> 
> i think the tolerance should be defined in
> terms of LDBL_EPSILON (or numeric_limits).
> 
> n*LDBL_EPSILON tolerance would accept hypot
> with about n ulp error.
> 

now i see that there are huge ulp errors..

toler = 10*eps;

should work for all formats, but currently there
is >1000 ulp error on one of the double test cases..
so tolerance is carefully set to avoid triggering
the failure there

i'd set toler to 1*eps if this test case is not
for testing hypot quality.
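
A minimal sketch of such an epsilon-based check, assuming the test compares
a computed value against a known non-zero expected result; the helper name
`within_ulps` is made up here and is not part of the existing testcase.

```cpp
#include <cmath>
#include <limits>

// True if ACTUAL is within roughly N ulps of EXPECTED, i.e. the relative
// error is at most N * epsilon.  EXPECTED must be non-zero.
template<typename T>
bool within_ulps (T actual, T expected, int n)
{
  const T toler = n * std::numeric_limits<T>::epsilon ();
  const T frac = (actual - expected) / expected;
  return std::abs (frac) <= toler;
}
```

Because the tolerance scales with the epsilon of each type, the same check
can be used for float, double and long double without hand-picked constants
such as 1e-16.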



Re: [PATCH] Define 3-argument overloads of std::hypot for C++17 (P0030R1)

2016-09-30 Thread Jonathan Wakely

On 30/09/16 14:13 +0100, Szabolcs Nagy wrote:

On 30/09/16 10:35, Szabolcs Nagy wrote:

On 29/09/16 14:37, Andre Vieira (lists) wrote:


On arm-none-eabi I'm seeing a failure for the long double type and inputs:
{ 1e-2l, 1e-4l, 1e-4l, 0.0150004999375l }

The abs(frac) is higher than the toler: 1.73455e-16 vs 1e-16. Is that a
reasonable difference? Should we raise toler3 to 1e-15?

The last line is also too high:
  { 2147483647.l, 2147483647.l, 2147483647.l, 3719550785.027307813987l }
Yields a frac of: 1.28198e-16

Those are the only ones that pass the 1e-16 threshold.



i think the tolerance should be defined in
terms of LDBL_EPSILON (or numeric_limits).

n*LDBL_EPSILON tolerance would accept hypot
with about n ulp error.



now i see that there are huge ulp errors..

toler = 10*eps;

should work for all formats, but currently there
is >1000 ulp error on one of the double test cases..
so tolerance is carefully set to avoid triggering
the failure there

i'd set toler to 1*eps if this test case is not
for testing hypot quality.


Ed is already re-working the code, and probably the tests.

I'd prefer to wait for his new implementation, but if the FAIL is a
problem now then go ahead and adjust the tolerance.




[PATCHv2][GCC] Optimise the fpclassify builtin to perform integer operations when possible

2016-09-30 Thread Tamar Christina
Hi All,

This is v2 of the patch which adds an optimized route to the fpclassify builtin
for floating point numbers which are similar to IEEE-754 in format.

I have addressed most comments from everyone except for two things:

1) Providing a back-end hook to override the functionality. While certainly
   possible the current fpclassify doesn't provide this either. So I'd like to
   treat it as an enhancement rather than an issue.

2) Doing it in a lowering phase. If the general consensus is that this is the
   path the patch must take then I'd be happy to reconsider. However, at this
   point the patch does not seem to produce worse code than what there was before.

The goal is to make it faster by:
1. Trying to determine the most common case first
   (e.g. the float is a Normal number) and then the
   rest. The amount of code generated at -O2 is
   about the same +/- 1 instruction, but the code
   is much better.
2. Using integer operations in the optimized path.

At a high level, the optimized path uses integer operations
to perform the following:

  if (exponent bits aren't all set or unset)
    return Normal;
  else if (no bits are set on the number after masking out
           the sign bit)
    return Zero;
  else if (exponent has no bits set)
    return Subnormal;
  else if (mantissa has no bits set)
    return Infinite;
  else
    return NaN;

In case the optimization can't be applied the old
implementation is used as a fall-back.
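
As a hand-written illustration of that decision chain (not the code the
patch generates), here is a sketch for doubles. It assumes an IEEE 754
binary64 double with the same byte order as 64-bit integers; the function
name `classify_double` and the mask constants are chosen for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Classify X using only integer operations on its bit pattern.
int classify_double (double x)
{
  static_assert (sizeof (double) == sizeof (std::uint64_t),
                 "sketch expects a 64-bit double");
  std::uint64_t bits;
  std::memcpy (&bits, &x, sizeof bits);

  const std::uint64_t sign_mask = 0x8000000000000000ULL;
  const std::uint64_t exp_mask = 0x7FF0000000000000ULL;
  const std::uint64_t mant_mask = 0x000FFFFFFFFFFFFFULL;
  const std::uint64_t exp = bits & exp_mask;

  if (exp != 0 && exp != exp_mask)      // exponent neither all zeros nor all ones
    return FP_NORMAL;
  else if ((bits & ~sign_mask) == 0)    // nothing set apart from the sign bit
    return FP_ZERO;
  else if (exp == 0)                    // zero exponent, non-zero mantissa
    return FP_SUBNORMAL;
  else if ((bits & mant_mask) == 0)     // all-ones exponent, zero mantissa
    return FP_INFINITE;
  else
    return FP_NAN;
}
```

The most common case, a normal number, is decided by the first test alone,
which is the ordering the description above aims for.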

A limitation of this new approach is that the exponent
of the floating-point type has to fit in 31 bits and the type
has to have an IEEE-like format and values for NaN and INF
(e.g. for NaN and INF all bits of the exp must be set).

To determine this IEEE likeness a new boolean was added to real_format.

As an example, Aarch64 now generates for classification of doubles:

f:
fmov x1, d0
mov w0, 7
sbfx x2, x1, 52, 11
add w3, w2, 1
tst w3, 0x07FE
bne .L1
mov w0, 13
tst x1, 0x7fff
beq .L1
mov w0, 11
tbz x2, 0, .L1
tst x1, 0xf
mov w0, 3
mov w1, 5
csel w0, w0, w1, ne

.L1:
ret

No new tests, as there are existing tests to test the functionality.
glibc benchmarks were run against the builtin and show a 42.5%
performance gain on Aarch64.

Regression tests were run on aarch64-none-linux and arm-none-linux-gnueabi
with no regressions. x86 also has no regressions and modest gains (3%).

Ok for trunk?

Thanks,
Tamar

gcc/
2016-08-25  Tamar Christina  
Wilco Dijkstra  

* gcc/builtins.c (fold_builtin_fpclassify): Added optimized version. 
* gcc/real.h (real_format): Added is_ieee_compatible field.
* gcc/real.c (ieee_single_format): Set is_ieee_compatible flag.
(mips_single_format): Likewise.
(motorola_single_format): Likewise.
(spu_single_format): Likewise.
(ieee_double_format): Likewise.
(mips_double_format): Likewise.
(motorola_double_format): Likewise.
(ieee_extended_motorola_format): Likewise.
(ieee_extended_intel_128_format): Likewise.
(ieee_extended_intel_96_round_53_format): Likewise.
(ibm_extended_format): Likewise.
(mips_extended_format): Likewise.
(ieee_quad_format): Likewise.
(mips_quad_format): Likewise.
(vax_f_format): Likewise.
(vax_d_format): Likewise.
(vax_g_format): Likewise.
(decimal_single_format): Likewise.
(decimal_quad_format): Likewise.
(ieee_half_format): Likewise.
(mips_single_format): Likewise.
(arm_half_format): Likewise.
(real_internal_format): Likewise.


gcc/testsuite/
2016-09-27  Tamar Christina  

* gcc.target/aarch64/builtin-fpclassify.c: New codegen test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9a19a75cc8ed6edb5f543cd7bd26bcc0693e6ebb..1b4878c5ba098dcc0a4a506dbc7959d150cc9028 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7943,10 +7943,8 @@ static tree
 fold_builtin_fpclassify (location_t loc, tree *args, int nargs)
 {
   tree fp_nan, fp_infinite, fp_normal, fp_subnormal, fp_zero,
-arg, type, res, tmp;
+arg, type, res;
   machine_mode mode;
-  REAL_VALUE_TYPE r;
-  char buf[128];
 
   /* Verify the required arguments in the original call.  */
   if (nargs != 6
@@ -7966,14 +7964,164 @@ fold_builtin_fpclassify (location_t loc, tree *args, int nargs)
   arg = args[5];
   type = TREE_TYPE (arg);
   mode = TYPE_MODE (type);
-  arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
+  const real_format *format = REAL_MODE_FORMAT (mode);
+  const HOST_WIDE_INT type_width = TYPE_PRECISION (type);
+
+  /*
+  For IEEE 754 types:
+
+  fpclassify (x) ->
+   !((exp + 1) & (exp_mask & ~1)) // exponent bits not all set or unset
+	 ? (x & sign_mask == 0 ? FP_ZERO :
+	   (exp & exp_mask == exp_mask
+

Re: [PATCH] Reduce stack usage in sha512 (PR target/77308)

2016-09-30 Thread Bernd Edlinger
On 09/30/16 12:14, Bernd Edlinger wrote:
> Eric Botcazou wrote:
>>> A comment before the SETs and a testcase would be nice.  IIRC
>>> we do have stack size testcases via using -fstack-usage.
>>
>> Or -Wstack-usage, which might be more appropriate here.
>
> Yes, good idea.  I was not aware that we already have that kind of test.
>
> When trying to write this test, I noticed that I had not tried -Os so far.
> But for -Os the stack is still the unchanged 3500 bytes.
>
> However for embedded targets I am often inclined to use -Os, and
> would certainly not expect the stack to explode...
>
> I see in arm.md there are places like
>
>/* If we're optimizing for size, we prefer the libgcc calls.  */
>if (optimize_function_for_size_p (cfun))
>  FAIL;
>

Oh, yeah.  The comment is completely misleading.

If this pattern fails, expmed.c simply expands some
less efficient rtl, which also results in two shifts
and one or-op.  No libgcc calls at all.

So in simple cases without spilling the resulting
assembler is the same, regardless if this pattern
fails or not.  But the half-defined out registers
make a big difference when it has to be spilled.
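
To make the "two shifts and one or-op" shape concrete, here is a sketch in
plain C++ of a 64-bit logical right shift by a constant, carried out on two
32-bit words. It only models the LSHIFTRT case and is not the RTL that
arm_emit_coreregs_64bit_shift emits; the `Halves` struct and `lshr64` name
are invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>

struct Halves { std::uint32_t down, up; };  // low and high 32-bit words

// Logical right shift of a 64-bit value held as two words, 0 <= amount < 64.
Halves lshr64 (Halves in, unsigned amount)
{
  assert (amount < 64);
  Halves out = { 0, 0 };  // clear the whole result first
  if (amount == 0)
    out = in;
  else if (amount < 32)
    {
      // Low word: its own bits shifted down, OR-ed with the bits that
      // cross over from the high word.
      out.down = (in.down >> amount) | (in.up << (32 - amount));
      out.up = in.up >> amount;
    }
  else
    out.down = in.up >> (amount - 32);  // effectively a 32-bit shift; out.up stays 0
  return out;
}
```

For counts of 32 or more only the high input word contributes, which is why
those cases are effectively 32-bit shifts.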

>/* Expand operation using core-registers.
>   'FAIL' would achieve the same thing, but this is a bit smarter.  */
>scratch1 = gen_reg_rtx (SImode);
>scratch2 = gen_reg_rtx (SImode);
>arm_emit_coreregs_64bit_shift (LSHIFTRT, operands[0], operands[1],
>   operands[2], scratch1, scratch2);
>
>
> .. that explains why this happens.  I think it would be better to
> use the emit_coreregs for shift count >= 32, because these are
> effectively 32-bit shifts.
>
> Will try if that can be improved, and come back with the
> results.
>

The test case with -Os has 3520 bytes stack usage.
When only shift counts >= 32 are handled, we
still have 3000 bytes stack usage.
And when arm_emit_coreregs_64bit_shift is always
allowed to run, we have 2360 bytes stack usage.

Also for the code size it is better not to fail this
pattern.  So I propose to remove this exception in all
three expansions.

Here is an improved patch with the test case from the PR.
It also adds a comment on the redundant SET explaining why it is
better to clear the out register first.


Bootstrapped and reg-tested on arm-linux-gnueabihf.
Is it OK for trunk?


Thanks
Bernd.
2016-09-29  Bernd Edlinger  

	PR target/77308
	* config/arm/arm.c (arm_emit_coreregs_64bit_shift): Clear the result
	register explicitly.
	* config/arm/arm.md (ashldi3, ashrdi3, lshrdi3): Don't FAIL if
	optimizing for size.

testsuite:
2016-09-29  Bernd Edlinger  

	PR target/77308
	* gcc.target/arm/pr77308.c: New test.


Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	(revision 240645)
+++ gcc/config/arm/arm.c	(working copy)
@@ -29226,6 +29226,10 @@ arm_emit_coreregs_64bit_shift (enum rtx_code code,
 	  /* Shifts by a constant less than 32.  */
 	  rtx reverse_amount = GEN_INT (32 - INTVAL (amount));
 
+	  /* Clearing the out register in DImode first avoids lots
+	 of spilling and results in less stack usage.
+	 Later this redundant insn is completely removed.  */
+	  emit_insn (SET (out, const0_rtx));
 	  emit_insn (SET (out_down, LSHIFT (code, in_down, amount)));
 	  emit_insn (SET (out_down,
 			  ORR (REV_LSHIFT (code, in_up, reverse_amount),
@@ -29237,12 +29241,11 @@ arm_emit_coreregs_64bit_shift (enum rtx_code code,
 	  /* Shifts by a constant greater than 31.  */
 	  rtx adj_amount = GEN_INT (INTVAL (amount) - 32);
 
+	  emit_insn (SET (out, const0_rtx));
 	  emit_insn (SET (out_down, SHIFT (code, in_up, adj_amount)));
 	  if (code == ASHIFTRT)
 	emit_insn (gen_ashrsi3 (out_up, in_up,
 GEN_INT (31)));
-	  else
-	emit_insn (SET (out_up, const0_rtx));
 	}
 }
   else
Index: gcc/config/arm/arm.md
===
--- gcc/config/arm/arm.md	(revision 240645)
+++ gcc/config/arm/arm.md	(working copy)
@@ -4016,10 +4016,6 @@
  cheaper to have the alternate code being generated than moving
  values to iwmmxt regs and back.  */
 
-  /* If we're optimizing for size, we prefer the libgcc calls.  */
-  if (optimize_function_for_size_p (cfun))
-	FAIL;
-
   /* Expand operation using core-registers.
 	 'FAIL' would achieve the same thing, but this is a bit smarter.  */
   scratch1 = gen_reg_rtx (SImode);
@@ -4089,10 +4085,6 @@
  cheaper to have the alternate code being generated than moving
  values to iwmmxt regs and back.  */
 
-  /* If we're optimizing for size, we prefer the libgcc calls.  */
-  if (optimize_function_for_size_p (cfun))
-	FAIL;
-
   /* Expand operation using core-registers.
 	 'FAIL' would achieve the same thing, but this is a bit smarter.  */
   scratch1 = gen_reg_rtx (SImode);
@@ -4159,10 +4151,6 @@
  cheaper to have the alternate code being generated th

[PATCH] Remove use of std::abs in experimental::{gcd,lcm}

2016-09-30 Thread Jonathan Wakely

std::abs isn't suitable for these functions because we need something
that works in constexpr functions and with unsigned types.
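
A standalone sketch of the idea, assuming C++14 and using the made-up names
`abs_sketch` and `gcd_sketch` (the patch itself adds __abs inside the
library's own namespace): one overload for signed integers, one for unsigned,
both usable in constant expressions.

```cpp
#include <type_traits>

// Signed integers: negate negative values.
template<typename T>
constexpr std::enable_if_t<std::is_integral<T>::value
                           && std::is_signed<T>::value, T>
abs_sketch (T val)
{ return val < 0 ? -val : val; }

// Unsigned integers: already non-negative, return unchanged.
template<typename T>
constexpr std::enable_if_t<std::is_integral<T>::value
                           && std::is_unsigned<T>::value, T>
abs_sketch (T val)
{ return val; }

// Euclid's algorithm, evaluable at compile time.
template<typename M, typename N>
constexpr std::common_type_t<M, N>
gcd_sketch (M m, N n)
{
  return m == 0 ? abs_sketch (n)
       : n == 0 ? abs_sketch (m)
       : gcd_sketch (n, m % n);
}

static_assert (gcd_sketch (-12, 18) == 6, "signed inputs");
static_assert (gcd_sketch (12u, 18u) == 6u, "unsigned inputs");
```

The static_asserts only compile because every function involved is
constexpr, which is the property std::abs lacks.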

Also,  is supposed to include .

PR libstdc++/77801
* include/experimental/numeric: Include .
(__abs): Define.
(gcd, lcm): Use __abs instead of std::abs.
* testsuite/experimental/numeric/77801.cc: New test.
* testsuite/experimental/numeric/gcd.cc: Test unsigned inputs.
* testsuite/experimental/numeric/lcm.cc: Likewise.

Tested powerpc64le-linux, committed to trunk. Backport to gcc-6
coming too.

commit 49396c6b134c5994e6a7e5dff766dc8cf1290dc1
Author: Jonathan Wakely 
Date:   Fri Sep 30 14:20:10 2016 +0100

Remove use of std::abs in experimental::{gcd,lcm}

PR libstdc++/77801
* include/experimental/numeric: Include .
(__abs): Define.
(gcd, lcm): Use __abs instead of std::abs.
* testsuite/experimental/numeric/77801.cc: New test.
* testsuite/experimental/numeric/gcd.cc: Test unsigned inputs.
* testsuite/experimental/numeric/lcm.cc: Likewise.

diff --git a/libstdc++-v3/include/experimental/numeric b/libstdc++-v3/include/experimental/numeric
index 21878f3..5089772 100644
--- a/libstdc++-v3/include/experimental/numeric
+++ b/libstdc++-v3/include/experimental/numeric
@@ -39,8 +39,8 @@
 # include 
 #else
 
+#include 
 #include 
-#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -52,6 +52,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #define __cpp_lib_experimental_gcd_lcm 201411
 
+  // std::abs is not constexpr and doesn't support unsigned integers.
+  template<typename _Tp>
+constexpr
+enable_if_t<__and_<is_integral<_Tp>, is_signed<_Tp>>::value, _Tp>
+__abs(_Tp __val)
+{ return __val < 0 ? -__val : __val; }
+
+  template<typename _Tp>
+constexpr
+enable_if_t<__and_<is_integral<_Tp>, is_unsigned<_Tp>>::value, _Tp>
+__abs(_Tp __val)
+{ return __val; }
+
   // Greatest common divisor
   template<typename _Mn, typename _Nn>
 constexpr common_type_t<_Mn, _Nn>
@@ -60,8 +73,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static_assert(is_integral<_Mn>::value, "arguments to gcd are integers");
   static_assert(is_integral<_Nn>::value, "arguments to gcd are integers");
 
-  return __m == 0 ? std::abs(__n)
-   : __n == 0 ? std::abs(__m)
+  return __m == 0 ? fundamentals_v2::__abs(__n)
+   : __n == 0 ? fundamentals_v2::__abs(__m)
: fundamentals_v2::gcd(__n, __m % __n);
 }
 
@@ -74,7 +87,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   static_assert(is_integral<_Nn>::value, "arguments to lcm are integers");
 
   return (__m != 0 && __n != 0)
-   ? (std::abs(__m) / fundamentals_v2::gcd(__m, __n)) * std::abs(__n)
+   ? (fundamentals_v2::__abs(__m) / fundamentals_v2::gcd(__m, __n))
+ * fundamentals_v2::__abs(__n)
: 0;
 }
 
diff --git a/libstdc++-v3/testsuite/experimental/numeric/77801.cc 
b/libstdc++-v3/testsuite/experimental/numeric/77801.cc
new file mode 100644
index 000..c4c8bfb
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/numeric/77801.cc
@@ -0,0 +1,22 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do compile { target c++14 } }
+
+#include 
+#include 
+constexpr int i = std::experimental::gcd(4L, 5L); // PR libstdc++/77801
diff --git a/libstdc++-v3/testsuite/experimental/numeric/gcd.cc 
b/libstdc++-v3/testsuite/experimental/numeric/gcd.cc
index 038f12d..b3345dc 100644
--- a/libstdc++-v3/testsuite/experimental/numeric/gcd.cc
+++ b/libstdc++-v3/testsuite/experimental/numeric/gcd.cc
@@ -25,3 +25,7 @@ static_assert(lcm(21, 6) == 42, "");
 static_assert(lcm(41, 0) == 0, "LCD with zero is zero");
 static_assert(lcm(0, 7) == 0, "LCD with zero is zero");
 static_assert(lcm(0, 0) == 0, "no division by zero");
+
+static_assert(lcm(1u, 2) == 2, "unsigned and signed");
+static_assert(lcm(3, 4u) == 12, "signed and unsigned");
+static_assert(lcm(5u, 6u) == 30, "unsigned and unsigned");
diff --git a/libstdc++-v3/testsuite/experimental/numeric/lcm.cc 
b/libstdc++-v3/testsuite/experimental/numeric/lcm.cc
index 2c969b0..d90c152 100644
--- a/libstdc++-v3/testsuite/experimental/numeric/lcm.cc
+++ b/libstdc++-v3/testsuite/experimental/numeric/lcm.cc
@@ -29,3 +29,6 @@ static_assert( gcd(0, 13) == 13, "GCD of any number and 0 is 
that number" );
 

libgo patch committed: Copy locking code from Go 1.7 master library

2016-09-30 Thread Ian Lance Taylor
This patch to libgo copies the locking code from the Go 1.7 master
library, replacing C code for locking with Go code.  The result is
more efficient on most systems, though it is essentially the same on
GNU/Linux.

Add a shell script mkrsysinfo.sh to generate the runtime_sysinfo.go
file, so that we can get Go copies of the system time structures and
other types.

As part of this change, tweak the compiler so that when compiling the
runtime package the address operator does not cause local variables to
escape.  When the gc compiler compiles the runtime, an escaping local
variable is treated as an error.  We should implement that, instead of
this change, when escape analysis is turned on.  This is for
correctness: there are places in the runtime where we must not
allocate memory unintentionally.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu and
i386-sun-solaris.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 240609)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-e51657a576367c7a498c94baf985b79066fc082a
+f3fb9bf2d5a009a707962a416fcd1a8435756218
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 240453)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -3787,6 +3787,13 @@ Unary_expression::do_flatten(Gogo* gogo,
   if ((n->encoding() & ESCAPE_MASK) == int(Node::ESCAPE_NONE))
this->escapes_ = false;
 
+  // When compiling the runtime, the address operator does not
+  // cause local variables to escapes.  When escape analysis
+  // becomes the default, this should be changed to make it an
+  // error if we have an address operator that escapes.
+  if (gogo->compiling_runtime() && gogo->package_name() == "runtime")
+   this->escapes_ = false;
+
   Named_object* var = NULL;
   if (this->expr_->var_expression() != NULL)
var = this->expr_->var_expression()->named_object();
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 240559)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -4480,6 +4480,19 @@ Gogo::write_c_header()
++p)
 {
   Named_object* no = *p;
+
+  // Skip names that start with underscore followed by something
+  // other than an uppercase letter, as when compiling the runtime
+  // package they are mostly types defined by mkrsysinfo.sh based
+  // on the C system header files.  We don't need to translate
+  // types to C and back to Go.  But do accept the special cases
+  // _defer and _panic.
+  std::string name = Gogo::unpack_hidden_name(no->name());
+  if (name[0] == '_'
+ && (name[1] < 'A' || name[1] > 'Z')
+ && (name != "_defer" && name != "_panic"))
+   continue;
+
   if (no->is_type() && no->type_value()->struct_type() != NULL)
types.push_back(no);
   if (no->is_const() && no->const_value()->type()->integer_type() != NULL)
Index: libgo/Makefile.am
===
--- libgo/Makefile.am   (revision 240588)
+++ libgo/Makefile.am   (working copy)
@@ -396,9 +396,9 @@ rtems_task_variable_add_file =
 endif
 
 if LIBGO_IS_LINUX
-runtime_lock_files = runtime/lock_futex.c runtime/thread-linux.c
+runtime_thread_files = runtime/thread-linux.c
 else
-runtime_lock_files = runtime/lock_sema.c runtime/thread-sema.c
+runtime_thread_files = runtime/thread-sema.c
 endif
 
 if LIBGO_IS_LINUX
@@ -502,7 +502,6 @@ runtime_files = \
runtime/go-varargs.c \
runtime/env_posix.c \
runtime/heapdump.c \
-   $(runtime_lock_files) \
runtime/mcache.c \
runtime/mcentral.c \
$(runtime_mem_file) \
@@ -518,6 +517,7 @@ runtime_files = \
runtime/runtime.c \
runtime/signal_unix.c \
runtime/thread.c \
+   $(runtime_thread_files) \
runtime/yield.c \
$(rtems_task_variable_add_file) \
chan.c \
@@ -633,12 +633,8 @@ s-version: Makefile
$(STAMP) $@
 
 runtime_sysinfo.go: s-runtime_sysinfo; @true
-s-runtime_sysinfo: sysinfo.go
-   rm -f tmp-runtime_sysinfo.go
-   echo 'package runtime' > tmp-runtime_sysinfo.go
-   echo >> tmp-runtime_sysinfo.go
-   grep 'const _sizeof_ucontext_t ' sysinfo.go >> tmp-runtime_sysinfo.go
-   grep 'type _sigset_t ' sysinfo.go >> tmp-runtime_sysinfo.go
+s-runtime_sysinfo: $(srcdir)/mkrsysinfo.sh gen-sysinfo.go
+   $(SHELL) $(srcdir)/mkrsysinfo.sh
$(SHELL) $(srcdir)/mvifdiff.sh tmp-runtime_sysinfo.go runtime_sysinfo.go
$(STAMP) $@
 
Index: libgo/configure.ac
===

Re:[PATCH] [ARC] Disable compact casesi patterns for arcv2

2016-09-30 Thread Claudiu Zissulescu
Please find the updated patch,
Claudiu

gcc/
2016-05-09  Claudiu Zissulescu  

* common/config/arc/arc-common.c (arc_option_optimization_table):
Remove compact casesi option.
* config/arc/arc.c (arc_override_options): Use compact casesi
option only for pre-ARCv2 cores.
* doc/invoke.texi (mcompact-casesi): Update text.
---
 gcc/common/config/arc/arc-common.c |  1 -
 gcc/config/arc/arc.c   | 11 +++
 gcc/doc/invoke.texi|  4 ++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/gcc/common/config/arc/arc-common.c 
b/gcc/common/config/arc/arc-common.c
index f5b9c6d..5b687fb 100644
--- a/gcc/common/config/arc/arc-common.c
+++ b/gcc/common/config/arc/arc-common.c
@@ -56,7 +56,6 @@ static const struct default_options 
arc_option_optimization_table[] =
 { OPT_LEVELS_ALL, OPT_mbbit_peephole, NULL, 1 },
 { OPT_LEVELS_SIZE, OPT_mq_class, NULL, 1 },
 { OPT_LEVELS_SIZE, OPT_mcase_vector_pcrel, NULL, 1 },
-{ OPT_LEVELS_SIZE, OPT_mcompact_casesi, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
 
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 2b25b0b..5e8d6b4 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -858,6 +858,17 @@ arc_override_options (void)
   if (arc_size_opt_level == 3)
 optimize_size = 1;
 
+  /* Compact casesi is not a valid option for ARCv2 family.  */
+  if (TARGET_V2
+  && TARGET_COMPACT_CASESI)
+{
+  warning (0, "compact-casesi is not applicable to ARCv2");
+  TARGET_COMPACT_CASESI = 0;
+}
+  else if (optimize_size == 1
+  && !global_options_set.x_TARGET_COMPACT_CASESI)
+TARGET_COMPACT_CASESI = 1;
+
   if (flag_pic)
 target_flags |= MASK_NO_SDATA_SET;
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6767462..05f565d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14255,8 +14255,8 @@ This is the default for @option{-Os}.
 
 @item -mcompact-casesi
 @opindex mcompact-casesi
-Enable compact casesi pattern.
-This is the default for @option{-Os}.
+Enable compact casesi pattern.  This is the default for @option{-Os},
+and only available for ARCv1 cores.
 
 @item -mno-cond-exec
 @opindex mno-cond-exec
-- 
1.9.1
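
The option handling added to arc_override_options can be read as a small decision function. The sketch below mirrors the control flow of the hunk as posted, using hypothetical names (casesi_result, resolve_compact_casesi) and plain bools in place of the GCC option variables; it is an illustration, not the ARC backend code.

```cpp
// Hypothetical result type: the final option value and whether the
// "not applicable to ARCv2" warning would have been issued.
struct casesi_result
{
  bool enabled;
  bool warned;
};

// Mirrors the if / else-if chain in the posted hunk.
//   is_arcv2       stands in for TARGET_V2
//   option_value   stands in for TARGET_COMPACT_CASESI on entry
//   option_set     stands in for global_options_set.x_TARGET_COMPACT_CASESI
//   optimize_size  stands in for optimize_size == 1
inline casesi_result
resolve_compact_casesi (bool is_arcv2, bool option_value,
                        bool option_set, bool optimize_size)
{
  if (is_arcv2 && option_value)
    return { false, true };      // warn and force the option off
  else if (optimize_size && !option_set)
    return { true, false };      // size optimization turns it on by default
  return { option_value, false };
}
```

Read this way, an ARCv2 target with size optimization and no explicit option (and assuming the option is initially off) falls through to the second branch, so the two conditions are not mutually exclusive on the target family.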



[gomp4] remove some tile tests

2016-09-30 Thread Nathan Sidwell
In implementing tile I discovered two existing runtime tile tests.  These were 
only passing because tile was completely ignored.  One of them exposes a latent 
bug in collapse, in my WIP, but I don't want to get distracted by that right now.


Better to have tile tests test tile tiles.

nathan
2016-09-30  Nathan Sidwell  

	* testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.c: Remove
	tile test.
	* testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.h: Likewise.
	* testsuite/libgomp.oacc-fortran/kernels-loop-1.f90: Likewise.

Index: testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.c
===
--- testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.c	(revision 240654)
+++ testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.c	(working copy)
@@ -29,12 +29,10 @@ main ()
   || test_none_auto ()
   || test_none_independent ()
   || test_none_seq ()
-  || test_none_tile ()
   || test_gangs_none ()
   || test_gangs_auto ()
   || test_gangs_independent ()
-  || test_gangs_seq ()
-  || test_gangs_tile ())
+  || test_gangs_seq ())
 abort ();
   return 0;
 }
Index: testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.h
===
--- testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.h	(revision 240654)
+++ testsuite/libgomp.oacc-c-c++-common/parallel-loop-1.h	(working copy)
@@ -18,8 +18,3 @@
 #include "parallel-loop-2.h"
 #undef S
 #undef N
-#define S tile(*)
-#define N(x) M(x, G, tile)
-#include "parallel-loop-2.h"
-#undef S
-#undef N
Index: testsuite/libgomp.oacc-fortran/kernels-loop-1.f90
===
--- testsuite/libgomp.oacc-fortran/kernels-loop-1.f90	(revision 240654)
+++ testsuite/libgomp.oacc-fortran/kernels-loop-1.f90	(working copy)
@@ -54,17 +54,6 @@ program loops
 
   call check (a, b, n)
 
-  ! PRESENT_OR_COPY
-
-  !$acc kernels pcopy (a)
-  !$acc loop tile (*)
-  do i = 1, n
- a(i) = i
-  end do
-  !$acc end kernels
-
-  call check (a, b, n)
-
 end program loops
 
 subroutine check (a, b, n)


Re: [PATCH][v4] GIMPLE store merging pass

2016-09-30 Thread Kyrill Tkachov

Hi Richard,

On 29/09/16 11:45, Richard Biener wrote:



+
+ /* In some cases get_inner_reference may return a
+MEM_REF [ptr + byteoffset].  For the purposes of this pass
+canonicalize the base_addr to MEM_REF [ptr] and take
+byteoffset into account in the bitpos.  This occurs in
+PR 23684 and this way we can catch more chains.  */
+ if (TREE_CODE (base_addr) == MEM_REF
+ && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (base_addr, 0)))
+ && TREE_CODE (TREE_OPERAND (base_addr, 1)) == INTEGER_CST

This is always an INTEGER_CST.


+ && tree_fits_shwi_p (TREE_OPERAND (base_addr, 1))

This will never allow negative offsets (but maybe this is a good thing?)

)

+   {
+ bitpos += tree_to_shwi (TREE_OPERAND (base_addr, 1))
+   * BITS_PER_UNIT;

this multiplication may overflow.  There is mem_ref_offset () which
you should really use here, see get_inner_reference itself (and
how to translate back from offset_int to HOST_WIDE_INT if it fits).


+
+ base_addr = fold_build2 (MEM_REF, TREE_TYPE (base_addr),
+  TREE_OPERAND (base_addr, 0),
+  build_zero_cst (TREE_TYPE (
+TREE_OPERAND (base_addr, 1))));

Ugh, building a tree node ... you could use TREE_OPERAND (base_addr, 0)
as base_addr instead?


This didn't work for me because aliasing info was lost.
So in the example:
void
foo2 (struct bar *p, struct bar *p2)
{
  p->b = 0xff;
  p2->b = 0xa;
  p->a = 0xf;
  p2->c = 0xc;
  p->c = 0xff;
  p2->d = 0xbf;
  p->d = 0xfff;
}

we end up merging p->b with p->a even though the p2->b store may alias.
We'll record the base objects as being 'p' and 'p2' whereas with my approach
we record them as '*p' and '*p2'. I don't suppose I could just do:
TREE_OPERAND (base_addr, 1) = build_zero_cst (TREE_TYPE (TREE_OPERAND 
(base_addr, 1)));
?
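
The aliasing hazard in the foo2 example can be shown with a small standalone sketch. The struct layout below is a hypothetical stand-in for struct bar (its definition is not given in this thread), and the second function is only one possible way an incorrect merge could reorder the stores.

```cpp
#include <cstdint>

// Hypothetical layout; the real struct bar is not shown in the thread.
struct bar
{
  std::uint8_t a, b, c, d;
};

// The first three stores of foo2 in source order.
inline void
stores_in_order (bar *p, bar *p2)
{
  p->b = 0xff;
  p2->b = 0xa;
  p->a = 0xf;
}

// What an invalid merge of p->b with p->a could look like: the p->b store
// has effectively moved past the p2->b store to sit next to p->a.
inline void
stores_merged_wrongly (bar *p, bar *p2)
{
  p2->b = 0xa;
  p->a = 0xf;
  p->b = 0xff;
}
```

When p and p2 point to different objects both versions leave memory in the same state, but when p == p2 the first leaves b == 0xa and the second leaves b == 0xff, which is why the p2->b store has to terminate the chain based on p.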

Thanks,
Kyrill




+   }
+
+ struct imm_store_chain_info **chain_info
+   = m_stores.get (base_addr);
+
+ if (!invalid)
+   {
+ store_immediate_info *info;
+ if (chain_info)
+   {
+ info = new store_immediate_info (
+   bitsize, bitpos, rhs, lhs, stmt,
+   (*chain_info)->m_store_info.length ());
+ if (dump_file)
+   {
+ fprintf (dump_file,
+  "Recording immediate store from stmt:\n");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+   }
+ (*chain_info)->m_store_info.safe_push (info);
+ continue;
+   }
+
+ /* Store aliases any existing chain?  */
+ terminate_all_aliasing_chains (lhs, base_addr, stmt);
+
+ /* Start a new chain.  */
+ struct imm_store_chain_info *new_chain
+   = new imm_store_chain_info;
+ info = new store_immediate_info (bitsize, bitpos, rhs, lhs,
+  stmt, 0);
+ new_chain->m_store_info.safe_push (info);
+ m_stores.put (base_addr, new_chain);
+ if (dump_file)
+   {
+ fprintf (dump_file,
+  "Starting new chain with statement:\n");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ fprintf (dump_file, "The base object is:\n");
+ print_generic_expr (dump_file, base_addr, 0);
+ fprintf (dump_file, "\n");
+   }
+   }
+ else
+   terminate_all_aliasing_chains (lhs, base_addr, stmt);
+
+ continue;
+   }
+
+ terminate_all_aliasing_chains (NULL_TREE, NULL_TREE, stmt);
+   }
+  terminate_and_process_all_chains (bb);
+}
+  return 0;
+}
+
+} // anon namespace
+
+/* Construct and return a store merging pass object.  */
+
+gimple_opt_pass *
+make_pass_store_merging (gcc::context *ctxt)
+{
+  return new pass_store_merging (ctxt);
+}
diff --git a/gcc/opts.c b/gcc/opts.c
index 45f1f89c..e63d7e4 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -463,6 +463,7 @@ static const struct default_options default_options_table[] 
=
  { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
  { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
  { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
+{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fstore_merging, NULL, 1 },

Please leave it to -O[2s]+ -- the chain invalidation is quadratic and
-O1 should work well even for gigantic basic blocks.

Overall the pass looks quite well with the 

Re: [PATCH][v4] GIMPLE store merging pass

2016-09-30 Thread Kyrill Tkachov


On 30/09/16 15:36, Kyrill Tkachov wrote:

Hi Richard,

On 29/09/16 11:45, Richard Biener wrote:



+
+  /* In some cases get_inner_reference may return a
+ MEM_REF [ptr + byteoffset].  For the purposes of this pass
+ canonicalize the base_addr to MEM_REF [ptr] and take
+ byteoffset into account in the bitpos.  This occurs in
+ PR 23684 and this way we can catch more chains.  */
+  if (TREE_CODE (base_addr) == MEM_REF
+  && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (base_addr, 0)))
+  && TREE_CODE (TREE_OPERAND (base_addr, 1)) == INTEGER_CST

This is always an INTEGER_CST.


+  && tree_fits_shwi_p (TREE_OPERAND (base_addr, 1))

This will never allow negative offsets (but maybe this is a good thing?)

)

+{
+  bitpos += tree_to_shwi (TREE_OPERAND (base_addr, 1))
+* BITS_PER_UNIT;

this multiplication may overflow.  There is mem_ref_offset () which
you should really use here, see get_inner_reference itself (and
how to translate back from offset_int to HOST_WIDE_INT if it fits).
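
As a generic illustration of the overflow concern (not of mem_ref_offset or offset_int, which are the GCC-internal way to do this), the sketch below converts a byte offset to bits and adds it to a bit position with explicit overflow checks. The function name is hypothetical, CHAR_BIT stands in for BITS_PER_UNIT, and the __builtin_*_overflow functions are GCC/Clang builtins.

```cpp
#include <climits>

// Hypothetical helper: computes bitpos + byte_offset * CHAR_BIT.
// Returns false (leaving *result unspecified) if either the multiplication
// or the addition would overflow long long.
inline bool
add_byte_offset_to_bitpos (long long bitpos, long long byte_offset,
                           long long *result)
{
  long long bits;
  if (__builtin_mul_overflow (byte_offset, (long long) CHAR_BIT, &bits))
    return false;
  return !__builtin_add_overflow (bitpos, bits, result);
}
```

A caller would treat a false return the same way as any other case where the offset cannot be represented, for example by not canonicalizing the base address.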


+
+  base_addr = fold_build2 (MEM_REF, TREE_TYPE (base_addr),
+   TREE_OPERAND (base_addr, 0),
+   build_zero_cst (TREE_TYPE (
+ TREE_OPERAND (base_addr, 1))));

Ugh, building a tree node ... you could use TREE_OPERAND (base_addr, 0)
as base_addr instead?


This didn't work for me because aliasing info was lost.
So in the example:
void
foo2 (struct bar *p, struct bar *p2)
{
  p->b = 0xff;
  p2->b = 0xa;
  p->a = 0xf;
  p2->c = 0xc;
  p->c = 0xff;
  p2->d = 0xbf;
  p->d = 0xfff;
}

we end up merging p->b with p->a even though the p2->b store may alias.
We'll record the base objects as being 'p' and 'p2' whereas with my approach
we record them as '*p' and '*p2'. I don't suppose I could just do:
TREE_OPERAND (base_addr, 1) = build_zero_cst (TREE_TYPE (TREE_OPERAND 
(base_addr, 1)));
?



Although I think I could try to make it work by using ptr_derefs_may_alias_p in 
the alias checks
a bit more. I'll see what I can do.

Kyrill


Thanks,
Kyrill




+}
+
+  struct imm_store_chain_info **chain_info
+= m_stores.get (base_addr);
+
+  if (!invalid)
+{
+  store_immediate_info *info;
+  if (chain_info)
+{
+  info = new store_immediate_info (
+bitsize, bitpos, rhs, lhs, stmt,
+(*chain_info)->m_store_info.length ());
+  if (dump_file)
+{
+  fprintf (dump_file,
+   "Recording immediate store from stmt:\n");
+  print_gimple_stmt (dump_file, stmt, 0, 0);
+}
+  (*chain_info)->m_store_info.safe_push (info);
+  continue;
+}
+
+  /* Store aliases any existing chain?  */
+  terminate_all_aliasing_chains (lhs, base_addr, stmt);
+
+  /* Start a new chain.  */
+  struct imm_store_chain_info *new_chain
+= new imm_store_chain_info;
+  info = new store_immediate_info (bitsize, bitpos, rhs, lhs,
+   stmt, 0);
+  new_chain->m_store_info.safe_push (info);
+  m_stores.put (base_addr, new_chain);
+  if (dump_file)
+{
+  fprintf (dump_file,
+   "Starting new chain with statement:\n");
+  print_gimple_stmt (dump_file, stmt, 0, 0);
+  fprintf (dump_file, "The base object is:\n");
+  print_generic_expr (dump_file, base_addr, 0);
+  fprintf (dump_file, "\n");
+}
+}
+  else
+terminate_all_aliasing_chains (lhs, base_addr, stmt);
+
+  continue;
+}
+
+  terminate_all_aliasing_chains (NULL_TREE, NULL_TREE, stmt);
+}
+  terminate_and_process_all_chains (bb);
+}
+  return 0;
+}
+
+} // anon namespace
+
+/* Construct and return a store merging pass object.  */
+
+gimple_opt_pass *
+make_pass_store_merging (gcc::context *ctxt)
+{
+  return new pass_store_merging (ctxt);
+}
diff --git a/gcc/opts.c b/gcc/opts.c
index 45f1f89c..e63d7e4 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -463,6 +463,7 @@ static const struct default_options default_options_table[] 
=
  { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 },
  { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 },
  { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 },
+{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fstore_merging, NULL, 1 },

Please leave it to -O[2s]+ -- the chain invalidation is quadratic and
-O1 should work well even for gigantic basic blocks.

Overall the pass looks quite well with the comments addressed.

Thanks,
Richard.


  { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 },
  { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 },
  { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 },
diff --git a/gcc/params.def b/gcc/params.def
index 8907aa4..e63e594 100644
--- a/gcc/par

Re: [PATCH 2/2] Extend -falign-FOO=N to N[,M[,N2[,M2]]]

2016-09-30 Thread Denys Vlasenko

On 09/30/2016 01:20 PM, Bernd Schmidt wrote:

On 09/29/2016 07:32 PM, Denys Vlasenko wrote:

On 09/29/2016 04:45 PM, Bernd Schmidt wrote:

On 09/28/2016 02:57 PM, Denys Vlasenko wrote:

-  /* Comes from final.c -- no real reason to change it.  */
-#define MAX_CODE_ALIGN 16
-
 case OPT_malign_loops_:
   warning_at (loc, 0, "-malign-loops is obsolete, use
-falign-loops");
-  if (value > MAX_CODE_ALIGN)
-error_at (loc, "-malign-loops=%d is not between 0 and %d",
-  value, MAX_CODE_ALIGN);
-  else
-opts->x_align_loops = 1 << value;
   return true;


That does seem to be a functional change. I'll defer to Uros.


It would be awkward to translate -malign-loops=%d et al
to comma-separated string format.
Since this warning is there for some 15 years already,
anyone who actually cares should have converted to new options
long ago.


Hmm, if it's been 15 years, maybe it's time to remove these. Could you submit a 
patch separately?


Sure.


-  if (opts->x_align_functions <= 0)
+  if (opts->x_flag_align_functions && !opts->x_str_align_functions)


Are these conditions really equivalent? It looks like zero was
the default even when no -falign-functions was specified.
 Or is that overridden by init_alignments?


 {
-  if (opts->x_align_loops == 0)
+  /* -falign-foo without argument: supply one */
+  if (opts->x_flag_align_loops && !opts->x_str_align_loops)


Same here.


The execution flow for option parsing is somewhat convoluted, no doubt.

I found experimentally that these are locations where
default alignment parameters are set when -falign-functions
is given with no arguments (or when it is implied by -O2).


I applied your latest two patches to experiment with them, and I see different
 behaviour before and after on x86_64-linux. There seems to be a difference
 in function alignment and label alignment at -O2.


Let me try harder, I was only checking -ffunction-alignment...

My test program:

int g();
int f(int i)
{
i *= 3;
while (--i > 100) {
 L1:
if (g())
goto L1;
if (g())
goto L2;
}
return i;
 L2:
return 123;
}

Before-and-after "gcc -O2 -S" assembly (after the patch is on the right):

.text                       | .text
.p2align 4,,15              | .p2align 4,,15
.globl  f                   | .p2align 3
.type   f, @function        | .globl  f
f:                          | .type   f, @function
.LFB0:                      | f:
.cfi_startproc              | .LFB0:
pushq   %rbx                | .cfi_startproc
.cfi_def_cfa_offset 16      | pushq   %rbx
.cfi_offset 3, -16          | .cfi_def_cfa_offset 16
leal    (%rdi,%rdi,2), %ebx | .cfi_offset 3, -16
.p2align 4,,10              | leal    (%rdi,%rdi,2), %ebx
.p2align 3                  | .p2align 4,,10
.L2:                        | .L2:
subl    $1, %ebx            | subl    $1, %ebx
cmpl    $100, %ebx          | cmpl    $100, %ebx
jle     .L1                 | jle     .L1
.p2align 4,,10              | .p2align 4,,10
.p2align 3                  | .L3:
.L3:                        | xorl    %eax, %eax
xorl    %eax, %eax          | call    g
call    g                   | testl   %eax, %eax
testl   %eax, %eax          | jne     .L3
jne     .L3                 | call    g
call    g                   | testl   %eax, %eax
testl   %eax, %eax          | je      .L2
je      .L2                 | movl    $123, %ebx
movl    $123, %ebx          | .L4:
.L4:                        | .L1:
.L1:                        | movl    %ebx, %eax
movl    %ebx, %eax          | popq    %rbx
popq    %rbx                | .cfi_def_cfa_offset 8
.cfi_def_cfa_offset 8       | ret
ret                         | .cfi_endproc
.cfi_endproc                | .LFE0:
.LFE0:                      |


Yes, I see differences. ".p2align 3" appeared in function alignment.
The reason is that old code had an optimization - it noticed that
".p2align 4,,15" _always_ aligns (because 2^4=15+1), thus ".p2align 3"
is superfluous. My patch doesn't do that. I fixed this already
in the next version of the patch I'm going to send.
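
The reasoning about ".p2align 4,,15" can be checked with a small model of the directive. This is a sketch under the assumption that ".p2align N,,MAX" pads to the next 2^N boundary only when that needs at most MAX bytes; the function names are hypothetical.

```cpp
// Hypothetical model of ".p2align log2_align,,max_skip": returns the
// address after the directive.  If more than max_skip padding bytes would
// be needed, the directive does nothing.
inline unsigned long
p2align (unsigned long addr, unsigned log2_align, unsigned long max_skip)
{
  unsigned long align = 1ul << log2_align;
  unsigned long pad = (align - addr % align) % align;
  return pad <= max_skip ? addr + pad : addr;
}

// True if, for every start address in [0, limit), a ".p2align 3" placed
// after ".p2align 4,,first_max_skip" never moves the address.
inline bool
second_align_is_redundant (unsigned long first_max_skip, unsigned long limit)
{
  for (unsigned long addr = 0; addr < limit; ++addr)
    {
      unsigned long after_first = p2align (addr, 4, first_max_skip);
      if (p2align (after_first, 3, 7) != after_first)
        return false;
    }
  return true;
}
```

With a max skip of 15 the first directive always reaches a 16-byte boundary, so the following ".p2align 3" never pads. With a max skip of 10 the first directive is sometimes skipped, and the ".p2align 3" then still provides 8-byte alignment, which is why that pair is not redundant.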

The other difference is that ".p2align 4,,10" is no longer followed
by ".p2align 3". Well... this one is harder to make happen.
It comes from here in gcc/final.c:

case CODE_LABEL:
  /* The target port might emit labels in the output function for
 some insn, e.g. sh.c output_branchy

Re: [PATCH] Delete GCJ

2016-09-30 Thread Andrew Haley
On 05/09/16 17:25, Gerald Pfeifer wrote:
> And here is the patch for the web pages.
> 
> Note I did not include all the removed java/* contents.  Is there
> anything particular you'd like to retain there?

No, please delete it all.

Thanks,

Andrew.



Re: Use version namespace in normal mode

2016-09-30 Thread Jonathan Wakely

On 29/09/16 21:59 +0200, François Dumont wrote:

Hi

   I think _GLIBCXX_BEGIN_NAMESPACE_ALGO should default to 
_GLIBCXX_BEGIN_NAMESPACE_VERSION when parallel mode is not active. 
Same for _GLIBCXX_BEGIN_NAMESPACE_CONTAINER, no ?


Hmm, yes, I think this is correct, otherwise we're missing the VERSION
namespace in normal mode. But it seems we've always been missing it
since those macros were introduced in GCC 4.6 so I'd like to
investigate the consequences for gnu-versioned-namespace more
carefully.


   * include/bits/c++config (_GLIBCXX_BEGIN_NAMESPACE_ALGO)
   (_GLIBCXX_END_NAMESPACE_ALGO): Default to respectively
   _GLIBCXX_BEGIN_NAMESPACE_VERSION and _GLIBCXX_END_NAMESPACE_VERSION
   when parallel mode is not active.
   (_GLIBCXX_BEGIN_NAMESPACE_CONTAINER, _GLIBCXX_END_NAMESPACE_CONTAINER):
   Likewise.

   Ok to commit after normal check ? Should I rebuild library with 
versioned namespace activated ?


Any change affecting the NAMESPACE_VERSION macros should be tested
with the gnu-versioned-namespace, since those macros are only used for
that mode.

But it can't be tested currently, see PR 77794. So let's wait until
I've fixed 77794, when we can test it.
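
For readers unfamiliar with these macros, the proposed defaulting can be sketched with hypothetical stand-ins. The DEMO_ macro names, the namespace names v7 and algo_impl, and the assumption that the version namespace is active are all made up for this illustration; the real definitions live in include/bits/c++config and differ in detail.

```cpp
// Hypothetical stand-ins for the versioning macros (assumed active here).
#define DEMO_BEGIN_NAMESPACE_VERSION inline namespace v7 {
#define DEMO_END_NAMESPACE_VERSION }

#ifdef DEMO_PARALLEL
// Parallel mode picks its own expansion; it is not modelled in this sketch.
# define DEMO_BEGIN_NAMESPACE_ALGO namespace algo_impl {
# define DEMO_END_NAMESPACE_ALGO }
#else
// Proposed behaviour: outside parallel mode, fall back to the VERSION macros.
# define DEMO_BEGIN_NAMESPACE_ALGO DEMO_BEGIN_NAMESPACE_VERSION
# define DEMO_END_NAMESPACE_ALGO DEMO_END_NAMESPACE_VERSION
#endif

namespace demo
{
DEMO_BEGIN_NAMESPACE_ALGO
  // Stands in for an algorithm declared between the ALGO macros.
  inline int twice (int x) { return 2 * x; }
DEMO_END_NAMESPACE_ALGO
}
```

In this sketch demo::twice is reachable both as demo::twice and as demo::v7::twice, so its mangled name carries the version namespace, which is the effect the change would have on algorithms and containers when the versioned namespace is enabled.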



Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Martin Sebor

I haven't been following the discussion very closely so I may have
missed that what I'm about to suggest has been discussed and rejected
for some valid reason, but if not let me try.

It seems to me that the ultimate, long term goal should be to have
actively maintained code bases gradually migrate away from the
various fallthrough comments and to the new attribute.  Under that
premise, I think introducing a warning that's on the permissive end
of the spectrum (say outside of -Wall, or even outside of -Wextra,
and/or disabling the checker at the first sight of a comment) would
obviate the concern of needlessly breaking working code and let
users start adopting the warning on their own schedules.  The next
major release of GCC after 7 could increase the sensitivity of the
warning (e.g., by adding it to -Wextra, and/or checking for the words
fall through in the comments, etc.), and the next one could make it
even more strict.  With GCC's one year release cycle this approach
would give even users who adopt the latest compiler two to three
years to make the transition.

Martin


Re: [PATCH] fixincludes: fix fixincludes for MinGW

2016-09-30 Thread Bruce Korb
Hi Tadek,

Looks good to me.  Thank you.
Clear to send (push).


Re: [PATCH] Fix -Wimplicit-fallthrough -C, handle some more comment styles and comments in between FALLTHRU comment and label

2016-09-30 Thread Jakub Jelinek
On Fri, Sep 30, 2016 at 10:10:55AM -0600, Martin Sebor wrote:
> I haven't been following the discussion very closely so I may have
> missed that what I'm about to suggest has been discussed and rejected
> for some valid reason, but if not let me try.
> 
> It seems to me that the ultimate, long term goal should be to have
> actively maintained code bases gradually migrate away from the
> various fallthrough comments and to the new attribute.  Under that
> premise, I think introducing a warning that's on the permissive end
> of the spectrum (say outside of -Wall, or even outside of -Wextra,
> and/or disabling the checker at the first sight of a comment) would
> obviate the concern of needlessly breaking working code and let
> users start adopting the warning on their own schedules.  The next
> major release of GCC after 7 could increase the sensitivity of the
> warning (e.g., by adding it to -Wextra, and/or checking for the words
> fall through in the comments, etc.), and the next one could make it
> even more strict.  With GCC's one year release cycle this approach
> would give even users who adopt the latest compiler two to three
> years to make the transition.

That is IMHO a bad idea, because almost nobody will really use it and the
warning option will bitrot.  The reason why we've added parsing of the most
common comments was exactly to be able to enable it already in -Wextra.
clang has it outside of -Wextra and (almost) nobody really started adopting
the attributes.

The option to have different levels of -Wimplicit-fallthrough= has been
proposed, that would allow projects to choose if they only allow attributes
to disable the warning, or also some very strict set of comments, or more
relaxed set of comments, or perhaps any comment before the label.
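
For reference, the two ways of marking an intentional fall-through that this thread contrasts look roughly like this. The function is a hypothetical example; [[fallthrough]] is the C++17 spelling of the attribute, and whether a given comment is recognized depends on the compiler's comment matching.

```cpp
// Hypothetical example: each case adds to the score and deliberately
// continues into the next one.
inline int
classify (int n)
{
  int score = 0;
  switch (n)
    {
    case 0:
      score += 1;
      /* FALLTHRU */       // comment form: relies on the compiler parsing it
    case 1:
      score += 10;
      [[fallthrough]];     // attribute form: explicit, no comment parsing
    case 2:
      score += 100;
      break;
    default:
      break;
    }
  return score;
}
```

The attribute survives preprocessing and reformatting, whereas the comment form depends on the comment text and on where it sits relative to the next label.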

Jakub


Move Per Bothner, Andrew Haley, and Tom Tromey to write-after approval after GCJ deletion

2016-09-30 Thread Andrew Haley
Pushed.

2016-09-30  Andrew Haley  

* MAINTAINERS: Move Per Bothner, Andrew Haley, and Tom Tromey to
write-after approval after GCJ deletion.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 240658)
+++ MAINTAINERS (working copy)
@@ -155,9 +155,6 @@
 c++Jason Merrill   
 c++Nathan Sidwell  
 go Ian Lance Taylor
-java   Per Bothner 
-java   Andrew Haley
-java   Tom Tromey  
 objective-c/c++Mike Stump  
 objective-c/c++Iain Sandoe 

@@ -352,6 +349,7 @@
 Andrea Bona
 Paolo Bonzini  
 Neil Booth 
+Per Bothner 
 Robert Bowdidge
 Joel Brobecker 
 Dave Brolley   
@@ -425,6 +423,7 @@
 Wei Guozhi 
 Mostafa Hagog  
 Olivier Hainque
+Andrew Haley   
 Stuart Hastings
 Michael Haubenwallner  

 Pat Haugen 
@@ -608,6 +607,7 @@
 Philipp Tomsich

 Konrad Trifunovic  
 Markus Trippelsdorf
+Tom Tromey  
 Martin Uecker  
 David Ung  
 Neil Vachharajani  


Re: [PATCH] fixincludes: fix fixincludes for MinGW

2016-09-30 Thread Jeff Law

On 09/30/2016 10:18 AM, Bruce Korb wrote:

Hi Tadek,

Looks good to me.  Thank you.
Clear to send (push).

Committed on behalf of Tadek.

Thanks,
jeff


Re: Move Per Bothner, Andrew Haley, and Tom Tromey to write-after approval after GCJ deletion

2016-09-30 Thread Rainer Orth
Hi Andrew,

> Pushed.
>
> 2016-09-30  Andrew Haley  
>
> * MAINTAINERS: Move Per Bothner, Andrew Haley, and Tom Tromey to
> write-after approval after GCJ deletion.

but both Per and Tom are still libcpp maintainers, so no need to add
them to the write-after-approval list.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Move Per Bothner, Andrew Haley, and Tom Tromey to write-after approval after GCJ deletion

2016-09-30 Thread Andrew Haley
On 30/09/16 17:38, Rainer Orth wrote:
> but both Per and Tom are still libcpp maintainers, so no need to add
> them to the write-after-approval list.

Ooh, I had no idea.  Will fix, thanks.

Andrew.



Re: C/C++ PATCH to implement -Wpointer-compare warning (PR c++/64767)

2016-09-30 Thread Marek Polacek
On Fri, Sep 23, 2016 at 10:31:33AM -0400, Jason Merrill wrote:
> On Fri, Sep 23, 2016 at 9:15 AM, Marek Polacek  wrote:
> > On Wed, Sep 21, 2016 at 03:52:09PM -0400, Jason Merrill wrote:
> >> On Mon, Sep 19, 2016 at 2:49 PM, Jason Merrill  wrote:
> >> > I suppose that an INTEGER_CST of character type is necessarily a
> >> > character constant, so adding a check for !char_type_p ought to do the
> >> > trick.
> >>
> >> Indeed it does.  I'm checking this in:
> >
> > Nice, thanks.  What about the original patch?  We still need to warn
> > (or error for C++11) for pointer comparisons.
> 
> If we still accept pointer comparisons in C++, that's another bug with
> treating \0 as a null pointer constant.  This seems to be because
> ocp_convert of \0 to int produces an INTEGER_CST indistinguishable
> from literal 0.

I was trying to fix this in ocp_convert, by using NOP_EXPRs, but that wasn't
successful.  But since we're interested in ==/!=, I think this can be fixed
easily in cp_build_binary_op.  Actually, all that seems to be needed is using
orig_op as the argument to null_ptr_cst_p, but that wouldn't give the correct
diagnostics, so I did this.  By checking orig_op we can see if the operands are
character literals or not, because orig_op is an operand before the default
conversions.
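
For readers following along, here is a small sketch of the user-visible side of this; it is not part of the patch, and the function names are made up for illustration. It assumes a C++11 compiler: a character literal has a character type rather than int, so it is not an integer literal and, per Core 903, should not act as a null pointer constant.

```cpp
#include <cassert>
#include <type_traits>

// Character literals have character types in C++ (they are int in C),
// so they are not integer literals.
static_assert (std::is_same<decltype ('\0'), char>::value, "char literal");
static_assert (std::is_same<decltype (L'\0'), wchar_t>::value, "wide literal");
static_assert (std::is_same<decltype (u'\0'), char16_t>::value, "UTF-16 literal");
static_assert (std::is_same<decltype (U'\0'), char32_t>::value, "UTF-32 literal");
static_assert (std::is_same<decltype (0), int>::value, "integer literal");

// Well-formed: the integer literal 0 is a null pointer constant.
bool is_null_via_zero (const int *p) { return p == 0; }

// Well-formed: nullptr is the preferred spelling since C++11.
bool is_null_via_nullptr (const int *p) { return p == nullptr; }

// Ill-formed under Core 903, and the case the new diagnostic is meant
// to catch:
//   bool bad (const int *p) { return p == '\0'; }
```

The commented-out comparison is the form exercised by the new nullptr37.C test.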

Curiously, nothing in the testsuite broke.

Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?

2016-09-30  Marek Polacek  

Core 903
* typeck.c (cp_build_binary_op) [EQ_EXPR]: Diagnose invalid pointer
conversions.

* g++.dg/cpp0x/nullptr37.C: New test.

diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 617ca55..2e6f44e 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -4584,6 +4584,14 @@ cp_build_binary_op (location_t location,
  else
result_type = type0;
 
+ if (null_ptr_cst_p (op1) && !null_ptr_cst_p (orig_op1))
+   {
+ if (complain & tf_error)
+   permerror (input_location, "ISO C++11 only allows pointer "
+  "conversions for integer literals");
+ else
+   return error_mark_node;
+   }
  warn_for_null_address (location, op0, complain);
}
   else if (((code1 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type1))
@@ -4598,6 +4606,14 @@ cp_build_binary_op (location_t location,
  else
result_type = type1;
 
+ if (null_ptr_cst_p (op0) && !null_ptr_cst_p (orig_op0))
+   {
+ if (complain & tf_error)
+   permerror (input_location, "ISO C++11 only allows pointer "
+  "conversions for integer literals");
+ else
+   return error_mark_node;
+   }
  warn_for_null_address (location, op1, complain);
}
   else if ((code0 == POINTER_TYPE && code1 == POINTER_TYPE)
diff --git gcc/testsuite/g++.dg/cpp0x/nullptr37.C gcc/testsuite/g++.dg/cpp0x/nullptr37.C
index e69de29..17c33d1 100644
--- gcc/testsuite/g++.dg/cpp0x/nullptr37.C
+++ gcc/testsuite/g++.dg/cpp0x/nullptr37.C
@@ -0,0 +1,78 @@
+/* PR c++/64767 */
+// { dg-do compile { target c++11 } }
+
+int
+f1 (int *p, int **q)
+{
+  int r = 0;
+
+  r += p == '\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += p == L'\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += p == u'\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += p == U'\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += p != '\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += p != L'\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += p != u'\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += p != U'\0'; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+
+  r += '\0' == p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += L'\0' == p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += u'\0' == p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += U'\0' == p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += '\0' != p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += L'\0' != p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += u'\0' != p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += U'\0' != p; /* { dg-error "ISO C\\+\\+11 only allows pointer conversions for integer literals" } */
+  r += q == '\0'; /* { dg-error "ISO C\\+\

Re: Fix missing FALLTHRU comments

2016-09-30 Thread Jan-Benedict Glaw
On Thu, 2016-09-29 18:01:46 +0200, Marek Polacek  wrote:
> My upcoming fix revealed more places that were missing a fall through marker.
> 
> Bootstrapped/regtested on x86_64-linux, ppc64-linux, and aarch64-linux-gnu with
> my fix, applying to trunk.
> 
> 2016-09-29  Marek Polacek  
> 
>   * rtti.c (involves_incomplete_p): Add fall through comment.
> 
>   * dwarf2out.c (loc_descriptor): Add fall through comment.
>   (add_const_value_attribute): Likewise.

Maybe gcc/config/alpha/predicates.md:184, too?

/* ... fall through ...  */

This showed up on my build robot recently.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Friends are relatives you make for yourself.
the second  :




Re: Fwd: [PATCH] gcc: Fix sysroot relative paths for MinGW

2016-09-30 Thread Jeff Law

On 09/29/2016 09:21 PM, Tadek Kijkowski wrote:

Can I have plain-text mode, please, gmail?

:-)  Only because you're asking nicely...




 # Directory in which the compiler finds libraries etc.
 libsubdir = $(libdir)/gcc/$(real_target_noncanonical)/$(version)$(accel_dir_suffix)
 # Directory in which the compiler finds executables
@@ -2751,14 +2763,14 @@
 PREPROCESSOR_DEFINES = \
   -DGCC_INCLUDE_DIR=\"$(libsubdir)/include\" \
   -DFIXED_INCLUDE_DIR=\"$(libsubdir)/include-fixed\" \
-  -DGPLUSPLUS_INCLUDE_DIR=\"$(gcc_gxx_include_dir)\" \
+  -DGPLUSPLUS_INCLUDE_DIR=\"$(call sysroot_relative_path,$(gcc_gxx_include_dir),$(filter-out 0,$(gcc_gxx_include_dir_add_sysroot)))\" \


So why the $(filter-out 0, )?

I'd really like to avoid being too clever here and write this code in the
most straightforward way possible.



Hmm... that's partially leftover from the abandoned idea to pass
@TARGET_SYSTEM_ROOT@ as second parameter of sysroot_relative_path.
Sysroot is prepended to GPLUSPLUS_INCLUDE_DIR in runtime only if
$(gcc_gxx_include_dir) is 1.
Since sysroot_relative_path checks for non-empty string the easiest
way was to use filter-out. But I agree this way it's confusing.

How about if I change the sysroot_relative_path function to explicitly
check for 1? But still - since $(if) checks for empty string, it will
have to use filter or filter-out.

I think with the improved comments you showed in V2, that'd be fine.




N.B. I'd prefer to use backticks over "$()", but it could clash if
some include paths already contain backtick expressions.
So the concern is we might use backticks to get an evaluation of 
something at build time.  Conceptually one could even create a pathname 
with literal backticks, but I suspect somewhere, somehow that's going to 
fail independent of your change.


Jeff


Re: Fix missing FALLTHRU comments

2016-09-30 Thread Marek Polacek
On Fri, Sep 30, 2016 at 06:49:17PM +0200, Jan-Benedict Glaw wrote:
> On Thu, 2016-09-29 18:01:46 +0200, Marek Polacek  wrote:
> > My upcoming fix revealed more places that were missing a fall through marker.
> > 
> > Bootstrapped/regtested on x86_64-linux, ppc64-linux, and aarch64-linux-gnu with
> > my fix, applying to trunk.
> > 
> > 2016-09-29  Marek Polacek  
> > 
> > * rtti.c (involves_incomplete_p): Add fall through comment.
> > 
> > * dwarf2out.c (loc_descriptor): Add fall through comment.
> > (add_const_value_attribute): Likewise.
> 
> Maybe gcc/config/alpha/predicates.md:184, too?
> 
> /* ... fall through ...  */
> 
> This showed up on my build robot recently.

Thanks, fixed:

2016-09-30  Marek Polacek  

* config/alpha/predicates.md: Adjust fall through comment.

diff --git gcc/config/alpha/predicates.md gcc/config/alpha/predicates.md
index 24fa3c2..ca14fad 100644
--- gcc/config/alpha/predicates.md
+++ gcc/config/alpha/predicates.md
@@ -181,7 +181,7 @@
 case SUBREG:
   if (register_operand (op, mode))
return 1;
-  /* ... fall through ...  */
+  /* fall through */
 case MEM:
   return ((TARGET_BWX || (mode != HImode && mode != QImode))
  && general_operand (op, mode));
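
For readers outside the thread, a minimal standalone sketch of the comment style the hunk above settles on. The function and its case values are hypothetical. It relies on GCC's -Wimplicit-fallthrough (GCC 7 and later) accepting a comment of this form, placed immediately before the next case label, as an explicit marker.

```cpp
#include <cassert>

// Hypothetical classifier: code 1 returns early for registers and
// otherwise shares the handling of code 2.
int classify (int code, bool is_register)
{
  switch (code)
    {
    case 1:
      if (is_register)
        return 1;
      /* fall through */
    case 2:
      return 2;
    default:
      return 0;
    }
}
```

Compiling this with -Wimplicit-fallthrough should stay quiet; deleting the comment is expected to produce a warning at the `case 2:` label.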


Marek


[Patch 0/11] Add support for _Float16 to AArch64

2016-09-30 Thread James Greenhalgh
Hi,

This patch set enables the _Float16 type specified in ISO/IEC TS 18661-3
for AArch64.

To do this, we first need to update the excess precision logic to support
possibly computing _Float16 values in 16-bit range and precision, and to
set the FLT_EVAL_METHOD macro to "16" as appropriate. That requires some
more expressiveness than we currently get with TARGET_FLT_EVAL_METHOD, so
we first need a new target hook - TARGET_C_EXCESS_PRECISION (patch 1/11),
which we implement in the i386, s390 and m68k back-ends in patches 2-4/11.

However, the meaning of the new value "16" for FLT_EVAL_METHOD is not
specified under C99/C11, and conforming code may have been written assuming
that the only possible values for FLT_EVAL_METHOD were negative values and
{ 0, 1, 2 }. In [patch 5/11] we work around that with a new option -
-fpermitted-flt-eval-methods=[c11|ts-18661-3]. This option will restrict
the compiler to only using the C99/C11 values for FLT_EVAL_METHOD, and is
set to -fpermitted-flt-eval-methods=c11 by default when in a standards
compliant mode like -std=c11.
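
As a rough illustration of why the new value matters to existing code, the sketch below classifies FLT_EVAL_METHOD values. The helper names are hypothetical, and the description given for 16 follows the wording of this cover letter rather than any released compiler.

```cpp
#include <cassert>
#include <cfloat>
#include <cstring>

// The value the current compiler reports for this translation unit.
int current_flt_eval_method () { return FLT_EVAL_METHOD; }

// True for the values code written against C99/C11 can expect:
// negative values and 0, 1, 2.
bool is_c11_flt_eval_method (int method) { return method <= 2; }

// Short, informal description of a FLT_EVAL_METHOD value.
const char *describe_flt_eval_method (int method)
{
  switch (method)
    {
    case -1: return "indeterminable";
    case 0:  return "evaluated in the range and precision of the type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "everything evaluated as long double";
    case 16: return "_Float16 evaluated in 16-bit range and precision";
    default: return "not specified by C99/C11";
    }
}
```

Code that switches over FLT_EVAL_METHOD with only the C99/C11 cases would land in the default branch for 16, which is the situation -fpermitted-flt-eval-methods=c11 is meant to avoid.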

Patch 6/11 does the work of rewriting the excess precision logic along
the guidelines given in https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00410.html

Patch 7/11 finishes the hookization of TARGET_FLT_EVAL_METHOD by poisoning
the old macro.

Patch 8/11 removes the restriction in targhooks.c:default_floatn_mode that
currently disables HFmode support. This patch is not strictly necessary,
and an alternative route to the same goal would override TARGET_FLOATN_MODE
in AArch64. However, now we've cleaned up the excess precision logic,
I don't believe there is any reason this patch would not be correct.

Patch 9/11 imports the soft-fp changes from
https://sourceware.org/ml/libc-alpha/2016-09/msg00310.html to libgcc.

Patch 10/11 enables these conversion routines for AArch64.

And finally patch 11/11 adds support for the various hooks that the AArch64
back-end requires to turn _Float16 support on.

The patch series as a whole passes a bootstrap and test cycle on x86_64
and AArch64. I've bootstrapped and tested each patch individually on x86_64.
All new tests added and enabled pass as appropriate.

The AArch64 enablement requires
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00268.html

OK?

Thanks,
James

ChangeLogs:

[Patch 1/11] Add a new target hook for describing excess precision intentions

  gcc/

  2016-09-30  James Greenhalgh  

* target.def (excess_precision): New hook.
* target.h (flt_eval_method): New.
(excess_precision_type): Likewise.
* targhooks.c (default_excess_precision): New.
* targhooks.h (default_excess_precision): New.
* doc/tm.texi.in (TARGET_EXCESS_PRECISION): New.
* doc/tm.texi: Regenerate.

[Patch 2/11] Implement TARGET_C_EXCESS_PRECISION for i386

  gcc/

  2016-09-30  James Greenhalgh  

* config/i386/i386.c (ix86_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

[Patch 3/11] Implement TARGET_C_EXCESS_PRECISION for s390

  gcc/

  2016-09-30  James Greenhalgh  

* config/s390/s390.c (s390_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

[Patch 4/11] Implement TARGET_C_EXCESS_PRECISION for m68k

  gcc/

  2016-09-30  James Greenhalgh  

* config/m68k/m68k.c (m68k_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

[Patch 5/11] Add -fpermitted-flt-eval-methods=[c11|ts-18661-3]

  gcc/c-family/

  2016-09-30  James Greenhalgh  

* c-opts.c (c_common_post_options): Add logic to handle the default
case for -fpermitted-flt-eval-methods.

  gcc/

  2016-09-30  James Greenhalgh  

* common.opt (fpermitted-flt-eval-methods): New.
* doc/invoke.texi (-fpermitted-flt-eval-methods): Document it.
* flag_types.h (permitted_flt_eval_methods): New.

  gcc/testsuite/

  2016-09-30  James Greenhalgh  

* gcc.dg/fpermitted-flt-eval-methods_1.c: New.
* gcc.dg/fpermitted-flt-eval-methods_2.c: New.

[Patch 6/11] Migrate excess precision logic to use TARGET_EXCESS_PRECISION

  gcc/

  2016-09-30  James Greenhalgh  

* toplev.c (init_excess_precision): Delete most logic.
* tree.c (excess_precision_type): Rewrite to use
TARGET_EXCESS_PRECISION.
* doc/invoke.texi (-fexcess-precision): Document behaviour in a
more generic fashion.

  gcc/c-family/

  2016-09-30  James Greenhalgh  

* c-common.c (excess_precision_mode_join): New.
(c_ts18661_flt_eval_method): New.
(c_c11_flt_eval_method): Likewise.
(c_flt_eval_method): Likewise.
* c-common.h (excess_precision_mode_join): New.
(c_flt_eval_method): Likewise.
* c-cppbuiltin.c (c_cpp_flt_eval_method_iec_559): New.
(cpp_iec_559_value): Call it.
(c_cpp_builtins): Modify logic for __LIBGCC_*_EXCESS_PRECISION__,
call c_flt_eval_method to set __FLT_EVAL_METHOD__ and
__FLT_EVAL_METHOD_C99__.

[Patch 7/1

[Patch 5/11] Add -fpermitted-flt-eval-methods=[c11|ts-18661-3]

2016-09-30 Thread James Greenhalgh

Hi,

This option is added to control which values of FLT_EVAL_METHOD the
compiler is allowed to set.

ISO/IEC TS 18661-3 defines new permissible values for
FLT_EVAL_METHOD that indicate that operations and constants with
a semantic type that is an interchange or extended format should be
evaluated to the precision and range of that type.  These new values are
a superset of those permitted under C99/C11, which does not specify the
meaning of other positive values of FLT_EVAL_METHOD.  As such, code
conforming to C11 may not have been written expecting the possibility of
the new values.

-fpermitted-flt-eval-methods specifies whether the compiler
should allow only the values of FLT_EVAL_METHOD specified in C99/C11,
or the extended set of values specified in ISO/IEC TS 18661-3.

The two possible values this option can take are "c11" or "ts-18661-3".

The default when in a standards compliant mode (-std=c11 or similar)
is -fpermitted-flt-eval-methods=c11.  The default when in a GNU
dialect (-std=gnu11 or similar) is -fpermitted-flt-eval-methods=ts-18661-3.
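
The defaults described above can be modelled in a few lines. This is only a sketch with hypothetical names, not the code in c-opts.c, and it assumes that an explicit option on the command line always takes priority over the dialect-based default.

```cpp
#include <cassert>

// Hypothetical stand-ins for the option's possible states.
enum permitted_methods
{
  PERMITTED_DEFAULT,   // option not given on the command line
  PERMITTED_C11,       // -fpermitted-flt-eval-methods=c11
  PERMITTED_TS_18661   // -fpermitted-flt-eval-methods=ts-18661-3
};

// Resolve the effective setting: an explicit request wins, otherwise
// strict ISO modes get c11 and GNU dialects get ts-18661-3.
permitted_methods
resolve_permitted_methods (permitted_methods requested, bool iso_mode)
{
  if (requested != PERMITTED_DEFAULT)
    return requested;
  return iso_mode ? PERMITTED_C11 : PERMITTED_TS_18661;
}
```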

I've added two testcases which test that when this option, or a C standards
dialect, would restrict the range of values to {-1, 0, 1, 2}, those are
the only values we see. At this stage in the patch series this trivially
holds for all targets.

Bootstrapped on x86_64 with no issues and tested in series on AArch64.

OK?

Thanks,
James

---
gcc/c-family/

2016-09-30  James Greenhalgh  

* c-opts.c (c_common_post_options): Add logic to handle the default
case for -fpermitted-flt-eval-methods.

gcc/

2016-09-30  James Greenhalgh  

* common.opt (fpermitted-flt-eval-methods): New.
* doc/invoke.texi (-fpermitted-flt-eval-methods): Document it.
* flag_types.h (permitted_flt_eval_methods): New.

gcc/testsuite/

2016-09-30  James Greenhalgh  

* gcc.dg/fpermitted-flt-eval-methods_1.c: New.
* gcc.dg/fpermitted-flt-eval-methods_2.c: New.

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index c5a699d..af8d7fe 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -789,6 +789,18 @@ c_common_post_options (const char **pfilename)
   && flag_unsafe_math_optimizations == 0)
 flag_fp_contract_mode = FP_CONTRACT_OFF;
 
+  /* If we are compiling C, and we are outside of a standards mode,
+ we can permit the new values from ISO/IEC TS 18661-3 for
+ FLT_EVAL_METHOD.  Otherwise, we must restrict the possible values to
+ the set specified in ISO C99/C11.  */
+  if (!flag_iso
+  && !c_dialect_cxx ()
+  && (global_options_set.x_flag_permitted_flt_eval_methods
+	  == PERMITTED_FLT_EVAL_METHODS_DEFAULT))
+flag_permitted_flt_eval_methods = PERMITTED_FLT_EVAL_METHODS_TS_18661;
+  else
+flag_permitted_flt_eval_methods = PERMITTED_FLT_EVAL_METHODS_C11;
+
   /* By default we use C99 inline semantics in GNU99 or C99 mode.  C99
  inline semantics are not supported in GNU89 or C89 mode.  */
   if (flag_gnu89_inline == -1)
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e01577..3a22aa0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1305,6 +1305,21 @@ Enum(excess_precision) String(fast) Value(EXCESS_PRECISION_FAST)
 EnumValue
 Enum(excess_precision) String(standard) Value(EXCESS_PRECISION_STANDARD)
 
+; Whether we permit the extended set of values for FLT_EVAL_METHOD
+; introduced in ISO/IEC TS 18661-3, or limit ourselves to those in C99/C11.
+fpermitted-flt-eval-methods=
+Common Joined RejectNegative Enum(permitted_flt_eval_methods) Var(flag_permitted_flt_eval_methods) Init(PERMITTED_FLT_EVAL_METHODS_DEFAULT)
+-fpermitted-flt-eval-methods=[c11|ts-18661-3]	Specify which values of FLT_EVAL_METHOD are permitted.
+
+Enum
+Name(permitted_flt_eval_methods) Type(enum permitted_flt_eval_methods) UnknownError(unknown specification for the set of FLT_EVAL_METHOD values to permit %qs)
+
+EnumValue
+Enum(permitted_flt_eval_methods) String(c11) Value(PERMITTED_FLT_EVAL_METHODS_C11)
+
+EnumValue
+Enum(permitted_flt_eval_methods) String(ts-18661-3) Value(PERMITTED_FLT_EVAL_METHODS_TS_18661)
+
 ffast-math
 Common Optimization
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8a84e4f..9cb0b54 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -375,7 +375,8 @@ Objective-C and Objective-C++ Dialects}.
 -flto-partition=@var{alg} -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
 -fmove-loop-invariants -fno-branch-count-reg @gol
--fno-defer-pop -fno-fp-int-builtin-inexact -fno-function-cse @gol
+-fno-defer-pop -fno-fp-int-builtin-inexact @gol
+-fpermitted-flt-eval-methods=@var{standard} -fno-function-cse @gol
 -fno-guess-branch-probability -fno-inline -fno-math-errno -fno-peephole @gol
 -fno-peephole2 -fno-sched-interblock -fno-sched-spec -fno-signed-zeros @gol
 -fno-toplevel-reorder -fno-trapping-math -fno-zero-initialized-in-bss @gol
@@ -8917,6 +8918,30 @@ Even if @option{-fno-fp-int-builtin-inexact} is used, if the fun

[Patch 8/11] Make _Float16 available if HFmode is available

2016-09-30 Thread James Greenhalgh

Hi,

Now that we've worked on -fexcess-precision, the comment in targhooks.c
no longer holds. We can now permit _Float16 on any target which provides
HFmode and supports HFmode in libgcc.

Bootstrapped and tested on x86-64, and in series on AArch64.

OK?

Thanks,
James

---
2016-09-30  James Greenhalgh  

* targhooks.c (default_floatn_mode): Enable _Float16 if a target
provides HFmode.

diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 08d0b35..bf94b2a 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -513,10 +513,12 @@ default_floatn_mode (int n, bool extended)
   switch (n)
 	{
 	case 16:
-	  /* We do not use HFmode for _Float16 by default because the
-	 required excess precision support is not present and the
-	 interactions with promotion of the older __fp16 need to
-	 be worked out.  */
+	  /* Always enable _Float16 if we have basic support for the mode.
+	 Targets can control the range and precision of operations on
+	 the _Float16 type using TARGET_C_EXCESS_PRECISION.  */
+#ifdef HAVE_HFmode
+	  cand = HFmode;
+#endif
 	  break;
 
 	case 32:


[Patch 1/11] Add a new target hook for describing excess precision intentions

2016-09-30 Thread James Greenhalgh

Hi,

This patch introduces TARGET_C_EXCESS_PRECISION. This hook takes a tri-state
argument, one of EXCESS_PRECISION_TYPE_IMPLICIT,
EXCESS_PRECISION_TYPE_STANDARD, or EXCESS_PRECISION_TYPE_FAST. These relate,
respectively, to the implicit extra precision added by the target, the excess
precision that should be guaranteed for -fexcess-precision=standard, and the
excess precision that should be added for performance under
-fexcess-precision=fast.
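
To make the shape of the hook concrete, here is a sketch for an imaginary target; it is not taken from any real back-end. The enumerations mirror the coretypes.h hunk of this patch, while example_excess_precision and the behaviour it encodes (no implicit excess precision, promotion of _Float16 to float for standard, native 16-bit evaluation for fast) are assumptions made purely for illustration.

```cpp
#include <cassert>

// Enumerations as introduced by the coretypes.h hunk of this patch.
enum flt_eval_method
{
  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
};

enum excess_precision_type
{
  EXCESS_PRECISION_TYPE_IMPLICIT,
  EXCESS_PRECISION_TYPE_STANDARD,
  EXCESS_PRECISION_TYPE_FAST
};

// Hypothetical hook implementation for an imaginary target.
enum flt_eval_method
example_excess_precision (enum excess_precision_type type)
{
  switch (type)
    {
    case EXCESS_PRECISION_TYPE_IMPLICIT:
      /* The hardware adds nothing behind the compiler's back.  */
      return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
    case EXCESS_PRECISION_TYPE_STANDARD:
      /* Guarantee float range and precision for _Float16 arithmetic.  */
      return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
    case EXCESS_PRECISION_TYPE_FAST:
      /* Stay in 16 bits for speed.  */
      return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
    }
  return FLT_EVAL_METHOD_UNPREDICTABLE;
}
```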

Bootstrapped and tested in sequence with the other patches in this series
on AArch64, and as a standalone patch on x86_64.

OK?

Thanks
James

---
gcc/

2016-09-30  James Greenhalgh  

* target.def (excess_precision): New hook.
* target.h (flt_eval_method): New.
(excess_precision_type): Likewise.
* targhooks.c (default_excess_precision): New.
* targhooks.h (default_excess_precision): New.
* doc/tm.texi.in (TARGET_EXCESS_PRECISION): New.
* doc/tm.texi: Regenerate.
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index fe1e984..26b0fa3 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -359,6 +359,24 @@ enum memmodel
   MEMMODEL_SYNC_SEQ_CST = MEMMODEL_SEQ_CST | MEMMODEL_SYNC
 };
 
+/* enums used by the targetm.excess_precision hook.  */
+
+enum flt_eval_method
+{
+  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
+  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
+  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
+  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
+  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
+};
+
+enum excess_precision_type
+{
+  EXCESS_PRECISION_TYPE_IMPLICIT,
+  EXCESS_PRECISION_TYPE_STANDARD,
+  EXCESS_PRECISION_TYPE_FAST
+};
+
 /* Support for user-provided GGC and PCH markers.  The first parameter
is a pointer to a pointer, the second a cookie.  */
 typedef void (*gt_pointer_operator) (void *, void *);
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8a98ba4..0bdae58 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -947,6 +947,10 @@ sign-extend the result to 64 bits.  On such machines, set
 Do not define this macro if it would never modify @var{m}.
 @end defmac
 
+@deftypefn {Target Hook} {enum flt_eval_method} TARGET_C_EXCESS_PRECISION (enum excess_precision_type @var{type})
+Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that describes which excess precision should be applied.  @var{type} is either @code{EXCESS_PRECISION_TYPE_IMPLICIT}, @code{EXCESS_PRECISION_TYPE_FAST}, or @code{EXCESS_PRECISION_TYPE_STANDARD}.  For @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which precision and range operations will be implicitly evaluated in regardless of the excess precision explicitly added.  For @code{EXCESS_PRECISION_TYPE_STANDARD} and @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the explicit excess precision that should be added depending on the value set for @code{-fexcess-precision=[standard|fast]}.
+@end deftypefn
+
 @deftypefn {Target Hook} machine_mode TARGET_PROMOTE_FUNCTION_MODE (const_tree @var{type}, machine_mode @var{mode}, int *@var{punsignedp}, const_tree @var{funtype}, int @var{for_return})
 Like @code{PROMOTE_MODE}, but it is applied to outgoing function arguments or
 function return values.  The target hook should return the new mode
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f1cfc86..7c7af33 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -921,6 +921,8 @@ sign-extend the result to 64 bits.  On such machines, set
 Do not define this macro if it would never modify @var{m}.
 @end defmac
 
+@hook TARGET_C_EXCESS_PRECISION
+
 @hook TARGET_PROMOTE_FUNCTION_MODE
 
 @defmac PARM_BOUNDARY
diff --git a/gcc/target.def b/gcc/target.def
index 83373a5..1542692 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5402,6 +5402,23 @@ DEFHOOK_UNDOC
  machine_mode, (char c),
  default_mode_for_suffix)
 
+DEFHOOK
+(excess_precision,
+ "Return a value, with the same meaning as @code{FLT_EVAL_METHOD} C that\
+ describes which excess precision should be applied.  @var{type} is\
+ either @code{EXCESS_PRECISION_TYPE_IMPLICIT},\
+ @code{EXCESS_PRECISION_TYPE_FAST}, or\
+ @code{EXCESS_PRECISION_TYPE_STANDARD}.  For\
+ @code{EXCESS_PRECISION_TYPE_IMPLICIT}, the target should return which\
+ precision and range operations will be implicitly evaluated in regardless\
+ of the excess precision explicitly added.  For\
+ @code{EXCESS_PRECISION_TYPE_STANDARD} and\
+ @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the\
+ explicit excess precision that should be added depending on the\
+ value set for @code{-fexcess-precision=[standard|fast]}.",
+ enum flt_eval_method, (enum excess_precision_type type),
+ default_excess_precision)
+
 HOOK_VECTOR_END (c)
 
 /* Functions specific to the C++ frontend.  */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index d75650f..08d0b35 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -2110,4 +2110,12 @@ default_max_noce_ifcvt_seq_cost (edge e)
 return BRANCH_COST (true, predictable_p) * COSTS_N_INSNS (3);
 }
 
+/* Default i

[Patch 6/11] Migrate excess precision logic to use TARGET_EXCESS_PRECISION

2016-09-30 Thread James Greenhalgh

Hi,

This patch moves the logic for excess precision from using the
TARGET_FLT_EVAL_METHOD macro to the TARGET_EXCESS_PRECISION hook
introduced earlier in the patch series.

These logic changes follow Joseph's comments at
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00410.html

Briefly, we have four things to change.

  1) The logic in tree.c::excess_precision_type.
  Here we want to ask the target which excess precision it would like for
  whichever of -fexcess-precision=standard or -fexcess-precision=fast is
  in use, then apply that.

  2) The logic in c-family/c-cppbuiltin.c::c_cpp_flt_eval_method_iec_559.
  We want to update this to ensure that the target claims the same excess
  precision to be implicitly added to operations that it reports in
  -fexcess-precision=standard mode. We take the join of these two reported
  values, and only if the join is equal to the excess precision requested
  for -fexcess-precision=standard can we set the IEC_559 macro.

  3) The logic in c-family/c-cppbuiltin.c::c_cpp_builtins for setting
  __FLT_EVAL_METHOD__.
  This is now a little more complicated, and makes use of
  -fpermitted-flt-eval-methods from patch 5.

  4) The logic in c-family/c-cppbuiltin.c::c_cpp_builtins for setting
  __LIBGCC_*_EXCESS_PRECISION__.
  This can just be the implicit precision reported by the target.
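
Item 2 can be sketched on its own as follows. mode_join paraphrases excess_precision_mode_join from this patch; may_define_iec_559 is a hypothetical helper name that only captures the comparison described above, not the full c_cpp_flt_eval_method_iec_559 logic.

```cpp
#include <algorithm>
#include <cassert>

// Values as introduced in patch 1/11.
enum flt_eval_method
{
  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
};

// Wider of two evaluation methods.  16 is numerically largest but is the
// narrowest format, so it is handled before taking the maximum.
flt_eval_method mode_join (flt_eval_method x, flt_eval_method y)
{
  if (x == FLT_EVAL_METHOD_UNPREDICTABLE
      || y == FLT_EVAL_METHOD_UNPREDICTABLE)
    return FLT_EVAL_METHOD_UNPREDICTABLE;
  if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
    return y;
  if (y == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
    return x;
  return std::max (x, y);
}

// The IEC_559 macro may be set only if the target's implicit precision
// does not exceed what it reports for -fexcess-precision=standard.
bool may_define_iec_559 (flt_eval_method implicit, flt_eval_method standard)
{
  return mode_join (implicit, standard) == standard;
}
```

With these definitions, a target whose implicit evaluation is long double but which reports float for standard mode would not qualify.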

Having moved the logic into those areas, we can simplify
toplev.c::init_excess_precision, which now only retains the assert that
-fexcess-precision=default has been rewritten by the language front-end, and
the assignment from the command-line variable to the internal variable.

The documentation in invoke.texi is not quite right for the impact of
-fexcess-precision, so I've rewritten the text to read a little more
generically.

Bootstrapped on x86_64 and aarch64 with no issues.

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* toplev.c (init_excess_precision): Delete most logic.
* tree.c (excess_precision_type): Rewrite to use
TARGET_EXCESS_PRECISION.
* doc/invoke.texi (-fexcess-precision): Document behaviour in a
more generic fashion.

gcc/c-family/

2016-09-30  James Greenhalgh  

* c-common.c (excess_precision_mode_join): New.
(c_ts18661_flt_eval_method): New.
(c_c11_flt_eval_method): Likewise.
(c_flt_eval_method): Likewise.
* c-common.h (excess_precision_mode_join): New.
(c_flt_eval_method): Likewise.
* c-cppbuiltin.c (c_cpp_flt_eval_method_iec_559): New.
(cpp_iec_559_value): Call it.
(c_cpp_builtins): Modify logic for __LIBGCC_*_EXCESS_PRECISION__,
call c_flt_eval_method to set __FLT_EVAL_METHOD__ and
__FLT_EVAL_METHOD_C99__.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 2652259..983f71a 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -13145,4 +13145,83 @@ diagnose_mismatched_attributes (tree olddecl, tree newdecl)
   return warned;
 }
 
+/* Return the lattice point which is the wider of the two FLT_EVAL_METHOD
+   modes X, Y.  This isn't just >, as the FLT_EVAL_METHOD values added
+   by C TS 18661-3 for interchange types that are computed in their
+   native precision are larger than the C11 values for evaluating in the
+   precision of float/double/long double.  If either mode is
+   FLT_EVAL_METHOD_UNPREDICTABLE, return that.  */
+
+enum flt_eval_method
+excess_precision_mode_join (enum flt_eval_method x,
+			enum flt_eval_method y)
+{
+  if (x == FLT_EVAL_METHOD_UNPREDICTABLE
+  || y == FLT_EVAL_METHOD_UNPREDICTABLE)
+return FLT_EVAL_METHOD_UNPREDICTABLE;
+
+  /* GCC only supports one interchange type right now, _Float16.  If
+ we're evaluating _Float16 in 16-bit precision, then flt_eval_method
+ will be FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16.  */
+  if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
+return y;
+  if (y == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
+return x;
+
+  /* Other values for flt_eval_method are directly comparable, and we want
+ the maximum.  */
+  return MAX (x, y);
+}
+
+/* Return the value that should be set for FLT_EVAL_METHOD in the
+   context of ISO/IEC TS 18661-3.
+
+   This should relate to the effective excess precision seen by the user,
+   which is the join point of the precision the target requests for
+   -fexcess-precision={standard,fast} and the implicit excess precision
+   the target uses.  */
+
+static enum flt_eval_method
+c_ts18661_flt_eval_method (void)
+{
+  enum flt_eval_method implicit
+= targetm.c.excess_precision (EXCESS_PRECISION_TYPE_IMPLICIT);
+
+  enum excess_precision_type flag_type
+= (flag_excess_precision_cmdline == EXCESS_PRECISION_STANDARD
+   ? EXCESS_PRECISION_TYPE_STANDARD
+   : EXCESS_PRECISION_TYPE_FAST);
+
+  enum flt_eval_method requested
+= targetm.c.excess_precision (flag_type);
+
+  return excess_precision_mode_join (implicit, requested);
+}
+
+/* As c_cpp_ts18661_flt_eval_method, but clamps the expec

[Patch 7/11] Delete TARGET_FLT_EVAL_METHOD and poison it.

2016-09-30 Thread James Greenhalgh

Hi,

We've removed all uses of TARGET_FLT_EVAL_METHOD, so we can remove it
and poison it.
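
For anyone unfamiliar with poisoning, a minimal sketch of the mechanism follows. The identifier and the function are hypothetical, and the example assumes a compiler that implements `#pragma GCC poison` (GCC does; this is the directive system.h relies on).

```cpp
#include <cassert>

// Hypothetical replacement for a retired configuration macro.
int example_eval_method_hook () { return 0; }

// From this point on, any use of the poisoned identifier is a hard error,
// so a stale reference in a back-end fails to compile instead of being
// silently ignored.
#pragma GCC poison EXAMPLE_OLD_EVAL_METHOD

// Referring to the poisoned name anywhere below this line, for example
// in an #ifdef, would be rejected by the preprocessor.
```

This is why the old macro has to be deleted from every back-end header before it can be poisoned.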

Bootstrapped and tested on x86-64 and AArch64. Tested on s390 and m68k
to the best of my ability (no execute tests).

OK?

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* config/s390/s390.h (TARGET_FLT_EVAL_METHOD): Delete.
* config/m68k/m68k.h (TARGET_FLT_EVAL_METHOD): Delete.
* config/i386/i386.h (TARGET_FLT_EVAL_METHOD): Delete.
* defaults.h (TARGET_FLT_EVAL_METHOD): Delete.
* doc/tm.texi.in (TARGET_FLT_EVAL_METHOD): Delete.
* doc/tm.texi: Regenerate.
* system.h (TARGET_FLT_EVAL_METHOD): Poison.

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8751143..b5e4d61 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -690,17 +690,6 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
   SUBTARGET_EXTRA_SPECS
 
 
-/* Set the value of FLT_EVAL_METHOD in float.h.  When using only the
-   FPU, assume that the fpcw is set to extended precision; when using
-   only SSE, rounding is correct; when using both SSE and the FPU,
-   the rounding precision is indeterminate, since either may be chosen
-   apparently at random.  */
-#define TARGET_FLT_EVAL_METHOD		\
-  (TARGET_80387\
-   ? (TARGET_MIX_SSE_I387 ? -1		\
-  : (TARGET_SSE_MATH ? (TARGET_SSE2 ? 0 : -1) : 2))			\
-   : 0)
-
 /* Whether to allow x87 floating-point arithmetic on MODE (one of
SFmode, DFmode and XFmode) in the current excess precision
configuration.  */
diff --git a/gcc/config/m68k/m68k.h b/gcc/config/m68k/m68k.h
index 2aa858f..2021e9d 100644
--- a/gcc/config/m68k/m68k.h
+++ b/gcc/config/m68k/m68k.h
@@ -281,11 +281,6 @@ along with GCC; see the file COPYING3.  If not see
 #define LONG_DOUBLE_TYPE_SIZE			\
   ((TARGET_COLDFIRE || TARGET_FIDOA) ? 64 : 80)
 
-/* Set the value of FLT_EVAL_METHOD in float.h.  When using 68040 fp
-   instructions, we get proper intermediate rounding, otherwise we
-   get extended precision results.  */
-#define TARGET_FLT_EVAL_METHOD ((TARGET_68040 || ! TARGET_68881) ? 0 : 2)
-
 #define BITS_BIG_ENDIAN 1
 #define BYTES_BIG_ENDIAN 1
 #define WORDS_BIG_ENDIAN 1
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 3a7be1a..1a2c150 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -247,11 +247,6 @@ extern const char *s390_host_detect_local_cpu (int argc, const char **argv);
 #define S390_TDC_INFINITY (S390_TDC_POSITIVE_INFINITY \
 			  | S390_TDC_NEGATIVE_INFINITY )
 
-/* This is used by float.h to define the float_t and double_t data
-   types.  For historical reasons both are double on s390 what cannot
-   be changed anymore.  */
-#define TARGET_FLT_EVAL_METHOD 1
-
 /* Target machine storage layout.  */
 
 /* Everything is big-endian.  */
diff --git a/gcc/defaults.h b/gcc/defaults.h
index c62c844..210a3c5 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -963,9 +963,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define REG_WORDS_BIG_ENDIAN WORDS_BIG_ENDIAN
 #endif
 
-#ifndef TARGET_FLT_EVAL_METHOD
-#define TARGET_FLT_EVAL_METHOD 0
-#endif
 
 #ifndef TARGET_DEC_EVAL_METHOD
 #define TARGET_DEC_EVAL_METHOD 2
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 0bdae58..9c10ea8 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1566,13 +1566,6 @@ uses this macro should also arrange to use @file{t-gnu-prefix} in
 the libgcc @file{config.host}.
 @end defmac
 
-@defmac TARGET_FLT_EVAL_METHOD
-A C expression for the value for @code{FLT_EVAL_METHOD} in @file{float.h},
-assuming, if applicable, that the floating-point control word is in its
-default state.  If you do not define this macro the value of
-@code{FLT_EVAL_METHOD} will be zero.
-@end defmac
-
 @defmac WIDEST_HARDWARE_FP_SIZE
 A C expression for the size in bits of the widest floating-point format
 supported by the hardware.  If you define this macro, you must specify a
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 7c7af33..0a4d19b 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1402,13 +1402,6 @@ uses this macro should also arrange to use @file{t-gnu-prefix} in
 the libgcc @file{config.host}.
 @end defmac
 
-@defmac TARGET_FLT_EVAL_METHOD
-A C expression for the value for @code{FLT_EVAL_METHOD} in @file{float.h},
-assuming, if applicable, that the floating-point control word is in its
-default state.  If you do not define this macro the value of
-@code{FLT_EVAL_METHOD} will be zero.
-@end defmac
-
 @defmac WIDEST_HARDWARE_FP_SIZE
 A C expression for the size in bits of the widest floating-point format
 supported by the hardware.  If you define this macro, you must specify a
diff --git a/gcc/system.h b/gcc/system.h
index cc353f5..df7a398 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -896,7 +896,7 @@ extern void fancy_abort (const char *, int, const char *) ATTRIBUTE_NORETURN;
 	ASM_BYTE_OP MEMBER_TYPE_FORCES_BLK LIBGCC2_HAS_SF_MODE		\
 	LIBG

Re: [PATCH][v4] GIMPLE store merging pass

2016-09-30 Thread Kyrill Tkachov

Hi Richard,

On 29/09/16 11:45, Richard Biener wrote:


+  gimple_seq seq = NULL;
+  unsigned int num_stmts = 0;
+  tree offset_type = get_type_for_merged_store (group);
+  tree last_vdef, new_vuse;
+  last_vdef = gimple_vdef (group->last_stmt);
+  new_vuse = gimple_vuse (group->last_stmt);
+  location_t loc = get_merged_store_location (group);
If you end up splitting the store then please use a location appropriate
for the split part.  Likewise for the alias type.



How would I get the appropriate alias type?
Is there some way to construct it from the alias type of the base
object offset by some number of bytes?


Thanks,
Kyrill


[Patch 2/11] Implement TARGET_C_EXCESS_PRECISION for i386

2016-09-30 Thread James Greenhalgh

Hi,

This patch ports the logic from i386's TARGET_FLT_EVAL_METHOD to the new
target hook TARGET_C_EXCESS_PRECISION.
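
As a sanity check on the port, the following is a minimal standalone sketch
(not GCC code) that compares the removed TARGET_FLT_EVAL_METHOD expression
with the STANDARD/IMPLICIT branch of the new hook over every flag
combination.  The struct x86_flags type and the function names are
hypothetical, and the numeric enum values are assumed to mirror the C
FLT_EVAL_METHOD values.

```c
/* Assumed numeric values for the enum used by the new hook.  */
enum flt_eval_method
{
  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2
};

/* Hypothetical stand-in for the target flags the real code tests.  */
struct x86_flags
{
  int x87;       /* models TARGET_80387 */
  int mix;       /* models TARGET_MIX_SSE_I387 */
  int sse_math;  /* models TARGET_SSE_MATH */
  int sse2;      /* models TARGET_SSE2 */
};

/* The nested conditional from the removed macro.  */
int
old_macro_value (struct x86_flags f)
{
  return f.x87 ? (f.mix ? -1 : (f.sse_math ? (f.sse2 ? 0 : -1) : 2)) : 0;
}

/* The if-chain used for EXCESS_PRECISION_TYPE_STANDARD and
   EXCESS_PRECISION_TYPE_IMPLICIT in the new hook.  */
enum flt_eval_method
new_hook_value (struct x86_flags f)
{
  if (!f.x87)
    return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
  else if (f.mix)
    return FLT_EVAL_METHOD_UNPREDICTABLE;
  else if (!f.sse_math)
    return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
  else if (f.sse2)
    return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
  else
    return FLT_EVAL_METHOD_UNPREDICTABLE;
}

/* Count the flag combinations on which the two forms disagree.  */
int
count_mismatches (void)
{
  int mismatches = 0;
  for (int bits = 0; bits < 16; bits++)
    {
      struct x86_flags f = { bits & 1, (bits >> 1) & 1,
			     (bits >> 2) & 1, (bits >> 3) & 1 };
      if (old_macro_value (f) != (int) new_hook_value (f))
	mismatches++;
    }
  return mismatches;
}
```

The EXCESS_PRECISION_TYPE_FAST case has no counterpart in the old macro, so
it is left out of the comparison.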

Bootstrapped and tested with no issues.

OK?

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* config/i386/i386.c (ix86_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e0b2d57..3c801c2 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -50359,6 +50359,45 @@ ix86_addr_space_zero_address_valid (addr_space_t as)
 {
   return as != ADDR_SPACE_GENERIC;
 }
+
+/* Set the value of FLT_EVAL_METHOD in float.h.  When using only the
+   FPU, assume that the fpcw is set to extended precision; when using
+   only SSE, rounding is correct; when using both SSE and the FPU,
+   the rounding precision is indeterminate, since either may be chosen
+   apparently at random.  */
+
+static enum flt_eval_method
+ix86_excess_precision (enum excess_precision_type type)
+{
+  switch (type)
+{
+  case EXCESS_PRECISION_TYPE_FAST:
+	/* The fastest type to promote to will always be the native type,
+	   whether that occurs with implicit excess precision or
+	   otherwise.  */
+	return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+  case EXCESS_PRECISION_TYPE_STANDARD:
+  case EXCESS_PRECISION_TYPE_IMPLICIT:
+	/* Otherwise, the excess precision we want when we are
+	   in a standards compliant mode, and the implicit precision we
+	   provide can be identical.  */
+	if (!TARGET_80387)
+	  return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+	else if (TARGET_MIX_SSE_I387)
+	  return FLT_EVAL_METHOD_UNPREDICTABLE;
+	else if (!TARGET_SSE_MATH)
+	  return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
+	else if (TARGET_SSE2)
+	  return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+	else
+	  return FLT_EVAL_METHOD_UNPREDICTABLE;
+  default:
+	gcc_unreachable ();
+}
+
+  return FLT_EVAL_METHOD_UNPREDICTABLE;
+}
+
 #undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
 #define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID ix86_addr_space_zero_address_valid
 
@@ -50563,6 +50602,8 @@ ix86_addr_space_zero_address_valid (addr_space_t as)
 #undef TARGET_MD_ASM_ADJUST
 #define TARGET_MD_ASM_ADJUST ix86_md_asm_adjust
 
+#undef TARGET_C_EXCESS_PRECISION
+#define TARGET_C_EXCESS_PRECISION ix86_excess_precision
 #undef TARGET_PROMOTE_PROTOTYPES
 #define TARGET_PROMOTE_PROTOTYPES hook_bool_const_tree_true
 #undef TARGET_SETUP_INCOMING_VARARGS


[Patch 3/11] Implement TARGET_C_EXCESS_PRECISION for s390

2016-09-30 Thread James Greenhalgh

Hi,

This patch ports the logic from s390's TARGET_FLT_EVAL_METHOD to the new
target hook TARGET_C_EXCESS_PRECISION.

Patch tested by building an s390-none-linux toolchain and running
s390.exp (without the ability to execute) with no regressions, and manually
inspecting the output assembly code when compiling
testsuite/gcc.target/i386/excess-precision* to show no difference in
code-generation.

OK?

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* config/s390/s390.c (s390_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 3bdb648..b704d46 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -15106,6 +15106,34 @@ s390_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1, const_tree ty
   return NULL;
 }
 
+/* Implement TARGET_C_EXCESS_PRECISION.
+
+   This is used by float.h to define the float_t and double_t data
+   types.  For historical reasons both are double on s390, which cannot
+   be changed anymore.  */
+
+static enum flt_eval_method
+s390_excess_precision (enum excess_precision_type type)
+{
+  switch (type)
+{
+  case EXCESS_PRECISION_TYPE_FAST:
+	/* The fastest type to promote to will always be the native type,
+	   whether that occurs with implicit excess precision or
+	   otherwise.  */
+	return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+  case EXCESS_PRECISION_TYPE_STANDARD:
+  case EXCESS_PRECISION_TYPE_IMPLICIT:
+	/* Otherwise, the excess precision we want when we are
+	   in a standards compliant mode, and the implicit precision we
+	   provide can be identical.  */
+	return FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE;
+  default:
+	gcc_unreachable ();
+}
+  return FLT_EVAL_METHOD_UNPREDICTABLE;
+}
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -15161,6 +15189,9 @@ s390_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1, const_tree ty
 #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
 #define TARGET_ASM_CAN_OUTPUT_MI_THUNK hook_bool_const_tree_hwi_hwi_const_tree_true
 
+#undef TARGET_C_EXCESS_PRECISION
+#define TARGET_C_EXCESS_PRECISION s390_excess_precision
+
 #undef  TARGET_SCHED_ADJUST_PRIORITY
 #define TARGET_SCHED_ADJUST_PRIORITY s390_adjust_priority
 #undef TARGET_SCHED_ISSUE_RATE


Re: [PATCH][v4] GIMPLE store merging pass

2016-09-30 Thread Richard Biener
On September 30, 2016 4:43:10 PM GMT+02:00, Kyrill Tkachov wrote:
>
>On 30/09/16 15:36, Kyrill Tkachov wrote:
>> Hi Richard,
>>
>> On 29/09/16 11:45, Richard Biener wrote:
>>>
 +
 +  /* In some cases get_inner_reference may return a
 + MEM_REF [ptr + byteoffset].  For the purposes of this pass
 + canonicalize the base_addr to MEM_REF [ptr] and take
 + byteoffset into account in the bitpos.  This occurs in
 + PR 23684 and this way we can catch more chains.  */
 +  if (TREE_CODE (base_addr) == MEM_REF
 +  && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (base_addr, 0)))
 +  && TREE_CODE (TREE_OPERAND (base_addr, 1)) == INTEGER_CST
>>> This is always an INTEGER_CST.
>>>
 +  && tree_fits_shwi_p (TREE_OPERAND (base_addr, 1))
>>> This will never allow negative offsets (but maybe this is a good thing?)
>>>
>>> )
 +{
 +  bitpos += tree_to_shwi (TREE_OPERAND (base_addr, 1))
 +* BITS_PER_UNIT;
>>> this multiplication may overflow.  There is mem_ref_offset () which
>>> you should really use here, see get_inner_reference itself (and
>>> how to translate back from offset_int to HOST_WIDE_INT if it fits).
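
To illustrate the overflow concern in plain terms, here is a minimal
standalone sketch that converts a byte offset to bits and adds it to a bit
position only when neither step can overflow.  It deliberately does not use
GCC's mem_ref_offset or offset_int; the function name and the use of
long long in place of HOST_WIDE_INT are assumptions made for illustration.

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_UNIT 8  /* assumed: 8-bit bytes */

/* Add BYTE_OFFSET * BITS_PER_UNIT to *BITPOS.  Return false and leave
   *BITPOS untouched if the multiplication or the addition would
   overflow.  */
bool
add_byte_offset_to_bitpos (long long byte_offset, long long *bitpos)
{
  /* Reject offsets whose bit count does not fit.  */
  if (byte_offset > LLONG_MAX / BITS_PER_UNIT
      || byte_offset < LLONG_MIN / BITS_PER_UNIT)
    return false;

  long long bits = byte_offset * BITS_PER_UNIT;

  /* Reject sums that would leave the representable range.  */
  if ((bits > 0 && *bitpos > LLONG_MAX - bits)
      || (bits < 0 && *bitpos < LLONG_MIN - bits))
    return false;

  *bitpos += bits;
  return true;
}
```

A caller would treat a false return the same way as a store it cannot
analyse, for example by marking it invalid for chaining.  Negative offsets
are accepted here; whether the pass wants them is the open question above.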
>>>
 +
 +  base_addr = fold_build2 (MEM_REF, TREE_TYPE (base_addr),
 +   TREE_OPERAND (base_addr, 0),
 +   build_zero_cst (TREE_TYPE (
 + TREE_OPERAND (base_addr, 1))));
>>> Ugh, building a tree node ... you could use TREE_OPERAND (base_addr, 0)
>>> as base_addr instead?
>>
>> This didn't work for me because aliasing info was lost.
>> So in the example:
>> void
>> foo2 (struct bar *p, struct bar *p2)
>> {
>>   p->b = 0xff;
>>   p2->b = 0xa;
>>   p->a = 0xf;
>>   p2->c = 0xc;
>>   p->c = 0xff;
>>   p2->d = 0xbf;
>>   p->d = 0xfff;
>> }
>>
>> we end up merging p->b with p->a even though the p2->b store may alias.
>> We'll record the base objects as being 'p' and 'p2' whereas with my approach
>> we record them as '*p' and '*p2'. I don't suppose I could just do:
>> TREE_OPERAND (base_addr, 1) = build_zero_cst (TREE_TYPE (TREE_OPERAND (base_addr, 1)));
>> ?
>>
>
>Although I think I could try to make it work by using
>ptr_derefs_may_alias_p in the alias checks
>a bit more. I'll see what I can do.

I don't think this will be enough.  Don't bother with this comment too much;
it's a minor issue.

Richard.
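
For readers following the aliasing point, a minimal standalone sketch of why
the foo2 example is sensitive to store order when p and p2 point to the same
object.  The layout of struct bar is hypothetical (the thread does not show
it), and foo2_unsafe_merge is only a hand-written model of a merge that sinks
the p->b store down to p->a, not output from the pass.

```c
/* Hypothetical layout; only the field names come from the thread.  */
struct bar
{
  unsigned char a;
  unsigned char b;
  unsigned char c;
  unsigned short d;
};

/* Stores in the original program order.  */
void
foo2 (struct bar *p, struct bar *p2)
{
  p->b = 0xff;
  p2->b = 0xa;
  p->a = 0xf;
  p2->c = 0xc;
  p->c = 0xff;
  p2->d = 0xbf;
  p->d = 0xfff;
}

/* Model of an unsafe merge: the p->b store is moved next to p->a,
   past the p2->b store that may write the same byte.  */
void
foo2_unsafe_merge (struct bar *p, struct bar *p2)
{
  p2->b = 0xa;
  p->b = 0xff;
  p->a = 0xf;
  p2->c = 0xc;
  p->c = 0xff;
  p2->d = 0xbf;
  p->d = 0xfff;
}
```

Calling both functions with the same pointer for p and p2 leaves b as 0xa in
the first and 0xff in the second, which is the miscompile the alias checks
have to prevent.  With distinct objects the two orders give identical
results.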

>Kyrill
>
>> Thanks,
>> Kyrill
>>
>>>
 +}
 +
 +  struct imm_store_chain_info **chain_info
 += m_stores.get (base_addr);
 +
 +  if (!invalid)
 +{
 +  store_immediate_info *info;
 +  if (chain_info)
 +{
 +  info = new store_immediate_info (
 +bitsize, bitpos, rhs, lhs, stmt,
 +(*chain_info)->m_store_info.length ());
 +  if (dump_file)
 +{
 +  fprintf (dump_file,
 +   "Recording immediate store from stmt:\n");
 +  print_gimple_stmt (dump_file, stmt, 0, 0);
 +}
 +  (*chain_info)->m_store_info.safe_push (info);
 +  continue;
 +}
 +
 +  /* Store aliases any existing chain?  */
 +  terminate_all_aliasing_chains (lhs, base_addr, stmt);
 +
 +  /* Start a new chain.  */
 +  struct imm_store_chain_info *new_chain
 += new imm_store_chain_info;
 +  info = new store_immediate_info (bitsize, bitpos, rhs, lhs,
 +   stmt, 0);
 +  new_chain->m_store_info.safe_push (info);
 +  m_stores.put (base_addr, new_chain);
 +  if (dump_file)
 +{
 +  fprintf (dump_file,
 +   "Starting new chain with statement:\n");
 +  print_gimple_stmt (dump_file, stmt, 0, 0);
 +  fprintf (dump_file, "The base object is:\n");
 +  print_generic_expr (dump_file, base_addr, 0);
 +  fprintf (dump_file, "\n");
 +}
 +}
 +  else
 +terminate_all_aliasing_chains (lhs, base_addr, stmt);
 +
 +  continue;
 +}
 +
 +  terminate_all_aliasing_chains (NULL_TREE, NULL_TREE, stmt);
 +}
 +  terminate_and_process_all_chains (bb);
 +}
 +  return 0;
 +}
 +
 +} // anon namespace
 +
 +/* Construct and return a store merging pass object.  */
 +
 +gimple_opt_pass *
 +make_pass_store_merging (gcc::context *ctxt)
 +{
 +  return new pass_store_merging (ctxt);
 +}
 diff --git a/gcc/opts.c b/gcc/opts.c
 index 45f1f89c..e63d7e4 100644
 --- a/gcc/opts.c
 +++ b/gcc/opts.c
 @@ -463,6 +463,7 @@ static const struct de

[Patch 4/11] Implement TARGET_C_EXCESS_PRECISION for m68k

2016-09-30 Thread James Greenhalgh

Hi,

This patch ports the logic from m68k's TARGET_FLT_EVAL_METHOD to the new
target hook TARGET_C_EXCESS_PRECISION.

Patch tested by building an m68k-none-elf toolchain and running
m68k.exp (without the ability to execute) with no regressions, and manually
inspecting the output assembly code when compiling
testsuite/gcc.target/i386/excess-precision* to show no difference in
code-generation.

OK?

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* config/m68k/m68k.c (m68k_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c
index a104193..c858d7e 100644
--- a/gcc/config/m68k/m68k.c
+++ b/gcc/config/m68k/m68k.c
@@ -182,6 +182,8 @@ static rtx m68k_function_arg (cumulative_args_t, machine_mode,
 static bool m68k_cannot_force_const_mem (machine_mode mode, rtx x);
 static bool m68k_output_addr_const_extra (FILE *, rtx);
 static void m68k_init_sync_libfuncs (void) ATTRIBUTE_UNUSED;
+static enum flt_eval_method
+m68k_excess_precision (enum excess_precision_type);
 
 /* Initialize the GCC target structure.  */
 
@@ -322,6 +324,9 @@ static void m68k_init_sync_libfuncs (void) ATTRIBUTE_UNUSED;
 #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA
 #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA m68k_output_addr_const_extra
 
+#undef TARGET_C_EXCESS_PRECISION
+#define TARGET_C_EXCESS_PRECISION m68k_excess_precision
+
 /* The value stored by TAS.  */
 #undef TARGET_ATOMIC_TEST_AND_SET_TRUEVAL
 #define TARGET_ATOMIC_TEST_AND_SET_TRUEVAL 128
@@ -6528,4 +6533,36 @@ m68k_epilogue_uses (int regno ATTRIBUTE_UNUSED)
 	  == m68k_fk_interrupt_handler));
 }
 
+
+/* Implement TARGET_C_EXCESS_PRECISION.
+
+   Set the value of FLT_EVAL_METHOD in float.h.  When using 68040 fp
+   instructions, we get proper intermediate rounding, otherwise we
+   get extended precision results.  */
+
+static enum flt_eval_method
+m68k_excess_precision (enum excess_precision_type type)
+{
+  switch (type)
+{
+  case EXCESS_PRECISION_TYPE_FAST:
+	/* The fastest type to promote to will always be the native type,
+	   whether that occurs with implicit excess precision or
+	   otherwise.  */
+	return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+  case EXCESS_PRECISION_TYPE_STANDARD:
+  case EXCESS_PRECISION_TYPE_IMPLICIT:
+	/* Otherwise, the excess precision we want when we are
+	   in a standards compliant mode, and the implicit precision we
+	   provide can be identical.  */
+	if (TARGET_68040 || ! TARGET_68881)
+	  return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+
+	return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
+  default:
+	gcc_unreachable ();
+}
+  return FLT_EVAL_METHOD_UNPREDICTABLE;
+}
+
 #include "gt-m68k.h"


[Patch libgcc 9/11] Update soft-fp from glibc

2016-09-30 Thread James Greenhalgh

Hi,

This patch merges in the support added to glibc for HFmode conversions in
this patch:

commit 87ab10d6524fe4faabd7eb3eac5868165ecfb323
Author: James Greenhalgh 
Date:   Wed Sep 21 21:02:54 2016 +

[soft-fp] Add support for various half-precision conversion routines.

This patch adds conversion routines required for _Float16 support in
AArch64.

These are one-step conversions to and from TImode and TFmode. We need
these on AArch64 regardless of presence of the ARMv8.2-A 16-bit
floating-point extensions.

In the patch, soft-fp/half.h is derived from soft-fp/single.h .  The
conversion routines are derivatives of their respective SFmode
variants.

* soft-fp/extendhftf2.c: New.
* soft-fp/fixhfti.c: Likewise.
* soft-fp/fixunshfti.c: Likewise.
* soft-fp/floattihf.c: Likewise.
* soft-fp/floatuntihf.c: Likewise.
* soft-fp/half.h: Likewise.
* soft-fp/trunctfhf2.c: Likewise.

Any patch merging from upstream is preapproved according to our commit
policies, but I'll hold off on committing it until the others in this
series have been approved.
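
For anyone unfamiliar with what a half-precision extension has to do, here is
a minimal standalone sketch that widens an IEEE binary16 bit pattern to
binary32 by hand.  It is not the soft-fp implementation (which uses the
FP_UNPACK/FP_EXTEND/FP_PACK macros and targets TFmode); the function name is
hypothetical and float stands in for the wider type to keep the example
short.

```c
#include <stdint.h>
#include <string.h>

/* Widen the binary16 value encoded in H to a float.  */
float
half_bits_to_float (uint16_t h)
{
  uint32_t sign = (uint32_t) (h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1fu;
  uint32_t frac = h & 0x3ffu;
  uint32_t bits;

  if (exp == 0x1f)
    /* Infinity or NaN: all-ones exponent, payload shifted into place.  */
    bits = sign | 0x7f800000u | (frac << 13);
  else if (exp != 0)
    /* Normal number: rebias the exponent from 15 to 127.  */
    bits = sign | ((exp + 112) << 23) | (frac << 13);
  else if (frac == 0)
    /* Signed zero.  */
    bits = sign;
  else
    {
      /* Subnormal: shift until the implicit bit appears, adjusting
	 the exponent for every shift.  */
      int shift = 0;
      while ((frac & 0x400u) == 0)
	{
	  frac <<= 1;
	  shift++;
	}
      frac &= 0x3ffu;
      bits = sign | ((uint32_t) (113 - shift) << 23) | (frac << 13);
    }

  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}
```

Every binary16 value is exactly representable in the wider format, so an
extension never rounds; only the truncation direction has to deal with
rounding and overflow.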

Thanks,
James

---
libgcc/

2016-09-30  James Greenhalgh  

* soft-fp/extendhftf2.c: New.
* soft-fp/fixhfti.c: Likewise.
* soft-fp/fixunshfti.c: Likewise.
* soft-fp/floattihf.c: Likewise.
* soft-fp/floatuntihf.c: Likewise.
* soft-fp/half.h: Likewise.
* soft-fp/trunctfhf2.c: Likewise.

diff --git a/libgcc/soft-fp/extendhftf2.c b/libgcc/soft-fp/extendhftf2.c
new file mode 100644
index 000..6ff6438
--- /dev/null
+++ b/libgcc/soft-fp/extendhftf2.c
@@ -0,0 +1,53 @@
+/* Software floating-point emulation.
+   Return an IEEE half converted to IEEE quad
+   Copyright (C) 1997-2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#define FP_NO_EXACT_UNDERFLOW
+#include "soft-fp.h"
+#include "half.h"
+#include "quad.h"
+
+TFtype
+__extendhftf2 (HFtype a)
+{
+  FP_DECL_EX;
+  FP_DECL_H (A);
+  FP_DECL_Q (R);
+  TFtype r;
+
+  FP_INIT_EXCEPTIONS;
+  FP_UNPACK_RAW_H (A, a);
+#if (2 * _FP_W_TYPE_SIZE) < _FP_FRACBITS_Q
+  FP_EXTEND (Q, H, 4, 1, R, A);
+#else
+  FP_EXTEND (Q, H, 2, 1, R, A);
+#endif
+  FP_PACK_RAW_Q (r, R);
+  FP_HANDLE_EXCEPTIONS;
+
+  return r;
+}
diff --git a/libgcc/soft-fp/fixhfti.c b/libgcc/soft-fp/fixhfti.c
new file mode 100644
index 000..3610f4c
--- /dev/null
+++ b/libgcc/soft-fp/fixhfti.c
@@ -0,0 +1,45 @@
+/* Software floating-point emulation.
+   Convert IEEE half to 128bit signed integer
+   Copyright (C) 2007-2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   In addition to the permissions in the GNU Lesser General Public
+   License, the Free Software Foundation gives you unlimited
+   permission to link the compiled version of this file into
+   combinations with other programs, and to distribute those
+   combinations without any restriction coming from the use of this
+   file.  (The Lesser General Public License restrictions do apply in
+   other respects; for example, they cover modification of the file,
+   and distribution when not linked into a combine executable.)
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser Ge

[Patch AArch64 11/11] Enable _Float16

2016-09-30 Thread James Greenhalgh

Hi,

Finally, this patch adds the back-end wiring to get AArch64 support for
the _Float16 type working.
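
For reference, a minimal sketch of how user code might interpret the
FLT_EVAL_METHOD values this series deals with.  The function names are
hypothetical; the value 16 follows the ISO/IEC TS 18661-3 wording quoted in
the patch, and whether a given compiler reports it depends on the target and
options.

```c
#include <float.h>

/* Return the name of the type that narrow operands are evaluated in
   for a given FLT_EVAL_METHOD value.  */
const char *
evaluation_format_name (int method)
{
  switch (method)
    {
    case -1:
      return "indeterminable";
    case 0:
      return "float";
    case 1:
      return "double";
    case 2:
      return "long double";
    case 16:
      return "_Float16";
    default:
      return "implementation-defined";
    }
}

/* The value the compiler in use reports through <float.h>.  */
int
current_evaluation_method (void)
{
  return FLT_EVAL_METHOD;
}
```

Passing current_evaluation_method () to evaluation_format_name () gives a
quick way to see which configuration a particular build ended up with.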

Bootstrapped on AArch64 with no issues.

OK?

Thanks,
James

---
2016-09-30  James Greenhalgh  

* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Update
__FLT_EVAL_METHOD__ and __FLT_EVAL_METHOD_C99__ when we switch
architecture levels.
* config/aarch64/aarch64.c (aarch64_promoted_type): Only promote
the aarch64_fp16_type_node, not all HFmode types.
(aarch64_libgcc_floating_mode_supported_p): Support HFmode.
(aarch64_scalar_mode_supported_p): Likewise.
(aarch64_excess_precision): New.
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Define.
(TARGET_SCALAR_MODE_SUPPORTED_P): Likewise.
(TARGET_C_EXCESS_PRECISION): Likewise.

2016-09-30  James Greenhalgh  

* gcc.target/aarch64/_Float16_1.c: New.
* gcc.target/aarch64/_Float16_2.c: Likewise.
* gcc.target/aarch64/_Float16_3.c: Likewise.

diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 3380ed6..9982512 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -132,6 +132,16 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
 
   aarch64_def_or_undef (TARGET_CRYPTO, "__ARM_FEATURE_CRYPTO", pfile);
   aarch64_def_or_undef (TARGET_SIMD_RDMA, "__ARM_FEATURE_QRDMX", pfile);
+
+  /* Not for ACLE, but required to keep "float.h" correct if we switch
+ target between implementations that do or do not support ARMv8.2-A
+ 16-bit floating-point extensions.  */
+  cpp_undef (pfile, "__FLT_EVAL_METHOD__");
+  builtin_define_with_int_value ("__FLT_EVAL_METHOD__",
+ c_flt_eval_method (true));
+  cpp_undef (pfile, "__FLT_EVAL_METHOD_C99__");
+  builtin_define_with_int_value ("__FLT_EVAL_METHOD_C99__",
+ c_flt_eval_method (false));
 }
 
 /* Implement TARGET_CPU_CPP_BUILTINS.  */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cda85de..01e1ca0 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14029,12 +14029,19 @@ aarch64_vec_fpconst_pow_of_2 (rtx x)
   return firstval;
 }
 
-/* Implement TARGET_PROMOTED_TYPE to promote __fp16 to float.  */
+/* Implement TARGET_PROMOTED_TYPE to promote 16-bit floating point types
+   to float.
+
+   __fp16 always promotes through this hook.
+   _Float16 may promote if TARGET_FLT_EVAL_METHOD is 16, but we do that
+   through the generic excess precision logic rather than here.  */
+
 static tree
 aarch64_promoted_type (const_tree t)
 {
-  if (SCALAR_FLOAT_TYPE_P (t) && TYPE_PRECISION (t) == 16)
+  if (t == aarch64_fp16_type_node)
 return float_type_node;
+
   return NULL_TREE;
 }
 
@@ -14054,6 +14061,17 @@ aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode,
 }
 }
 
+/* Implement TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P - return TRUE
+   if MODE is HFmode, and punt to the generic implementation otherwise.  */
+
+static bool
+aarch64_libgcc_floating_mode_supported_p (machine_mode mode)
+{
+  return (mode == HFmode
+	  ? true
+	  : default_libgcc_floating_mode_supported_p (mode));
+}
+
 /* Implement TARGET_SCALAR_MODE_SUPPORTED_P - return TRUE
if MODE is HFmode, and punt to the generic implementation otherwise.  */
 
@@ -14065,6 +14083,47 @@ aarch64_scalar_mode_supported_p (machine_mode mode)
 	  : default_scalar_mode_supported_p (mode));
 }
 
+/* Set the value of FLT_EVAL_METHOD.
+   ISO/IEC TS 18661-3 defines two values that we'd like to make use of:
+
+0: evaluate all operations and constants, whose semantic type has at
+   most the range and precision of type float, to the range and
+   precision of float; evaluate all other operations and constants to
+   the range and precision of the semantic type;
+
+N, where _FloatN is a supported interchange floating type
+   evaluate all operations and constants, whose semantic type has at
+   most the range and precision of _FloatN type, to the range and
+   precision of the _FloatN type; evaluate all other operations and
+   constants to the range and precision of the semantic type;
+
+   If we have the ARMv8.2-A extensions then we support _Float16 in native
+   precision, so we should set this to 16.  Otherwise, we support the type,
+   but want to evaluate expressions in float precision, so set this to
+   0.  */
+
+static enum flt_eval_method
+aarch64_excess_precision (enum excess_precision_type type)
+{
+  switch (type)
+{
+  case EXCESS_PRECISION_TYPE_FAST:
+  case EXCESS_PRECISION_TYPE_STANDARD:
+	/* We can calculate either in 16-bit range and precision or
+	   32-bit range and precision.  Make that decision based on whether
+	   we have native support for the ARMv8.2-A 16-bit floating-point
+	   instructions or not.  */
+	return (TARGET_FP_F16INST
+		? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
+		: FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
+  case EXCESS_PRECISION_TYPE_IMP

[Patch libgcc AArch64 10/11] Enable hfmode soft-float conversions and truncations

2016-09-30 Thread James Greenhalgh

Hi,

This patch enables the conversion functions we need for AArch64's _Float16
support. To do that we need to implement TARGET_SCALAR_MODE_SUPPORTED_P,
so do that now.
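
To show what the new _FP_NANFRAC_H and _FP_NANSIGN_H settings amount to,
here is a minimal standalone sketch that computes the resulting default NaN
encoding for binary16.  The HALF_* macros are hypothetical stand-ins: half.h
is not shown in this thread, so the 11-bit fraction width (10 stored bits
plus the implicit one) and the position of the quiet bit are assumptions
based on the IEEE binary16 format.

```c
#include <stdint.h>

/* Assumed binary16 parameters, mirroring how the existing S/D/Q
   definitions are built.  */
#define HALF_FRACBITS 11
#define HALF_QNANBIT ((uint16_t) 1 << (HALF_FRACBITS - 2))
#define HALF_NANFRAC ((HALF_QNANBIT << 1) - 1)
#define HALF_NANSIGN 0

/* Assemble sign, all-ones exponent and NaN fraction into one
   16-bit pattern.  */
uint16_t
default_half_nan_bits (void)
{
  return (uint16_t) ((HALF_NANSIGN << 15) | (0x1fu << 10) | HALF_NANFRAC);
}
```

Under those assumptions the fraction is all ones with the quiet bit set and
the sign clear, matching the pattern the existing _FP_NANFRAC_S and
_FP_NANFRAC_D definitions follow for the wider formats.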

OK?

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* config/aarch64/aarch64.c (aarch64_scalar_mode_supported_p): New.
(TARGET_SCALAR_MODE_SUPPORTED_P): Define.

libgcc/

2016-09-30  James Greenhalgh  

* config/aarch64/sfp-machine.h (_FP_NANFRAC_H): Define.
(_FP_NANSIGN_H): Likewise.
* config/aarch64/t-softfp (softfp_extensions): Add hftf.
(softfp_truncations): Add tfhf.
(softfp_extras): Add required conversion functions.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index df6514d..cda85de 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14054,6 +14054,17 @@ aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode,
 }
 }
 
+/* Implement TARGET_SCALAR_MODE_SUPPORTED_P - return TRUE
+   if MODE is HFmode, and punt to the generic implementation otherwise.  */
+
+static bool
+aarch64_scalar_mode_supported_p (machine_mode mode)
+{
+  return (mode == HFmode
+	  ? true
+	  : default_scalar_mode_supported_p (mode));
+}
+
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST aarch64_address_cost
 
@@ -14264,6 +14275,9 @@ aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode,
 #undef TARGET_RTX_COSTS
 #define TARGET_RTX_COSTS aarch64_rtx_costs_wrapper
 
+#undef TARGET_SCALAR_MODE_SUPPORTED_P
+#define TARGET_SCALAR_MODE_SUPPORTED_P aarch64_scalar_mode_supported_p
+
 #undef TARGET_SCHED_ISSUE_RATE
 #define TARGET_SCHED_ISSUE_RATE aarch64_sched_issue_rate
 
diff --git a/libgcc/config/aarch64/sfp-machine.h b/libgcc/config/aarch64/sfp-machine.h
index 5efa245..da154dd 100644
--- a/libgcc/config/aarch64/sfp-machine.h
+++ b/libgcc/config/aarch64/sfp-machine.h
@@ -42,9 +42,11 @@ typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
 
 #define _FP_DIV_MEAT_Q(R,X,Y)	_FP_DIV_MEAT_2_udiv(Q,R,X,Y)
 
+#define _FP_NANFRAC_H		((_FP_QNANBIT_H << 1) - 1)
 #define _FP_NANFRAC_S		((_FP_QNANBIT_S << 1) - 1)
 #define _FP_NANFRAC_D		((_FP_QNANBIT_D << 1) - 1)
 #define _FP_NANFRAC_Q		((_FP_QNANBIT_Q << 1) - 1), -1
+#define _FP_NANSIGN_H		0
 #define _FP_NANSIGN_S		0
 #define _FP_NANSIGN_D		0
 #define _FP_NANSIGN_Q		0
diff --git a/libgcc/config/aarch64/t-softfp b/libgcc/config/aarch64/t-softfp
index 586dca2..c4ce0dc 100644
--- a/libgcc/config/aarch64/t-softfp
+++ b/libgcc/config/aarch64/t-softfp
@@ -1,8 +1,9 @@
 softfp_float_modes := tf
 softfp_int_modes := si di ti
-softfp_extensions := sftf dftf
-softfp_truncations := tfsf tfdf
+softfp_extensions := sftf dftf hftf
+softfp_truncations := tfsf tfdf tfhf
 softfp_exclude_libgcc2 := n
+softfp_extras := fixhfti fixunshfti floattihf floatuntihf
 
 TARGET_LIBGCC2_CFLAGS += -Wno-missing-prototypes
 


Re: C/C++ PATCH to implement -Wpointer-compare warning (PR c++/64767)

2016-09-30 Thread Martin Sebor

+   permerror (input_location, "ISO C++11 only allows pointer "
+  "conversions for integer literals");


FWIW, I think it would be clearer to either mention the currently
selected language version or leave it out completely rather than
hardcoding C++11.  When the user specifies -std=c++14 (or later),
or when that's the current version used by the compiler, a warning
that tells them what C++11 allows isn't really relevant (or
meaningful), and becomes less so as time goes on.

Martin


Re: [PATCH] Make CHECKING_P a boolean flag again

2016-09-30 Thread Jeff Law

On 09/25/2016 03:13 AM, Bernd Edlinger wrote:

Hi!


Currently CHECKING_P is not a boolean flag but a ternary option.
However the _P in the name implies it is a boolean.

That should be cleaned up again IMHO.


So this patch splits CHECKING_P into CHECKING_P and a new flag
ENABLE_EXTRA_CHECKING.  All uses of CHECKING_P are actually of
the form "if (CHECKING_P)" so there is no problem there, only the
flag_checking is tested at one place for > 1,  thus we have to
make sure that the default initial value of flag_checking is
0, 1 or 2, dependent on CHECKING_P and ENABLE_EXTRA_CHECKING.
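
As a rough illustration of the default described above (this is not the
patch's actual common.opt text, which is not shown in this message), the
initial value of flag_checking can be modelled as a function of the two
boolean flags.  The function name and its parameters are hypothetical, and
the result for "extra checking without CHECKING_P" is an assumption.

```c
#include <assert.h>

/* Hypothetical model of the default for flag_checking:
   0 when checking is off, 1 when only CHECKING_P is set,
   2 when ENABLE_EXTRA_CHECKING is set as well.  */
static inline int
default_flag_checking (int checking_p, int enable_extra_checking)
{
  /* After the split both inputs are plain boolean flags.  */
  assert (checking_p == 0 || checking_p == 1);
  assert (enable_extra_checking == 0 || enable_extra_checking == 1);

  /* Assumption: extra checking has no effect unless checking is on.  */
  if (!checking_p)
    return 0;
  return enable_extra_checking ? 2 : 1;
}
```

With this model the single "flag_checking > 1" test keeps working, while
every "if (CHECKING_P)" use sees a true boolean.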


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.


changelog-checking-p.txt


2016-09-25  Bernd Edlinger  

* configure.ac: Split CHECKING_P into CHECKING_P and
ENABLE_EXTRA_CHECKING.
* configure: Regenerated.
* config.in: Adjust comment of CHECKING_P.  Add ENABLE_EXTRA_CHECKING.
* common.opt (flag_checking): Use CHECKING_P and ENABLE_EXTRA_CHECKING.

OK.

jeff



Re: [patch][fix PR other/31566] @missing_file gives bad error message

2016-09-30 Thread Jeff Law

On 09/29/2016 12:34 AM, Prasad Ghangal wrote:

Hi all,

I don't know if this is the right time to submit such patches.
But this patch attempts to fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31566

I have successfully bootstrapped and tested on x86_64-pc-linux-gnu


testcases:

file:
-Wall

test.c:
void foo()
{
  int a,b;
  a = b + 1;
}

@test2:
void foo()
{
  int a,b;
  a = b + 1;
}

case 1: file and test.c are present
 ./gcc @file test.c -c
test.c: In function ‘foo’:
test.c:3:7: warning: variable ‘a’ set but not used [-Wunused-but-set-variable]
   int a,b;
   ^
test.c:4:5: warning: ‘b’ is used uninitialized in this function
[-Wuninitialized]
   a = b + 1;
   ~~^~~


case 2: file1 is not present and test.c,file are present
./gcc @file @file1 test.c -c
gcc: error: file1: No such file or directory


case 3: file is present and test1.c is not present
./gcc @file test1.c -c
gcc: error: test1.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.

case 4: file1 and test1.c both are not present
./gcc @file1 test1.c -c
gcc: error: file1: No such file or directory
gcc: error: test1.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.


case 5: @test2.c is present
./gcc @test2.c -c
-> compiled successfully without any error/warning


case 6: both file and @test2.c are present
 ./gcc @file @test2.c -c
@test2.c: In function ‘foo’:
@test2.c:3:7: warning: variable ‘a’ set but not used [-Wunused-but-set-variable]
   int a,b;
   ^
@test2.c:4:5: warning: ‘b’ is used uninitialized in this function
[-Wuninitialized]
   a = b + 1;
   ~~^~~


case 7: @file1 is not present and @test2.c is present
./gcc @file1 @test2.c -c
gcc: error: file1: No such file or directory


case 8: both @file1 and @test3.c are not present
gcc: error: file1: No such file or directory
gcc: error: test3.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.

-- in this case @test3.c is treated as a file with arguments (test3.c)
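
The diagnostics in the cases above name a missing response file without its
leading '@'.  A minimal sketch of that naming step follows; the helper is
hypothetical and is not taken from the patch, which is not shown in this
message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: if ARG has the form "@name", return "name"
   (the text that appears in the "No such file or directory" message);
   otherwise return NULL.  */
static inline const char *
response_file_name (const char *arg)
{
  assert (arg != NULL);
  if (arg[0] == '@' && arg[1] != '\0')
    return arg + 1;
  return NULL;
}
```

For example, "@file1" yields "file1", matching the text of the error in
cases 2, 4 and 7.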

Thanks.  I fixed some minor whitespace nits and installed your fix.

Jeff



Re: [Patch 7/11] Delete TARGET_FLT_EVAL_METHOD and poison it.

2016-09-30 Thread Jeff Law

On 09/30/2016 10:56 AM, James Greenhalgh wrote:


Hi,

We've removed all uses of TARGET_FLT_EVAL_METHOD, so we can remove it
and poison it.

Bootstrapped and tested on x86-64 and AArch64. Tested on s390 and m68k
to the best of my ability (no execute tests).

OK?

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* config/s390/s390.h (TARGET_FLT_EVAL_METHOD): Delete.
* config/m68k/m68k.h (TARGET_FLT_EVAL_METHOD): Delete.
* config/i386/i386.h (TARGET_FLT_EVAL_METHOD): Delete.
* defaults.h (TARGET_FLT_EVAL_METHOD): Delete.
* doc/tm.texi.in (TARGET_FLT_EVAL_METHOD): Delete.
* doc/tm.texi: Regenerate.
* system.h (TARGET_FLT_EVAL_METHOD): Poison.

OK when prereqs are approved.
jeff



Re: [Patch 4/11] Implement TARGET_C_EXCESS_PRECISION for m68k

2016-09-30 Thread Jeff Law

On 09/30/2016 11:01 AM, James Greenhalgh wrote:


Hi,

This patch ports the logic from m68k's TARGET_FLT_EVAL_METHOD to the new
target hook TARGET_C_EXCESS_PRECISION.

Patch tested by building an m68k-none-elf toolchain and running
m68k.exp (without the ability to execute) with no regressions, and manually
inspecting the output assembly code when compiling
testsuite/gcc.target/i386/excess-precision* to show no difference in
code-generation.

OK?

Thanks,
James

---
gcc/

2016-09-30  James Greenhalgh  

* config/m68k/m68k.c (m68k_excess_precision): New.
(TARGET_C_EXCESS_PRECISION): Define.

OK when prereqs are approved.  Similarly for other targets where you
needed to add this hook.


jeff


Re: [Patch 8/11] Make _Float16 available if HFmode is available

2016-09-30 Thread Jeff Law

On 09/30/2016 10:56 AM, James Greenhalgh wrote:


Hi,

Now that we've worked on -fexcess-precision, the comment in targhooks.c
no longer holds. We can now permit _Float16 on any target which provides
HFmode and supports HFmode in libgcc.

Bootstrapped and tested on x86-64, and in series on AArch64.

OK?

Thanks,
James

---
2016-09-30  James Greenhalgh  

* targhooks.c (default_floatn_mode): Enable _Float16 if a target
provides HFmode.

OK when prereqs are approved.

jeff



Re: [Patch 6/11] Migrate excess precision logic to use TARGET_EXCESS_PRECISION

2016-09-30 Thread Joseph Myers
On Fri, 30 Sep 2016, James Greenhalgh wrote:

>/* float.h needs to know this.  */
> +  /* We already have the option -fno-fp-int-builtin-inexact to ensure
> + certain built-in functions follow TS 18661-1 semantics.  It might be
> + reasonable to have a new option to enable FLT_EVAL_METHOD using new
> + values.  However, I'd be inclined to think that such an option should
> + be on by default for -std=gnu*, only off for strict conformance modes.
> + (There would be both __FLT_EVAL_METHOD__ and __FLT_EVAL_METHOD_C99__,
> + say, predefined macros, so that <float.h> could also always use the
> + new value if __STDC_WANT_IEC_60559_TYPES_EXT__ is defined.)  */

This comment makes no sense in the context.  The comment should not be 
talking about some other option for a different issue, or about 
half-thought-out ideas for how something might be implemented; comments 
need to relate to the actual code (which in this case is obvious and not 
in need of comments beyond saying what the macro semantics are).  In any 
case, this patch does not achieve the proposed semantics, since there is 
no change to ginclude/float.h.

The goal is: if the user's options imply new FLT_EVAL_METHOD values are 
OK, *or* they defined __STDC_WANT_IEC_60559_TYPES_EXT__ before including 
<float.h>, it should use the appropriate TS 18661-3 value.  Otherwise 
(strict standards modes for existing standards, no 
__STDC_WANT_IEC_60559_TYPES_EXT__) it should use a C11 value.

So in a strict standards mode you need to predefine macros with both 
choices of values and let <float.h> choose between them.  One possibility 
is: __FLT_EVAL_METHOD_C99__ is the value to use when 
__STDC_WANT_IEC_60559_TYPES_EXT__ is not defined, __FLT_EVAL_METHOD__ is 
the value to use when it is defined.  Or some other arrangement, with or 
without a macro saying what setting you have for the new option.  But you 
can't avoid changing <float.h>.

Tests then should be testing the value of FLT_EVAL_METHOD from <float.h>, 
*not* the internal macros predefined by the compiler.
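
A sketch of the "one possibility" arrangement described above, written with
stand-in names so that it does not collide with a real <float.h>.  The
SKETCH_* macros and the values 0 and 16 are assumptions for illustration
only; in the proposal the two candidate values would be predefined by the
compiler.

```c
#include <assert.h>

/* Stand-ins for the two compiler-predefined values (assumed here).  */
#ifndef SKETCH_EVAL_METHOD_C99
#define SKETCH_EVAL_METHOD_C99 0   /* a value from the C11 set */
#endif
#ifndef SKETCH_EVAL_METHOD_TS
#define SKETCH_EVAL_METHOD_TS 16   /* an assumed TS 18661-3 style value */
#endif

/* The header, not the compiler, makes the final choice.  */
#ifdef __STDC_WANT_IEC_60559_TYPES_EXT__
#define SKETCH_FLT_EVAL_METHOD SKETCH_EVAL_METHOD_TS
#else
#define SKETCH_FLT_EVAL_METHOD SKETCH_EVAL_METHOD_C99
#endif

/* Sanity check: the selected value is one of the two candidates.  */
static inline void
sketch_check_selection (void)
{
  assert (SKETCH_FLT_EVAL_METHOD == SKETCH_EVAL_METHOD_C99
          || SKETCH_FLT_EVAL_METHOD == SKETCH_EVAL_METHOD_TS);
}
```

A test written against this arrangement would look only at the macro the
header exports, never at the two stand-ins directly.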

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch 5/11] Add -fpermitted-flt-eval-methods=[c11|ts-18661-3]

2016-09-30 Thread Jeff Law

On 09/30/2016 10:56 AM, James Greenhalgh wrote:


Hi,

This option is added to control which values of FLT_EVAL_METHOD the
compiler is allowed to set.

ISO/IEC TS 18661-3 defines new permissible values for
FLT_EVAL_METHOD that indicate that operations and constants with
a semantic type that is an interchange or extended format should be
evaluated to the precision and range of that type.  These new values are
a superset of those permitted under C99/C11, which does not specify the
meaning of other positive values of FLT_EVAL_METHOD.  As such, code
conforming to C11 may not have been written expecting the possibility of
the new values.

-fpermitted-flt-eval-methods specifies whether the compiler
should allow only the values of FLT_EVAL_METHOD specified in C99/C11,
or the extended set of values specified in ISO/IEC TS 18661-3.

The two possible values this option can take are "c11" or "ts-18661-3".

The default when in a standards compliant mode (-std=c11 or similar)
is -fpermitted-flt-eval-methods=c11.  The default when in a GNU
dialect (-std=gnu11 or similar) is -fpermitted-flt-eval-methods=ts-18661-3.

I've added two testcases which test that when this option, or a C standards
dialect, would restrict the range of values to {-1, 0, 1, 2}, those are
the only values we see. At this stage in the patch series this trivially
holds for all targets.
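
The property those testcases check can be sketched as below.  This is not
the text of the new testcases (they are not shown in this message); the
function names are hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Nonzero if VALUE is in the restricted set {-1, 0, 1, 2} named above.  */
static inline int
flt_eval_method_is_c11 (int value)
{
  return value >= -1 && value <= 2;
}

/* What a test for the restricted mode would assert.  Only expected to hold
   when the set is restricted, e.g. by -std=c11 or by
   -fpermitted-flt-eval-methods=c11.  */
static inline void
check_restricted_flt_eval_method (void)
{
  assert (flt_eval_method_is_c11 (FLT_EVAL_METHOD));
}
```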

Bootstrapped on x86_64 with no issues and tested in series on AArch64.

OK?

Thanks,
James

---
gcc/c-family/

2016-09-30  James Greenhalgh  

* c-opts.c (c_common_post_options): Add logic to handle the default
case for -fpermitted-flt-eval-methods.

gcc/

2016-09-30  James Greenhalgh  

* common.opt (fpermitted-flt-eval-methods): New.
* doc/invoke.texi (-fpermitted-flt-eval-methods): Document it.
* flag_types.h (permitted_flt_eval_methods): New.

gcc/testsuite/

2016-09-30  James Greenhalgh  

* gcc.dg/fpermitted-flt-eval-methods_1.c: New.
* gcc.dg/fpermitted-flt-eval-methods_2.c: New.

OK.  Are you going to need to do something for C++ (or gasp ObjC) in
the future, or do you expect this to be C only indefinitely?


jeff






Re: [Patch 3/11] Implement TARGET_C_EXCESS_PRECISION for s390

2016-09-30 Thread Joseph Myers
On Fri, 30 Sep 2016, James Greenhalgh wrote:

> +  case EXCESS_PRECISION_TYPE_STANDARD:
> +  case EXCESS_PRECISION_TYPE_IMPLICIT:
> + /* Otherwise, the excess precision we want when we are
> +in a standards compliant mode, and the implicit precision we
> +provide can be identical.  */
> + return FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE;

That's wrong for EXCESS_PRECISION_TYPE_IMPLICIT.  There is no implicit 
promotion in the back end (and really there shouldn't be any excess 
precision here at all, and double_t in glibc should be fixed along with a 
GCC change to remove this mistake).

-- 
Joseph S. Myers
jos...@codesourcery.com


[C++/66443] deleted ctor and vbase construction

2016-09-30 Thread Nathan Sidwell
PR 66443 concerns C++14 DR1611. It is now permitted to use the base-ctor of an 
abstract class whose complete ctor is deleted because of a virtual base issue. 
Specifically, given


class A {
  A (int);
  // no default ctor in C++14
};

class B : virtual A {
 virtual void Foo () = 0; // abstract
 // B::B deleted because there's no A::A()
};

class C: B
{
  virtual void Foo (); // not abstract

  C ()
   : A(1) // explicit vbase construction
  {}
};

Here, there's no way that B's complete object constructor can be called -- it's 
an abstract class, so no complete B objects can exist.  B's as-a-base 
constructor never constructs the A base, because it's a virtual base.  So the 
missing 'A::A()' is never needed.  C's constructors must explicitly construct 
the A base using the available constructors.


The trickiness is that we create both base and complete constructors by building 
a more generic constructor, cloning it, substituting the 'in-charge' parameter 
value, and then allowing optimization to remove unreachable code in the two 
instances.  (See [*] below for why the complete-ctor does not tail-call the base-ctor.)


This patch adds a new FUNCTION_DECL flag and accessor 'DECL_BASE_UNDELETED_FN'. 
In the example above, this is true for B's generic constructor.  When we 
actually need a B constructor (the bodies are created lazily), we check 
DECL_BASE_UNDELETED_FN to see whether we need to do something 'interesting'.  If 
we do, and it's the base-ctor, we don't do the usual 'create generic body, clone 
it' scheme.  Instead we create the base body explicitly, passing a new 
'skip-vbases' flag, and thus never try to find 'A::A()' in this case. 
build_if_in_charge needs tweaking, because the in-charge parm only exists on the 
generic ctor.


bootstrapped on x86_64-linux, ok?


nathan

[*] Why not tail-call the as-base ctor after constructing the vbases?  That 
would be fine for most cases, but we might need to provide a vtt pointer 
describing the complete object.  Not sure if such a table exists right now. 
Anyway, the more serious issue is variadic ctors.  [Without additional compiler 
magic] we can't tail-call from the complete ctor to the base ctor.

2016-09-30  Nathan Sidwell  

	cp/
	PR c++/66443
	* cp-tree.h (lang_decl_fn): Add base_undeleted_p.
	(DECL_BASE_UNDELETED_FN): New.
	(emit_mem_initializer, finish_mem_initializers): Add skip vbases arg.
	* class.c (build_if_incharge): Allow no incharge parm when
	building undeleted base dtor directly.
	* decl2.c (mark_used): Don't complain about using an undeleted
	base ctor.
	* init.c (emit_mem_initializers): Add skip_vbases arg. Skip vbase
	construction if true.
	* method.c (do_build_copy_constructor): Add skip_vbases arg, pass
	it through.
	(synthesize_method): Skip vbases when we have a
	(synthesized_method_walk): Add vbase_deleted_p argument.  Use it.
	(get_defaulted_eh_spec): Adjust synthesized_method_walk.
	(maybe_explain_implicit_delete, explain_implicit_non_constexpr,
	deduce_inheriting_ctor): Likewise.
	(implicitly_declare_fn): Determine DECL_BASE_UNDELETED_FN.
	* parser.c (cp_parser_ctor_initializer,
	cp_parser_mem_initializer_list): Adjust finish_mem_initializer call.
	* pt.c (tsubst_expr): Likewise.
	* semantics.c (finish_mem_initializers): Add skip_vbase parm, pass
	to emit_mem_initializers.

	testsuite.
	PR c++/66443
	* g++.dg/cpp0x/pr66443-cxx11.C: New.
	* g++.dg/cpp1y/pr66443-cxx14.C: New
	* g++.dg/cpp1y/pr66443-cxx14-2.C: New.

Index: cp/class.c
===
--- cp/class.c	(revision 240596)
+++ cp/class.c	(working copy)
@@ -233,15 +233,27 @@ int n_inner_fields_searched = 0;
 tree
 build_if_in_charge (tree true_stmt, tree false_stmt)
 {
-  gcc_assert (DECL_HAS_IN_CHARGE_PARM_P (current_function_decl));
-  tree cmp = build2 (NE_EXPR, boolean_type_node,
-		 current_in_charge_parm, integer_zero_node);
-  tree type = unlowered_expr_type (true_stmt);
-  if (VOID_TYPE_P (type))
-type = unlowered_expr_type (false_stmt);
-  tree cond = build3 (COND_EXPR, type,
-		  cmp, true_stmt, false_stmt);
-  return cond;
+  tree result;
+  
+  if (DECL_HAS_IN_CHARGE_PARM_P (current_function_decl))
+{
+  tree cmp = build2 (NE_EXPR, boolean_type_node,
+			 current_in_charge_parm, integer_zero_node);
+  tree type = unlowered_expr_type (true_stmt);
+  if (VOID_TYPE_P (type))
+	type = unlowered_expr_type (false_stmt);
+  result = build3 (COND_EXPR, type, cmp, true_stmt, false_stmt);
+}
+  else
+{
+  /* We must be building an undeleted base ctor.  */
+  gcc_assert (DECL_BASE_CONSTRUCTOR_P (current_function_decl)
+		  && DECL_DELETED_FN (current_function_decl)
+		  && DECL_BASE_UNDELETED_FN (current_function_decl));
+  result = false_stmt;
+}
+
+  return result;
 }
 
 /* Convert to or from a base subobject.  EXPR is an expression of type
Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 240596)
+++ cp/cp-t

Re: [Patch 3/11] Implement TARGET_C_EXCESS_PRECISION for s390

2016-09-30 Thread Jeff Law

On 09/30/2016 11:34 AM, Joseph Myers wrote:

On Fri, 30 Sep 2016, James Greenhalgh wrote:


+  case EXCESS_PRECISION_TYPE_STANDARD:
+  case EXCESS_PRECISION_TYPE_IMPLICIT:
+   /* Otherwise, the excess precision we want when we are
+  in a standards compliant mode, and the implicit precision we
+  provide can be identical.  */
+   return FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE;


That's wrong for EXCESS_PRECISION_TYPE_IMPLICIT.  There is no implicit
promotion in the back end (and really there shouldn't be any excess
precision here at all, and double_t in glibc should be fixed along with a
GCC change to remove this mistake).

Sorry, change to a NAK.

Joseph, what's the right thing to do here?

jeff



[PATCH 1/3] Remove support for obsolete x86 -malign-foo options

2016-09-30 Thread Denys Vlasenko
2016-09-27  Denys Vlasenko  

* config/i386/i386-common.c (ix86_handle_option): Remove support
for obsolete -malign-loops, -malign-jumps and -malign-functions
options.
* config/i386/i386.opt: Likewise.

Index: gcc/common/config/i386/i386-common.c
===
--- gcc/common/config/i386/i386-common.c(revision 240663)
+++ gcc/common/config/i386/i386-common.c(working copy)
@@ -998,38 +998,6 @@ ix86_handle_option (struct gcc_options *opts,
}
   return true;
 
-
-  /* Comes from final.c -- no real reason to change it.  */
-#define MAX_CODE_ALIGN 16
-
-case OPT_malign_loops_:
-  warning_at (loc, 0, "-malign-loops is obsolete, use -falign-loops");
-  if (value > MAX_CODE_ALIGN)
-   error_at (loc, "-malign-loops=%d is not between 0 and %d",
- value, MAX_CODE_ALIGN);
-  else
-   opts->x_align_loops = 1 << value;
-  return true;
-
-case OPT_malign_jumps_:
-  warning_at (loc, 0, "-malign-jumps is obsolete, use -falign-jumps");
-  if (value > MAX_CODE_ALIGN)
-   error_at (loc, "-malign-jumps=%d is not between 0 and %d",
- value, MAX_CODE_ALIGN);
-  else
-   opts->x_align_jumps = 1 << value;
-  return true;
-
-case OPT_malign_functions_:
-  warning_at (loc, 0,
- "-malign-functions is obsolete, use -falign-functions");
-  if (value > MAX_CODE_ALIGN)
-   error_at (loc, "-malign-functions=%d is not between 0 and %d",
- value, MAX_CODE_ALIGN);
-  else
-   opts->x_align_functions = 1 << value;
-  return true;
-
 case OPT_mbranch_cost_:
   if (value > 5)
{
Index: gcc/config/i386/i386.opt
===
--- gcc/config/i386/i386.opt(revision 240663)
+++ gcc/config/i386/i386.opt(working copy)
@@ -205,18 +205,6 @@ malign-double
 Target Report Mask(ALIGN_DOUBLE) Save
 Align some doubles on dword boundary.
 
-malign-functions=
-Target RejectNegative Joined UInteger
-Function starts are aligned to this power of 2.
-
-malign-jumps=
-Target RejectNegative Joined UInteger
-Jump targets are aligned to this power of 2.
-
-malign-loops=
-Target RejectNegative Joined UInteger
-Loop code aligned to this power of 2.
-
 malign-stringops
 Target RejectNegative Report InverseMask(NO_ALIGN_STRINGOPS, ALIGN_STRINGOPS) Save
 Align destination of the string operations.
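
For anyone still passing the removed options: the deleted handlers above
took the argument as a power of two (opts->x_align_loops = 1 << value) and
rejected values above MAX_CODE_ALIGN.  The sketch below computes the byte
count that the corresponding -falign-* option would be given instead.  The
helper name is hypothetical; MAX_CODE_ALIGN is copied from the removed code.

```c
#include <assert.h>

#define MAX_CODE_ALIGN 16   /* as in the removed code */

/* Hypothetical helper: convert the log2 argument of a removed
   -malign-foo=N option into the byte alignment the old handler stored.
   Returns -1 where the old handler would have reported an error.  */
static inline int
malign_to_falign (int log2_value)
{
  /* The old options were declared UInteger, so N is never negative.  */
  assert (log2_value >= 0);
  if (log2_value > MAX_CODE_ALIGN)
    return -1;
  return 1 << log2_value;
}
```

So a build that used -malign-loops=4 would switch to -falign-loops=16.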


[PATCH 0/3] Extend -falign-FOO=N to N[,M[,N2[,M2]]] version 4

2016-09-30 Thread Denys Vlasenko
These patches are for this bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66240
"RFE: extend -falign-xyz syntax"

The test program:

int g();
int f(int i) {
        i *= 3;
        while (--i > 100) {
 L1:            if (g()) goto L1;
                if (g()) goto L2;
        }
        return i;
 L2:    return 123;
}

"-O2" assembly before the patch:        After the patch:
        .text                                   .text
        .p2align 4,,15                          .p2align 4
        .globl  f                               .globl  f
        .type   f, @function                    .type   f, @function
f:                                      f:
.LFB0:                                  .LFB0:
        pushq   %rbx                            pushq   %rbx
        leal    (%rdi,%rdi,2), %ebx             leal    (%rdi,%rdi,2), %ebx
        .p2align 4,,10                          .p2align 4,,10
        .p2align 3                              .p2align 3
.L2:                                    .L2:
        subl    $1, %ebx                        subl    $1, %ebx
        cmpl    $100, %ebx                      cmpl    $100, %ebx
        jle     .L1                             jle     .L1
        .p2align 4,,10                          .p2align 4,,10
        .p2align 3                              .p2align 3
.L3:                                    .L3:
        xorl    %eax, %eax                      xorl    %eax, %eax
        call    g                               call    g
        testl   %eax, %eax                      testl   %eax, %eax
        jne     .L3                             jne     .L3
        call    g                               call    g
        testl   %eax, %eax                      testl   %eax, %eax
        je      .L2                             je      .L2
        movl    $123, %ebx                      movl    $123, %ebx
.L4:                                    .L4:
.L1:                                    .L1:
        movl    %ebx, %eax                      movl    %ebx, %eax
        popq    %rbx                            popq    %rbx
        ret                                     ret
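
In the listing, each loop label is preceded by the pair ".p2align 4,,10" and
".p2align 3".  In GNU as the third operand of .p2align is the maximum number
of padding bytes; if more would be needed, the directive does nothing.  The
sketch below models that rule to show where a label ends up.  The function
names and the no-limit constant are hypothetical and are not part of the
patch set.

```c
#include <assert.h>
#include <limits.h>

/* Pass as MAX_SKIP when the directive has no third operand.  */
#define P2ALIGN_NO_LIMIT ULONG_MAX

/* Model of ".p2align LOG2,,MAX_SKIP": return the address after padding,
   or ADDR unchanged if more than MAX_SKIP bytes would be needed.  */
static inline unsigned long
p2align (unsigned long addr, unsigned int log2, unsigned long max_skip)
{
  assert (log2 < sizeof (unsigned long) * CHAR_BIT);
  unsigned long align = 1UL << log2;
  unsigned long pad = (align - (addr & (align - 1))) & (align - 1);
  if (pad > max_skip)
    return addr;
  return addr + pad;
}

/* The pair emitted before .L2 and .L3: 16-byte alignment if it costs
   at most 10 bytes, otherwise fall back to 8-byte alignment.  */
static inline unsigned long
align_label (unsigned long addr)
{
  addr = p2align (addr, 4, 10);
  return p2align (addr, 3, P2ALIGN_NO_LIMIT);
}
```

A label that would sit at 0x1003 needs 13 bytes to reach a 16-byte boundary,
so the first directive is skipped and the second places it at 0x1008; a
label at 0x1006 needs only 10 bytes and lands at 0x1010.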

This is version 5 of the patch set.

Bernd asked to replace use of a new SUBALIGN_LOG define with a hook.
Don't see an easy way to do that (short of adding a dedicated hook),
for now retained SUBALIGN_LOG method. Suggestions welcome.

Changes since version 4:

* Deleted rather than NOPed -malign-foo=N support.
* Improved behavior match with x86 8-byte subalignment for labels.

Changes since version 3:

* Improved documentation in invoke.texi
* Fixed x86-specific calculation of default N2 value:
  previous version was doing it incorrectly for cross-compile

