Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-15 Thread Jakub Jelinek
On Tue, Nov 07, 2017 at 11:08:58AM +0100, Martin Liška wrote:
> > Hasn't it enabled it also for any other FEs other than C family and Fortran?
> > Say jit, brig, go, lto?, ...
> > I think better would be to remove the initialization to -1 and revert the
> > fortran/options.c change, and instead use in the C family:
> >   if (!global_options_set.x_warn_return_type)
> >     warn_return_type = c_dialect_cxx ();
> > 
> > Unless it for some reason doesn't work for -Wall or -W or similar.
> > 
> 
> Hello.
> 
> Sorry for the inconvenience, however using Jakub's approach really does not
> work properly with -Wall.

If -Wall had an underlying variable, then we could use:
  if (!global_options_set.x_warn_return_type
      && !global_options_set.x_warn_all)
    warn_return_type = c_dialect_cxx ();

But we don't.  Wonder if in addition to your patch or instead of it it
wouldn't be safer (especially for FEs added in the future) to:
 
   /* If we see "return;" in some basic block, then we do reach the end
      without returning a value.  */
-  else if (warn_return_type
+  else if (warn_return_type > 0
            && !TREE_NO_WARNING (fun->decl)
            && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (fun)->preds) > 0
            && !VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fun->decl))))

in tree-cfg.c.  That change is preapproved if it works, and your
patch if you want in addition to that is ok too.

Jakub
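
A minimal sketch of why the explicit `> 0` comparison is the safer test, assuming the option variable is initialized to -1 to mean "not set on the command line". The `Options`, `Lang`, and helper names below are hypothetical stand-ins, not GCC's real option machinery.

```cpp
#include <cassert>

// Hypothetical stand-in for the option state; -1 means "not set by the user".
struct Options {
  int warn_return_type = -1;
};

// Hypothetical language tags; only the C family resolves the default here.
enum class Lang { C, Cxx, Other };

// A front end that knows about the option replaces the -1 default.
// A front end added later (Lang::Other) may forget to do so.
void resolve_default(Options &opts, Lang lang) {
  if (opts.warn_return_type == -1 && lang != Lang::Other)
    opts.warn_return_type = (lang == Lang::Cxx) ? 1 : 0;
}

// Truthiness test: an unresolved -1 is non-zero, so it counts as enabled.
bool warns_truthy(const Options &opts) { return opts.warn_return_type != 0; }

// Explicit comparison: only a positive value enables the warning.
bool warns_positive(const Options &opts) { return opts.warn_return_type > 0; }
```

With the explicit comparison, a front end that never touches the variable stays silent instead of inheriting the warning by accident.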


Re: [PATCH] Fix use-after-free in the strlen pass (PR tree-optimization/82977)

2017-11-15 Thread Jakub Jelinek
On Tue, Nov 14, 2017 at 04:46:01PM -0700, Martin Sebor wrote:
> How about at least detecting the problem then?  The attached patch
> catches the bug while running the Wstringop-truncation tests and
> passes x86_64 bootstrap.

Well, IMHO then the extra argument should be there only #if CHECKING_P,
so that you don't grow or slow down --enable-checking=release,
and more importantly, 

>  template<typename Descriptor, template<typename Type> class Allocator>
>  void
> -hash_table<Descriptor, Allocator>::expand ()
> +hash_table<Descriptor, Allocator>::expand (const void *ptr /* = NULL */)
>  {
>    value_type *oentries = m_entries;
>    unsigned int oindex = m_size_prime_index;
> @@ -718,6 +721,15 @@ hash_table<Descriptor, Allocator>::expand ()
>    value_type *olimit = oentries + osize;
>    size_t elts = elements ();
>  
> +#if CHECKING_P
> +  /* Verify that the pointer doesn't point into the table to detect
> +     insertions of existing elements.  */
> +  uintptr_t iptr = (uintptr_t)ptr;
> +  uintptr_t ibeg = (uintptr_t)oentries;
> +  uintptr_t iend = (uintptr_t)olimit;
> +  gcc_checking_assert (iptr < ibeg || iend < iptr);
> +#endif
> +

This is the wrong spot to check it.

>    /* Resize only when table after removal of unused elements is either
>       too full or too empty.  */
>    unsigned int nindex;
> @@ -866,17 +878,22 @@ hash_table
> HASH.  To delete an entry, call this with insert=NO_INSERT, then
> call clear_slot on the slot returned (possibly after doing some
> checks).  To insert an entry, call this with insert=INSERT, then
> -   write the value you want into the returned slot.  When inserting an
> -   entry, NULL may be returned if memory allocation fails. */
> +   write the value you want into the returned slot.  When inserting
> +   an entry, NULL may be returned if memory allocation fails.
> +   If PTR points into an element already in the table and the table
> +   is expanded, the function aborts.  This makes it possible to
> +   detect insertions of elements that are already in the table and
> +   references to which would be invalidated by the reallocation that
> +   results from the insertion.  */
>  
>  template<typename Descriptor, template<typename Type> class Allocator>
>  typename hash_table<Descriptor, Allocator>::value_type *
>  hash_table<Descriptor, Allocator>
>  ::find_slot_with_hash (const compare_type &comparable, hashval_t hash,
> -                       enum insert_option insert)
> +                       enum insert_option insert, const void *ptr)
>  {
>    if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
> -    expand ();
> +    expand (ptr);
>  
>    m_searches++;

It should be checked here.
  if (insert == INSERT)
    {
#if CHECKING_P
      ...
#endif
      if (m_size * 3 <= m_n_elements * 4)
        expand ();
    }
because otherwise you'll find the problem only if you are unlucky enough
that the hash table is expanded, while checking it this way would ensure
that there are no such potential problems.

Jakub
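
The placement suggested above can be sketched with a toy open-addressing table. This is only an illustration: `ToyTable` and its members are hypothetical, not GCC's `hash_table`, and the check returns `false` rather than asserting so that the behaviour is easy to observe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy table of non-zero unsigned values; 0 marks an empty slot.
// The initial size must be greater than zero.
class ToyTable {
public:
  explicit ToyTable(std::size_t size) : slots_(size, 0) {}

  // True if PTR points into the current slot array.
  bool points_into(const void *ptr) const {
    auto p = reinterpret_cast<std::uintptr_t>(ptr);
    auto beg = reinterpret_cast<std::uintptr_t>(slots_.data());
    auto end = reinterpret_cast<std::uintptr_t>(slots_.data() + slots_.size());
    return beg <= p && p < end;
  }

  // The aliasing check runs on every insertion, whether or not the table
  // is about to grow, so a bad caller is caught deterministically.
  bool insert(const unsigned *src) {
    if (points_into(src))
      return false;
    if (count_ * 4 >= slots_.size() * 3)  // same 3/4 load threshold
      expand();
    place(slots_, *src);
    ++count_;
    return true;
  }

  // Linear probing lookup; returns a pointer into the table or nullptr.
  const unsigned *find(unsigned value) const {
    std::size_t i = value % slots_.size();
    for (std::size_t n = 0; n < slots_.size(); ++n) {
      if (slots_[i] == value)
        return &slots_[i];
      if (slots_[i] == 0)
        return nullptr;
      i = (i + 1) % slots_.size();
    }
    return nullptr;
  }

private:
  static void place(std::vector<unsigned> &slots, unsigned value) {
    std::size_t i = value % slots.size();
    while (slots[i] != 0)
      i = (i + 1) % slots.size();
    slots[i] = value;
  }

  // Rehash into a table twice the size; old slot addresses become invalid.
  void expand() {
    std::vector<unsigned> bigger(slots_.size() * 2, 0);
    for (unsigned v : slots_)
      if (v != 0)
        place(bigger, v);
    slots_.swap(bigger);
  }

  std::vector<unsigned> slots_;
  std::size_t count_ = 0;
};
```

In this sketch an insertion through a pointer obtained from `find` is rejected even when the table is nearly empty, which is the point of checking before the size test rather than inside `expand`.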


Re: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread Richard Biener
On Tue, 14 Nov 2017, Tamar Christina wrote:

> Hi All,
> 
> This patch allows larger bitsizes to be used as copy size
> when the target does not have SLOW_UNALIGNED_ACCESS.
> 
> fun3:
>   adrp    x2, .LANCHOR0
>   add     x2, x2, :lo12:.LANCHOR0
>   mov     x0, 0
>   sub     sp, sp, #16
>   ldrh    w1, [x2, 16]
>   ldrb    w2, [x2, 18]
>   add     sp, sp, 16
>   bfi     x0, x1, 0, 8
>   ubfx    x1, x1, 8, 8
>   bfi     x0, x1, 8, 8
>   bfi     x0, x2, 16, 8
>   ret
> 
> is turned into
> 
> fun3:
>   adrp    x0, .LANCHOR0
>   add     x0, x0, :lo12:.LANCHOR0
>   sub     sp, sp, #16
>   ldrh    w1, [x0, 16]
>   ldrb    w0, [x0, 18]
>   strh    w1, [sp, 8]
>   strb    w0, [sp, 10]
>   ldr     w0, [sp, 8]
>   add     sp, sp, 16
>   ret
> 
> which avoids the bfi's for a simple 3 byte struct copy.
> 
> Regression tested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu and no
> regressions.
> 
> This patch is just splitting off from the previous combined patch with
> AArch64 and adding a testcase.
> 
> I assume Jeff's ACK from 
> https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still
> valid as the code did not change.

Given your no_slow_unalign isn't mode specific can't you use the existing
non_strict_align?

Otherwise the expr.c change looks ok.

Thanks,
Richard.

> Thanks,
> Tamar
> 
> 
> gcc/
> 2017-11-14  Tamar Christina  
> 
>   * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
>   with fast unaligned access.
>   * doc/sourcebuild.texi (no_slow_unalign): New.
>   
> gcc/testsuite/
> 2017-11-14  Tamar Christina  
> 
>   * gcc.dg/struct-simple.c: New.
>   * lib/target-supports.exp
>   (check_effective_target_no_slow_unalign): New.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nuernberg)
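
For context, the kind of source that produces the `fun3` code quoted above might look like the sketch below. The struct layout, the table, and the index parameter are assumptions made for illustration; the actual contents of gcc.dg/struct-simple.c are not shown in this thread.

```cpp
#include <cassert>

// Hypothetical 3-byte aggregate with byte alignment.
struct Triple {
  unsigned char a, b, c;
};

static_assert(sizeof(Triple) == 3, "expected a packed 3-byte struct");

// Hypothetical data the function copies from.
static const Triple table[] = {{1, 2, 3}, {4, 5, 6}};

// Returning the struct by value asks the compiler to copy three bytes from
// memory into the return value, which is the copy the patch is about.
Triple fun3(int i) {
  return table[i];
}
```

Whether the copy is assembled with bit-field inserts or through a wider load depends on the target and compiler version, so this only reproduces the shape of the input, not a particular code sequence.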


Re: [PATCH] Fix use-after-free in the strlen pass (PR tree-optimization/82977)

2017-11-15 Thread Richard Biener
On Tue, 14 Nov 2017, Jeff Law wrote:

> On 11/14/2017 02:30 PM, Jakub Jelinek wrote:
> > On Tue, Nov 14, 2017 at 02:24:28PM -0700, Martin Sebor wrote:
> >> On 11/14/2017 02:04 PM, Jakub Jelinek wrote:
> >>> Hi!
> >>>
> >>> strlen_to_stridx.get (rhs1) returns an address into the hash_map, and
> >>> strlen_to_stridx.put (lhs, *ps); (in order to be efficient) doesn't make a
> >>> copy of the argument just in case, first inserts the slot into it which
> >>> may cause reallocation, and only afterwards runs the copy ctor to assign
> >>> the value into the new slot.  So, passing it a reference to something
> >>> in the hash_map is wrong.  Fixed thusly, bootstrapped/regtested on
> >>> x86_64-linux and i686-linux, ok for trunk?
> >>
> >> This seems like an unnecessary gotcha that should be possible to
> >> avoid in the hash_map.  The corresponding standard containers
> >> require it to work and so it's surprising when it doesn't in GCC.
> >>
> >> I've been looking at how this is implemented and it seems to me
> >> that a fix should be doable by having the hash_map check to see if
> >> the underlying table needs to expand and if so, create a temporary
> >> copy of the element before reallocating it.
> > 
> > That would IMHO just slow down and enlarge the hash_map for all users,
> > even when most of them don't really need it.
> > While it is reasonable for STL containers to make sure it works, we
> > aren't using STL containers and can pose additional restrictions.
> But when we make our containers behave differently than the STL it makes
> it much easier for someone to make a mistake such as this one.
> 
> IMHO this kind of difference in behavior is silly and long term just
> makes our jobs harder.
> 
> I'd vote for fixing our containers.

I'd argue that this is simply a programming error and I doubt the
libstdc++ variant works by design/specification.

So let's go with Jakub's patch.  We can do the extra checking as followup.

Jakub, your patch is ok.

Thanks,
Richard.
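
The hazard discussed in this subthread, and one way a caller can avoid it, can be sketched with a toy container. `ToyMap` and `copy_entry` are hypothetical; the only property borrowed from the thread is that `put` may reallocate its storage before it copies the value in. Whether the committed fix takes exactly this form is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

struct Entry {
  int key;
  int value;
};

// Toy append-only map that grows its storage before copying the new value.
class ToyMap {
public:
  // Returns a pointer into the map's storage, or nullptr if KEY is absent.
  const int *get(int key) const {
    for (std::size_t i = 0; i < size_; ++i)
      if (data_[i].key == key)
        return &data_[i].value;
    return nullptr;
  }

  // VALUE must not refer to storage owned by this map: grow() releases the
  // old array before VALUE is read.
  void put(int key, const int &value) {
    if (size_ == cap_)
      grow();
    data_[size_++] = Entry{key, value};
  }

private:
  void grow() {
    std::size_t ncap = cap_ ? cap_ * 2 : 1;
    std::unique_ptr<Entry[]> bigger(new Entry[ncap]);
    for (std::size_t i = 0; i < size_; ++i)
      bigger[i] = data_[i];
    data_ = std::move(bigger);  // frees the old array
    cap_ = ncap;
  }

  std::unique_ptr<Entry[]> data_;
  std::size_t size_ = 0;
  std::size_t cap_ = 0;
};

// Safe caller: snapshot the value found in the map before calling put.
bool copy_entry(ToyMap &map, int from, int to) {
  const int *ps = map.get(from);
  if (!ps)
    return false;
  int tmp = *ps;  // PS may dangle once put reallocates
  map.put(to, tmp);
  return true;
}
```

Calling `map.put (to, *ps)` directly would hand `put` a reference into the array that `grow` is about to free, which is the pattern reported in the PR.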


Re: [PATCH] Small expand_mul_overflow improvement (PR target/82981)

2017-11-15 Thread Richard Biener
On Tue, 14 Nov 2017, Jakub Jelinek wrote:

> Hi!
> 
> For targets that don't have {,u}mulv4 insn we try 3 different
> expansions of the basic signed * signed -> signed or unsigned * unsigned ->
> unsigned overflow computation.  The first one is done if 
>   if (GET_MODE_2XWIDER_MODE (mode).exists (&wmode)
>       && targetm.scalar_mode_supported_p (wmode))
> and we emit a WIDEN_MULT_EXPR followed by extraction of the hipart
> from it (for testing overflow if both unsigned and signed) and
> lowpart (result of the multiplication and for signed overflow testing
> where we use MSB of it).  This case is meant for use by smaller modes,
> e.g. subword, where it is generally pretty efficient.  Unfortunately
> on some targets, e.g. mips 64-bit, where there is no widening mult
> optab it can be expanded as a libcall on the full wmode operands,
> which is slow and causes problems e.g. to some freestanding environments
> like Linux kernel that don't bother to link in libgcc.a or replacement
> thereof.  Then there is another case, usually pretty large, with usually
> two but sometimes one multiplication, and various conditionals, shifts, etc.
> meant primarily for the widest supported mode.  And the last fallback
> is just doing multiplication and never computing overflow, hopefully it
> is never used at least on sane targets.
> 
> This patch attempts to check if we'd emit WIDEN_MULT_EXPR as a libcall
> and in that case tries to use the other possibilities, and only falls
> back to the WIDEN_MULT_EXPR with a libcall if we'd otherwise use the
> last fallback without overflow computation.
> 
> In addition to it, it adds support for targets that have supported
> MULT_HIGHPART_EXPR for that mode, by doing pretty much what the
> WIDEN_MULT_EXPR case does, but instead of doing one multiplication
> to compute both lowpart and highpart and then shifts to split those
> we use one multiplication to compute the lowpart and one MULT_HIGHPART_EXPR
> to compute the highpart.  In theory this method doesn't have to be always
> faster than the one with hmode, because the MULT_HIGHPART_EXPR case does
> 2 multiplications plus one comparison, while the hmode code does sometimes
> just one, but it is significantly shorter, fewer conditionals/branches
> so I think it should be generally a win (if it turns out not to be the
> case on some target, we could restrict it to -Os only or whatever).
> 
> And lastly, the MULT_HIGHPART_EXPR case can actually be the optimal code
> if we are only checking for the overflow and don't actually need the
> multiplication value, it is unsigned multiply and we don't need any
> res using code afterwards; in that case the low part multiply can be DCEd
> and only the highpart multiply + comparison will remain.  So, the patch
> adds check for single IMAGPART_EXPR use and other conditions and uses
> the MULT_HIGHPART_EXPR code in preference of the WIDEN_MULT_EXPR in that
> case.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, tested on the
> testcase with cross to mips, ok for trunk?

Ok, but can you add the testcase for the kernel issue?

Thanks,
Richard.
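
The two unsigned strategies described in the quoted message can be sketched in portable C++ for a 32-bit operand mode. The function names are hypothetical, and `umul_highpart` merely emulates what a target's multiply-high instruction would compute; this is not the RTL expansion from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Widening approach: one 32x32->64 multiply, then split the result.
// The high half is non-zero exactly when the product overflows 32 bits.
bool umul_overflow_widen(std::uint32_t a, std::uint32_t b, std::uint32_t *res) {
  std::uint64_t wide = static_cast<std::uint64_t>(a) * b;
  *res = static_cast<std::uint32_t>(wide);
  return (wide >> 32) != 0;
}

// Emulation of a "multiply high part" operation.
std::uint32_t umul_highpart(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

// Highpart approach: a plain low multiply plus a separate high-part
// multiply.  If the caller never uses *res, the low multiply is dead and
// only the high-part multiply and the comparison are needed.
bool umul_overflow_highpart(std::uint32_t a, std::uint32_t b,
                            std::uint32_t *res) {
  *res = a * b;  // wraps modulo 2^32
  return umul_highpart(a, b) != 0;
}
```

Both functions agree on every input; they differ only in how many multiplies are needed and in whether the low result can be dropped.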

> 2017-11-14  Jakub Jelinek  
> 
>   PR target/82981
>   * internal-fn.c: Include gimple-ssa.h, tree-phinodes.h and
>   ssa-iterators.h.
>   (can_widen_mult_without_libcall): New function.
>   (expand_mul_overflow): If only checking unsigned mul overflow,
>   not result, and can do efficiently MULT_HIGHPART_EXPR, emit that.
>   Don't use WIDEN_MULT_EXPR if it would involve a libcall, unless
>   no other way works.  Add MULT_HIGHPART_EXPR + MULT_EXPR support.
>   (expand_DIVMOD): Formatting fix.
>   * expmed.h (expand_mult): Add NO_LIBCALL argument.
>   * expmed.c (expand_mult): Likewise.  Use OPTAB_WIDEN rather
>   than OPTAB_LIB_WIDEN if NO_LIBCALL is true, and allow it to fail.
> 
> --- gcc/internal-fn.c.jj  2017-10-23 10:13:08.0 +0200
> +++ gcc/internal-fn.c 2017-11-14 16:48:25.414403348 +0100
> @@ -46,6 +46,9 @@ along with GCC; see the file COPYING3.
>  #include "recog.h"
>  #include "builtins.h"
>  #include "optabs-tree.h"
> +#include "gimple-ssa.h"
> +#include "tree-phinodes.h"
> +#include "ssa-iterators.h"
>  
>  /* The names of each internal function, indexed by function number.  */
>  const char *const internal_fn_name_array[] = {
> @@ -1172,6 +1175,35 @@ expand_neg_overflow (location_t loc, tre
>  }
>  }
>  
> +/* Return true if UNS WIDEN_MULT_EXPR with result mode WMODE and operand
> +   mode MODE can be expanded without using a libcall.  */
> +
> +static bool
> +can_widen_mult_without_libcall (scalar_int_mode wmode, scalar_int_mode mode,
> +                                rtx op0, rtx op1, bool uns)
> +{
> +  if (find_widening_optab_handler (umul_widen_optab, wmode, mode)
> +      != CODE_FOR_nothing)
> +    return true;
> +
> +  if (find_widening_optab_handler (smul_widen_optab, wmode, mode)
> +      != CODE_FOR_nothing)
> +    return true;
> +
> +  rtx_insn *last = 

Re: [PATCH][AArch64] Improve scheduling model for X-Gene

2017-11-15 Thread Dominik Inführ
Could you please commit it for me? I don’t have commit rights.

Thanks,

Dominik


Re: [PATCH] Zero vptr in dtor for -fsanitize=vptr.

2017-11-15 Thread Martin Liška
Thanks for review. I actually noticed your introduction of
cp_build_fold_indirect_ref after I installed my patch.

I'm testing following fix.

Martin
From 63d9cff5c183f3614cff527ff991e1586a9efa5b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 15 Nov 2017 10:01:51 +0100
Subject: [PATCH] Fix fallout of -fsanitize=vptr.

gcc/cp/ChangeLog:

2017-11-15  Martin Liska  

	* decl.c (begin_destructor_body): Use cp_build_fold_indirect_ref
	instead of cp_build_indirect_ref.
---
 gcc/cp/decl.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 041893db937..7e16f7b415b 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -15253,8 +15253,7 @@ begin_destructor_body (void)
 	  {
 	tree binfo = TYPE_BINFO (current_class_type);
 	tree ref
-	  = cp_build_indirect_ref (current_class_ptr, RO_NULL,
-   tf_warning_or_error);
+	  = cp_build_fold_indirect_ref (current_class_ptr);
 
 	tree vtbl_ptr = build_vfield_ref (ref, TREE_TYPE (binfo));
 	tree vtbl = build_zero_cst (TREE_TYPE (vtbl_ptr));
-- 
2.14.3



Re: [PATCH] [PR82155] Fix crash in dwarf2out_abstract_function

2017-11-15 Thread Pierre-Marie de Rodat

Hello Richard,

On 09/25/2017 01:54 PM, Richard Biener wrote:
> Ok for trunk and gcc-7 branch after a while.

Is it still okay to commit to gcc-7, now?

--
Pierre-Marie de Rodat


Re: [PATCH][AArch64] Improve scheduling model for X-Gene

2017-11-15 Thread Kyrill Tkachov


On 15/11/17 08:49, Dominik Inführ wrote:
> Could you please commit it for me? I don’t have commit rights.

Ah, of course.
Committed with r254759.

Thanks,
Kyrill


Thanks,

Dominik


On 13 Nov 2017, at 12:27, Kyrill Tkachov  wrote:


On 13/11/17 11:09, Dominik Inführ wrote:

Oh sure, I've now successfully bootstrapped on arm-linux-gnueabihf and 
aarch64-unknown-linux-gnu.

Dominik


Thanks Dominik,

This is ok for trunk.

Kyrill


On 10 Nov 2017, at 10:53, Kyrill Tkachov  wrote:

Hi Dominic,

On 10/11/17 09:36, Dominik Inführ wrote:

Hi,

this patch tries to refine the instruction scheduling model for X-Gene. 
Improved performance for 456.hmmer and 464.h264ref (about 1%). Also splits the 
model into multiple automatons, therefore smaller binary and faster build time. 
Survives bootstrap.

Best,
Dominik

The changes look ok to me, but as the description is shared between the arm and 
aarch64 ports can you please also do a sanity check
by building (and preferably bootstrapping) an arm compiler?

Thanks,
Kyrill


gcc/ChangeLog:
2017-10-09  Dominik Infuehr 

* config/arm/xgene1.md (xgene1): Split into automatons
xgene1_main, xgene1_decoder, xgene1_div, xgene1_simd.
(xgene1_f_load): Adjust reservations and/or types.
(xgene1_f_store): Likewise.
(xgene1_load_pair): Likewise.
(xgene1_store_pair): Likewise.
(xgene1_fp_load1): Likewise.
(xgene1_load1): Likewise.
(xgene1_store1): Likewise.
(xgene1_move): Likewise.
(xgene1_alu): Likewise.
(xgene1_simd): Likewise.
(xgene1_bfm): Likewise.
(xgene1_neon_load1): Likewise.
(xgene1_neon_store1): Likewise.
(xgene1_neon_logic): Likewise.
(xgene1_neon_st1): Likewise.
(xgene1_neon_ld1r): Likewise.
(xgene1_alu_cond): Added.
(xgene1_shift_reg): Likewise.
(xgene1_bfx): Likewise.
(xgene1_mul): Split into xgene1_mul32, xgene1_mul64.

—
diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
index c4b3773..cf0694a 100644
--- a/gcc/config/arm/xgene1.md
+++ b/gcc/config/arm/xgene1.md
@@ -20,17 +20,26 @@

  ;; Pipeline description for the xgene1 micro-architecture

-(define_automaton "xgene1")
+(define_automaton "xgene1_main, xgene1_decoder, xgene1_div, xgene1_simd")

-(define_cpu_unit "xgene1_decode_out0" "xgene1")
-(define_cpu_unit "xgene1_decode_out1" "xgene1")
-(define_cpu_unit "xgene1_decode_out2" "xgene1")
-(define_cpu_unit "xgene1_decode_out3" "xgene1")
+(define_cpu_unit "xgene1_decode_out0" "xgene1_decoder")
+(define_cpu_unit "xgene1_decode_out1" "xgene1_decoder")
+(define_cpu_unit "xgene1_decode_out2" "xgene1_decoder")
+(define_cpu_unit "xgene1_decode_out3" "xgene1_decoder")

-(define_cpu_unit "xgene1_divide" "xgene1")
-(define_cpu_unit "xgene1_fp_divide" "xgene1")
-(define_cpu_unit "xgene1_fsu" "xgene1")
-(define_cpu_unit "xgene1_fcmp" "xgene1")
+(define_cpu_unit "xgene1_IXA" "xgene1_main")
+(define_cpu_unit "xgene1_IXB" "xgene1_main")
+(define_cpu_unit "xgene1_IXB_compl" "xgene1_main")
+
+(define_reservation "xgene1_IXn" "(xgene1_IXA | xgene1_IXB)")
+
+(define_cpu_unit "xgene1_multiply" "xgene1_main")
+(define_cpu_unit "xgene1_divide" "xgene1_div")
+(define_cpu_unit "xgene1_fp_divide" "xgene1_div")
+(define_cpu_unit "xgene1_fsu" "xgene1_simd")
+(define_cpu_unit "xgene1_fcmp" "xgene1_simd")
+(define_cpu_unit "xgene1_ld" "xgene1_main")
+(define_cpu_unit "xgene1_st" "xgene1_main")

  (define_reservation "xgene1_decode1op"
  "( xgene1_decode_out0 )
@@ -68,12 +77,12 @@
  (define_insn_reservation "xgene1_f_load" 10
(and (eq_attr "tune" "xgene1")
 (eq_attr "type" "f_loadd,f_loads"))
-  "xgene1_decode2op")
+  "xgene1_decode2op, xgene1_ld")

  (define_insn_reservation "xgene1_f_store" 4
(and (eq_attr "tune" "xgene1")
 (eq_attr "type" "f_stored,f_stores"))
-  "xgene1_decode2op")
+  "xgene1_decode2op, xgene1_st")

  (define_insn_reservation "xgene1_fmov" 2
(and (eq_attr "tune" "xgene1")
@@ -92,85 +101,108 @@

  (define_insn_reservation "xgene1_load_pair" 6
(and (eq_attr "tune" "xgene1")
-   (eq_attr "type" "load_8, load_16"))
-  "xgene1_decodeIsolated")
+   (eq_attr "type" "load_16"))
+  "xgene1_decodeIsolated, xgene1_ld*2")

  (define_insn_reservation "xgene1_store_pair" 2
(and (eq_attr "tune" "xgene1")
-   (eq_attr "type" "store_8, store_16"))
-  "xgene1_decodeIsolated")
+   (eq_attr "type" "store_16"))
+  "xgene1_decodeIsolated, xgene1_st*2")

  (define_insn_reservation "xgene1_fp_load1" 10
(and (eq_attr "tune" "xgene1")
-   (eq_attr "type" "load_4")
+   (eq_attr "type" "load_4, load_8")
 (eq_attr "fp" "yes"))
-  "xgene1_decode1op")
+  "xgene1_decode1op, xgene1_ld")

  (define_insn_reservation "xgene1_load1" 5
(and (eq_attr "tune" "xgene1")
-   (eq_attr "type" "load_4"))
-  "xgene1_decode1op")
+   (eq_attr "type" "load_4, load_8"))
+  "xgene1_decode1op, xgene1_ld")

-(define_insn_reservation "xgene1_store1" 2
+(

Re: [build, libgcc, libgo] Adapt Solaris 12 references

2017-11-15 Thread Rainer Orth
Hi Ian,

> On Tue, Nov 14, 2017 at 2:09 AM, Rainer Orth
>  wrote:
>>
>>> With the change in the Solaris release model (no more major releases
>>> like Solaris 12 but only minor ones like 11.4), the Solaris 12
>>> references in GCC need to be adapted.
>>>
>>> The following patch does this, consisting mostly of comment changes.
>>>
>>> Only a few changes bear comment:
>>>
>>> * Solaris 11.4 introduced __cxa_atexit, so we have to enable it on
>>>   *-*-solaris2.11.  Not a problem for native builds which check for the
>>>   actual availability of the function.
>>>
>>> * gcc.dg/torture/pr60092.c was xfailed on *-*-solaris2.11*, but the
>>>   underlying bug was fixed in Solaris 12/11.4.  However, now 11.3 and
>>>   11.4 have the same configure triplet.  To avoid noise on the newest
>>>   release, I've removed the xfail.
>>>
>>> I've left a few references to Solaris 12 builds in
>>> libstdc++-v3/acinclude.m4 because those hadn't been renamed
>>> retroactively, of course.
>>>
>>> install.texi needs some work, too, but I'll address this separately
>>> because there's more than just the version change.
>>>
>>> Bootstrapped without regressions on {i386-pc, sparc-sun}-solaris2.1[01]
>>> (both Solaris 11.3 and 11.4).  I believe I need approval only for the
>>> libgo parts.
>>
>> how should we proceed with the libgo part of this patch?  I can checkin
>> the rest (which contains functional changes) now and omit the libgo
>> part (either for now or completely, given that it consists only of
>> comment changes).
>
> Sorry, I've fallen behind a bit on gccgo/libgo patch review.  I've now
> committed your patch to libgo trunk.

thanks.  I've now committed the rest of the patch to mainline.

>>> I'm going to backport the patch to the gcc-7 and gcc-6 branches after a
>>> bit of soak time.
>>
>> What's the procedure for libgo here?  IIUC, only the trunk version of
>> libgo is imported from upstream, while changes to branches can go in
>> directly.
>
> That is correct.  Backporting gcc/go/gofrontend and libgo patches to
> release branches is always fine with me if the release managers are OK
> with it.

Good: I'll install the backports in a week or two unless the branches
are closed by then.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-15 Thread Eric Botcazou
> But we don't.  Wonder if in addition to your patch or instead of it it
> wouldn't be safer (especially for FEs added in the future) to:
> 
>    /* If we see "return;" in some basic block, then we do reach the end
>       without returning a value.  */
> -  else if (warn_return_type
> +  else if (warn_return_type > 0
>             && !TREE_NO_WARNING (fun->decl)
>             && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (fun)->preds) > 0
>             && !VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fun->decl))))
> 
> in tree-cfg.c.  That change is preapproved if it works, and your
> patch if you want in addition to that is ok too.

That's the first thing I tried and it indeed works.

-- 
Eric Botcazou


Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-15 Thread Martin Liška
On 11/15/2017 10:42 AM, Eric Botcazou wrote:
>> But we don't.  Wonder if in addition to your patch or instead of it it
>> wouldn't be safer (especially for FEs added in the future) to:
>>
>>    /* If we see "return;" in some basic block, then we do reach the end
>>       without returning a value.  */
>> -  else if (warn_return_type
>> +  else if (warn_return_type > 0
>>             && !TREE_NO_WARNING (fun->decl)
>>             && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (fun)->preds) > 0
>>             && !VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fun->decl))))
>>
>> in tree-cfg.c.  That change is preapproved if it works, and your
>> patch if you want in addition to that is ok too.
> 
> That's the first thing I tried and it indeed works.
> 

Hi.

Following patch survives regression tests and bootstraps.
There are multiple places where warn_return_type should be compared
to zero.

Ready for trunk?
Thanks,
Martin
From 5f8daccc584c7ae749d25d59526e0173aa4334f7 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 15 Nov 2017 09:16:23 +0100
Subject: [PATCH] Disable -Wreturn-type by default in all languages other from
 C++.

gcc/ChangeLog:

2017-11-15  Martin Liska  

	* tree-cfg.c (pass_warn_function_return::execute):
	Compare warn_return_type for greater than zero.

gcc/ada/ChangeLog:

2017-11-15  Martin Liska  

	* gcc-interface/misc.c (gnat_post_options):
	Do not set default value of warn_return_type.

gcc/c/ChangeLog:

2017-11-15  Martin Liska  

	* c-decl.c (grokdeclarator):
	Compare warn_return_type for greater than zero.
	(start_function): Likewise.
	(finish_function): Likewise.
	* c-typeck.c (c_finish_return): Likewise.

gcc/cp/ChangeLog:

2017-11-15  Martin Liska  

	* decl.c (finish_function):
	Compare warn_return_type for greater than zero.
	* semantics.c (finish_return_stmt): Likewise.

gcc/fortran/ChangeLog:

2017-11-15  Martin Liska  

	* options.c (gfc_post_options):
	Do not set default value of warn_return_type.
	* trans-decl.c (gfc_trans_deferred_vars):
	Compare warn_return_type for greater than zero.
	(generate_local_decl): Likewise
	(gfc_generate_function_code): Likewise.
---
 gcc/ada/gcc-interface/misc.c | 3 ---
 gcc/c/c-decl.c   | 6 +++---
 gcc/c/c-typeck.c | 2 +-
 gcc/cp/decl.c| 2 +-
 gcc/cp/semantics.c   | 2 +-
 gcc/fortran/options.c| 3 ---
 gcc/fortran/trans-decl.c | 8 
 gcc/tree-cfg.c   | 2 +-
 8 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/gcc/ada/gcc-interface/misc.c b/gcc/ada/gcc-interface/misc.c
index 9a4a48fba42..7bdb3803c13 100644
--- a/gcc/ada/gcc-interface/misc.c
+++ b/gcc/ada/gcc-interface/misc.c
@@ -262,9 +262,6 @@ gnat_post_options (const char **pfilename ATTRIBUTE_UNUSED)
   /* No psABI change warnings for Ada.  */
   warn_psabi = 0;
 
-  /* No return type warnings for Ada.  */
-  warn_return_type = 0;
-
   /* No caret by default for Ada.  */
   if (!global_options_set.x_flag_diagnostics_show_caret)
 global_dc->show_caret = false;
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index d95a2b6ea4f..7120420f2df 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5689,7 +5689,7 @@ grokdeclarator (const struct c_declarator *declarator,
   /* Issue a warning if this is an ISO C 99 program or if
 	 -Wreturn-type and this is a function, or if -Wimplicit;
 	 prefer the former warning since it is more explicit.  */
-  if ((warn_implicit_int || warn_return_type || flag_isoc99)
+  if ((warn_implicit_int || warn_return_type > 0 || flag_isoc99)
 	  && funcdef_flag)
 	warn_about_return_type = 1;
   else
@@ -8655,7 +8655,7 @@ start_function (struct c_declspecs *declspecs, struct c_declarator *declarator,
 
   if (warn_about_return_type)
 warn_defaults_to (loc, flag_isoc99 ? OPT_Wimplicit_int
-			   : (warn_return_type ? OPT_Wreturn_type
+			   : (warn_return_type > 0 ? OPT_Wreturn_type
 			  : OPT_Wimplicit_int),
 		  "return type defaults to %<int%>");
 
@@ -9373,7 +9373,7 @@ finish_function (void)
   finish_fname_decls ();
 
   /* Complain if there's just no return statement.  */
-  if (warn_return_type
+  if (warn_return_type > 0
   && TREE_CODE (TREE_TYPE (TREE_TYPE (fndecl))) != VOID_TYPE
   && !current_function_returns_value && !current_function_returns_null
   /* Don't complain if we are no-return.  */
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 4bdc48a9ea3..492a245d296 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -10091,7 +10091,7 @@ c_finish_return (location_t loc, tree retval, tree origtype)
   if (!retval)
 {
   current_function_returns_null = 1;
-  if ((warn_return_type || flag_isoc99)
+  if ((warn_return_type > 0 || flag_isoc99)
 	  && valtype != NULL_TREE && TREE_CODE (valtype) != VOID_TYPE)
 	{
 	  bool warned_here;
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 041893db937..96bbff6c1f9 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -15583,7 +15583,7 @@ finish_function (bool inline_p)
 save_function_data (fndecl)

Re: [AARCH64] implements neon vld1_*_x2 intrinsics

2017-11-15 Thread Kyrill Tkachov

Hi Kugan,

On 07/11/17 04:10, Kugan Vivekanandarajah wrote:

Hi,

Attached patch implements the  vld1_*_x2 intrinsics as defined by the
neon document.

Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is
this OK for trunk if no regressions?



This looks mostly ok to me (though I cannot approve) modulo a couple of 
minor type issues below.


Thanks,
Kyrill


Thanks,
Kugan

gcc/ChangeLog:

2017-11-06  Kugan Vivekanandarajah 

* config/aarch64/aarch64-simd.md (aarch64_ld1x2): New.
(aarch64_ld1x2): Likewise.
(aarch64_simd_ld1_x2): Likewise.
(aarch64_simd_ld1_x2): Likewise.
* config/aarch64/arm_neon.h (vld1_u8_x2): New.
(vld1_s8_x2): Likewise.
(vld1_u16_x2): Likewise.
(vld1_s16_x2): Likewise.
(vld1_u32_x2): Likewise.
(vld1_s32_x2): Likewise.
(vld1_u64_x2): Likewise.
(vld1_s64_x2): Likewise.
(vld1_f16_x2): Likewise.
(vld1_f32_x2): Likewise.
(vld1_f64_x2): Likewise.
(vld1_p8_x2): Likewise.
(vld1_p16_x2): Likewise.
(vld1_p64_x2): Likewise.
(vld1q_u8_x2): Likewise.
(vld1q_s8_x2): Likewise.
(vld1q_u16_x2): Likewise.
(vld1q_s16_x2): Likewise.
(vld1q_u32_x2): Likewise.
(vld1q_s32_x2): Likewise.
(vld1q_u64_x2): Likewise.
(vld1q_s64_x2): Likewise.
(vld1q_f16_x2): Likewise.
(vld1q_f32_x2): Likewise.
(vld1q_f64_x2): Likewise.
(vld1q_p8_x2): Likewise.
(vld1q_p16_x2): Likewise.
(vld1q_p64_x2): Likewise.

gcc/testsuite/ChangeLog:

2017-11-06  Kugan Vivekanandarajah 

* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: New test.


+__extension__ extern __inline int8x8x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vld1_s8_x2 (const uint8_t *__a)

This should be "const int8_t *"

 +{
+  int8x8x2_t ret;
+  __builtin_aarch64_simd_oi __o;
+  __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) __a);
+  ret.val[0] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0);
+  ret.val[1] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1);
+  return ret;
+}

...

+__extension__ extern __inline int32x2x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vld1_s32_x2 (const uint32_t *__a)

Likewise, this should be "const int32_t *"

+{
+  int32x2x2_t ret;
+  __builtin_aarch64_simd_oi __o;
+  __o = __builtin_aarch64_ld1x2v2si ((const __builtin_aarch64_simd_si *) __a);
+  ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0);
+  ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1);
+  return ret;
+}
+




Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-15 Thread Eric Botcazou
> Following patch survives regression tests and bootstraps.

Please drop the Ada bits though, -Wreturn-type just doesn't work in Ada.

-- 
Eric Botcazou


Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-15 Thread Jakub Jelinek
On Wed, Nov 15, 2017 at 10:54:23AM +0100, Martin Liška wrote:
> gcc/c/ChangeLog:
> 
> 2017-11-15  Martin Liska  
> 
>   * c-decl.c (grokdeclarator):
>   Compare warn_return_type for greater than zero.
>   (start_function): Likewise.
>   (finish_function): Likewise.
>   * c-typeck.c (c_finish_return): Likewise.
> 
> gcc/cp/ChangeLog:
> 
> 2017-11-15  Martin Liska  
> 
>   * decl.c (finish_function):
>   Compare warn_return_type for greater than zero.
>   * semantics.c (finish_return_stmt): Likewise.

The c/cp changes aren't really needed, are they?  Because
in that case you guarantee in the post options handling it is
0 or 1.
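
For readers following the thread, here is a minimal sketch of the tri-state idea being discussed. The variable and function names below are hypothetical stand-ins, not GCC's real option machinery; the only assumption taken from the thread is that the option starts out as -1 and that a C-family front end normalizes it to 0 or 1 after option processing.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the option variable and for the
   "was it given on the command line" flag; not GCC's definitions.  */
static int warn_return_type = -1;	/* -1 means "not decided yet".  */
static bool warn_return_type_set = false;

/* Post-options handling as a C-family front end might do it: choose
   the language default only when the user gave no explicit option.  */
static void
post_options (bool is_cxx)
{
  if (!warn_return_type_set)
    warn_return_type = is_cxx ? 1 : 0;
}

/* A language-independent check that stays safe even if some front end
   never normalized the -1 default.  */
static bool
should_warn (void)
{
  return warn_return_type > 0;
}
```

Under these assumptions a plain truth test would treat the untouched -1 as "enabled", which is why the shared code compares against zero while the front ends that normalize the value do not strictly need to.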

The rest looks good (except for Ada that Eric doesn't want to change).

Jakub


Re: [PATCH][GCC][ARM] Implement "arch" GCC pragma and "+" attributes [Patch (2/3)]

2017-11-15 Thread Kyrill Tkachov

Hi Tamar,

On 10/11/17 10:56, Tamar Christina wrote:

Hi Sandra,

I've respun the patch with the docs changes you requested.

Regards,
Tamar

> -Original Message-
> From: Sandra Loosemore [mailto:san...@codesourcery.com]
> Sent: 07 November 2017 03:38
> To: Tamar Christina; gcc-patches@gcc.gnu.org
> Cc: nd; Ramana Radhakrishnan; Richard Earnshaw; ni...@redhat.com; Kyrylo
> Tkachov
> Subject: Re: [PATCH][GCC][ARM] Implement "arch" GCC pragma and
> "+" attributes [Patch (2/3)]
>
> On 11/06/2017 09:50 AM, Tamar Christina wrote:
> > Hi All,
> >
> > This patch adds support for setting the architecture and
> > extensions using the target GCC pragma.
> >
> > #pragma GCC target ("arch=armv8-a+crc")
> >
> > It also supports a short hand where an extension is just added to the
> > current architecture without changing it
> >
> > #pragma GCC target ("+crc")
> >
> > Popping and pushing options also correctly reconfigure the global
> > state as expected.
> >
> > Also supported is using the __attribute__((target("..."))) attributes
> > on functions to change the architecture or extension.
> >
> > Regtested on arm-none-eabi and no regressions.


This will need a bootstrap and test run on arm-none-linux-gnueabihf
(like all arm changes).
Your changelog at
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00387.html mentions
some arm-c.c changes but I don't see any included in this patch?

The other changes look good and in line with what I would expect, but
can you please post the arm-c.c changes if there are any?

Thanks,
Kyrill


> >
> > Ok for trunk?
> >
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index
> >
> 8aa443f87fb700f7a723d736bdbd53b6c839656d..18d0ffa6820326ce7badf33001
> b1
> > c6a467c95883 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -3858,6 +3858,42 @@ Specifies the fpu for which to tune the
> performance of this function.
> >  The behavior and permissible arguments are the same as for the
> > @option{-mfpu=}  command-line option.
> >
> > +@item arch=
> > +@cindex @code{arch=} function attribute, ARM Specifies the
> > +architecture version and architectural extensions to use for this
> > +function.  The behavior and permissible arguments are the same as for
> > +the @option{-march=} command-line option.
> > +
> > +The above target attributes can be specified as follows:
> > +
> > +@smallexample
> > +__attribute__((target("@var{attr-string}")))
> > +int
> > +f (int a)
> > +@{
> > +  return a + 5;
> > +@}
> > +@end smallexample
> > +
> > +where @code{@var{attr-string}} is one of the attribute strings.
>
> This example doesn't illustrate anything useful, and in fact just confuses
> things by introducing @var{attr-string}.  Please use an actual valid attribute
> here, something like "arch=armv8-a" or whatever.
>
> Also, either kill the sentence fragment after the example, or be careful to
> add @noindent before it to indicate it's a continuation of the previous
> paragraph.
>
> > +
> > +Additionally, the architectural extension string may be specified on
> > +its own.  This can be used to turn on and off particular
> > +architectural extensions without having to specify a particular
> > +architecture version or core.  Example:
> > +
> > +@smallexample
> > +__attribute__((target("+crc+nocrypto")))
> > +int
> > +foo (int a)
> > +@{
> > +  return a + 5;
> > +@}
> > +@end smallexample
> > +
> > +In this example @code{target("+crc+nocrypto")} enables the @code{crc}
> > +extension and disables the @code{crypto} extension for the function
> > +@code{foo} without modifying an existing @option{-march=} or
> @option{-mcpu} option.
> > +
> >  @end table
> >
> >  @end table
>
> -Sandra




Re: [RFA][PATCH] patch 4/n Refactor bits of vrp_visit_assignment_or_call

2017-11-15 Thread Kyrill Tkachov

Hi Jeff,

I think you attached the wrong patch to this mail...

Kyrill

On 15/11/17 06:32, Jeff Law wrote:


So the next group of changes is focused on breaking down evrp into an
analysis engine and the actual optimization pass.  The analysis engine
can be embedded into other dom walker passes quite easily. I've done it
for the sprintf warnings as well as the main DOM optimizer locally.

Separating analysis from optimization for edge ranges and PHI ranges is
easy.  Doing so for statements isn't terribly hard either, but does
require a tiny bit of refactoring elsewhere.

Which brings us to this patch.

If we look at evrp_dom_walker::before_dom_children we'll see this in the
statement processing code:

  else if (stmt_interesting_for_vrp (stmt))
{
  edge taken_edge;
  value_range vr = VR_INITIALIZER;
  extract_range_from_stmt (stmt, &taken_edge, &output, &vr);
  if (output
  && (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE))
{
  update_value_range (output, &vr);
  vr = *get_value_range (output);

  /* Mark stmts whose output we fully propagate for removal.  */


etc.

Conceptually this fragment is part of the analysis side. But the
subsequent code (optimization side) wants to know the "output" of the
statement.  I'm not keen on calling extract_range_from_stmt on both the
analysis side and the optimization side.

So this patch factors out a bit of extract_range_from_stmt and its child
vrp_visit_assignment_or_call into a routine that will return the proper
SSA_NAME.  So the analysis side calls extract_range_from_stmt and the
optimization side calls the new "get_output_for_vrp".

And of course to avoid duplication we use get_output_for_vrp from within
vrp_visit_assignment_or_call.


Bootstrapped and regression tested on x86_64.  OK for the trunk?

Jeff




Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

2017-11-15 Thread Richard Biener
On Tue, Nov 14, 2017 at 6:45 PM, Martin Sebor  wrote:
> On 11/14/2017 05:28 AM, Richard Biener wrote:
>>
>> On Mon, Nov 13, 2017 at 6:37 PM, Martin Sebor  wrote:
>>>
>>> Richard, this thread may have been conflated with the one Re:
>>> [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets
>>> (PR 82455) They are about different things.
>>>
>>> I'm still looking for approval of:
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01208.html
>
>
> Sorry, I pointed to an outdated version.  This is the latest
> version:
>
>   https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html
>
> My bad...
>
>
>>
>> +  tree maxbound
>> + = build_int_cst (sizetype, ~(1LLU << (TYPE_PRECISION (sizetype) - 1)));
>>
>> this looks possibly bogus.  Can you instead use
>>
>>   up_bound_p1
>> = wide_int_to_tree (sizetype, wi::div_trunc (wi::max_value
>> (TYPE_PRECISION (sizetype), SIGNED), wi::to_wide (eltsize)));
>>
>> please?  Note you are _not_ computing the proper upper bound here because
>> that is what you compute plus low_bound.
>>
>> +  up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound, eltsize);
>>
>> +
>> +  tree arg = TREE_OPERAND (ref, 0);
>> +  tree_code code = TREE_CODE (arg);
>> +  if (code == COMPONENT_REF)
>> + {
>> +  HOST_WIDE_INT off;
>> +  if (tree base = get_addr_base_and_unit_offset (ref, &off))
>> +{
>> +  tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
>> +  if (TREE_CODE (size) == INTEGER_CST)
>> + up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1, size);
>>
>> I think I asked this multiple times now but given 'ref' is the
>> variable array-ref a.b.c[i] when you call
>> get_addr_base_and_unit_offset (ref, &off) you always get a NULL_TREE
>> return value.
>>
>> So I asked you to pass it 'arg' instead ... which gets you the offset of
>> a.b.c, which looks like what you intended to get anyway.
>>
>> I also wonder what you compute here - you are looking at the size of
>> 'base' but that is the size of 'a'.  You don't even use the computed
>> offset!  Which means you could have used get_base_address instead!?
>> Also the type of 'base' may be completely off given
>> MEM[&blk + 8].b.c[i] would return blk as base which might be an array
>> of chars and not in any way related to the type of the innermost
>> structure we access with COMPONENT_REFs.
>>
>> Why are you only looking at COMPONENT_REF args anyways?  You
>> don't want to handle a.b[3][i]?
>>
>> That is, I'd have expected you do
>>
>>if (get_addr_base_and_unit_offset (ref, &off))
>>  up_bound_p1 = wide_int_to_tree (sizetype, wi::sub (wi::to_wide
>> (up_bound_p1), off));

^

Richard.
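
As a side note on the maxbound arithmetic quoted above, the following is a small worked sketch in plain C. It only illustrates host integer arithmetic and is not a claim about how build_int_cst or wide_int treat the value; the helper names are made up for the illustration.

```c
#include <stdint.h>

/* Largest value of a signed integer type with PREC bits (PREC between
   1 and 64).  For PREC == 64 this equals ~(1ULL << 63).  */
static uint64_t
signed_max_for_precision (unsigned prec)
{
  return (UINT64_C (1) << (prec - 1)) - 1;
}

/* Largest element count such that count * eltsize still fits in the
   signed range for PREC bits; this mirrors the "max / eltsize" step.  */
static uint64_t
max_elements (unsigned prec, uint64_t eltsize)
{
  return signed_max_for_precision (prec) / eltsize;
}
```

The bitwise form ~(1ULL << (prec - 1)) matches the signed maximum only when prec equals the width of the host type (64 here); for prec == 32 it leaves the upper 32 bits set, whereas the subtraction form gives the signed maximum for any precision up to 64.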

>> Richard.
>>
>>> Thanks
>>> Martin
>>>
>>>
> The difficulty with a testcase like
>
> struct { struct A { int b[1]; } a[5]; } x;
>
>  x.a[i].b[j]
>
> is that b is not considered an array at struct end since one of my
> recent changes to array_at_struct_end (basically it disallows having
> a flex array as the last member of an array).
>
> It would still stand for non-array components with variable offset
> but you can't create C testcases for that.
>
> So yes, for the specific case within the array_at_struct_end_p
> condition get_addr_base_and_unit_offset is enough.  IIRC the condition
> was a bit more than just get_addr_base_and_unit_offset.  up_bound !=
> INTEGER_CST for example.  So make the above
>
> int foo (int n, int i)
> {
>  struct { struct A { int b[n]; } a[5]; } x;
>  return x.a[i].b[PTRDIFF_MAX/2];
> }
>
> with appropriately adjusted constant.  Does that give you the testcase
> you want?



 Thank you for the test case.  It is diagnosed the same way
 irrespective of which of the two functions is used so it serves
 to confirm my understanding that the only difference between
 the two functions is bits vs bytes.

 Unless you have another test case that does demonstrate that
 get_ref_base_and_extent is necessary/helpful, is the last patch
 okay to commit?

 (Again, to be clear, I'm happy to change or enhance the patch if
 I can verify that the change handles cases that the current patch
 misses.)

>
> As of "it works, catches corner-cases, ..." - yes, it does, but it
> adds code that needs to be maintained, may contain bugs, is
> executed even for valid code.



 Understood.  I don't claim the enhancement is free of any cost
 whatsoever.  But it is teeny by most standards and it doesn't
 detect just excessively large indices but also negative indices
 into last member arrays (bug 68325) and out-of-bounds indices
 (bug 82583).  The "excessively large" part does come largely
 for free with the other checks.

 Martin
>>>
>>>
>>>
>


Re: [PATCH, rs6000] (v2) GIMPLE folding for vector compares

2017-11-15 Thread Richard Biener
On Tue, Nov 14, 2017 at 11:11 PM, Will Schmidt
 wrote:
>
> Hi,
>   Add support for gimple folding of vec_cmp_{eq,ge,gt,le,ne}
> for the integer data types.
>
> As part of this change, several define_insn stanzas have been added/updated
> in vsx.md that specify the "ne: -> not: + eq: " combinations to allow for
> the generation of the desired vcmpne[bhw] instructions, where we otherwise
> would have generated a vcmpeq + vnor combination.  The defines
> also obsoleted the need for the UNSPEC versions of the same, so this ends up
> being just an update to those existing defines.
> Several entries have been added to the switch statement in
> builtin_function_type to identify the builtins having unsigned arguments.
>
> A handful of existing tests required updates to their specified optimization
> levels to continue to generate the desired code.  builtins-3-p9.c in
> particular has been updated to reflect improved code gen with the higher
> specified optimization level.
> Testcase coverage is otherwise handled by the already-in-tree
> gcc.target/powerpc/fold-vec-cmp-*.c tests.
>
> Per feedback from the prior version, v2 changes also include:
>   * Reworked the actual folding to use a VEC_COND_EXPR.  For cleanliness, I
>   moved this to a new fold_build_vec_cmp() helper function, which itself
>   is based on build_vec_cmp() as found in typeck.c.
>   * Added an additional fold_compare_helper() function to further factor out
>   the steps that are common to all of the vector compare operations.
>
> Testing is currently underway on P6 and newer. OK for trunk?

The folding part looks good to me.
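
For reference, the lane-wise effect of the VEC_COND_EXPR-based fold in the quoted patch can be sketched in scalar C. This is only an illustration of the semantics (each lane becomes all-ones when the comparison holds, zero otherwise); the enum and function names are hypothetical and the element type is fixed to int32_t for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical comparison codes standing in for EQ_EXPR and friends.  */
enum cmp_code { CMP_EQ, CMP_NE, CMP_GT, CMP_GE, CMP_LE };

/* Evaluate one comparison on a single pair of lanes.  */
static int
lane_cmp (enum cmp_code code, int32_t a, int32_t b)
{
  switch (code)
    {
    case CMP_EQ: return a == b;
    case CMP_NE: return a != b;
    case CMP_GT: return a > b;
    case CMP_GE: return a >= b;
    case CMP_LE: return a <= b;
    }
  return 0;
}

/* Scalar model of "cmp ? minus_one_vec : zero_vec" over N lanes.  */
static void
vec_cmp_select (enum cmp_code code, const int32_t *a, const int32_t *b,
		int32_t *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = lane_cmp (code, a[i], b[i]) ? -1 : 0;
}
```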

Richard.

> Thanks,
> -Will
>
>
> 2017-11-14  Will Schmidt  
> [gcc]
> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
> folding of vector compares.
> (fold_build_vec_cmp): New helper function.
> (fold_compare_helper): New helper function.
> (builtin_function_type): Add compare builtins to the list of functions
> having unsigned arguments.
> * config/rs6000/vsx.md (vcmpneb, vcmpneh, vcmpnew): Update to specify
> the not+eq combination.
>
> [testsuite]
> * gcc.target/powerpc/builtins-3-p9.c: Add -O1, update
> expected codegen checks.
> * gcc.target/powerpc/vec-cmp-sel.c: Mark vars as volatile.
> * gcc.target/powerpc/vsu/vec-cmpne-0.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-1.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-2.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-3.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-4.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-5.c: Add -O1.
> * gcc.target/powerpc/vsu/vec-cmpne-6.c: Add -O1.
>
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 2c80a2f..0317324 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16206,10 +16206,40 @@ rs6000_builtin_valid_without_lhs (enum rs6000_builtins fn_code)
>  default:
>return false;
>  }
>  }
>
> +/*  Helper function to handle the gimple folding of a vector compare
> +operation.  This sets up true/false vectors, and uses the
> +VEC_COND_EXPR operation.
> +'code' indicates which comparison is to be made. (EQ, GT, ...).
> +'type' indicates the type of the result.  */
> +static tree
> +fold_build_vec_cmp (tree_code code, tree type,
> +   tree arg0, tree arg1)
> +{
> +  tree cmp_type = build_same_sized_truth_vector_type (type);
> +  tree zero_vec = build_zero_cst (type);
> +  tree minus_one_vec = build_minus_one_cst (type);
> +  tree cmp = fold_build2 (code, cmp_type, arg0, arg1);
> +  return fold_build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
> +}
> +
> +/* Helper function to handle the in-between steps for the
> +   vector compare built-ins.  */
> +static void
> +fold_compare_helper (gimple_stmt_iterator *gsi, tree_code code, gimple *stmt)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = gimple_call_arg (stmt, 1);
> +  tree lhs = gimple_call_lhs (stmt);
> +  gimple *g = gimple_build_assign (lhs,
> +   fold_build_vec_cmp (code, TREE_TYPE (lhs), arg0, arg1));
> +  gimple_set_location (g, gimple_location (stmt));
> +  gsi_replace (gsi, g, true);
> +}
> +
>  /* Fold a machine-dependent built-in in GIMPLE.  (For folding into
> a constant, use rs6000_fold_builtin.)  */
>
>  bool
>  rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> @@ -16701,10 +16731,67 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gimple_set_location (g, gimple_location (stmt));
> gsi_replace (gsi, g, true);
> return true;
>}
>
> +/* Vector compares; EQ, NE, GE, GT, LE.  */
> +case ALTIVEC_BUILTIN_VCMPEQUB:
> +case ALTIVEC_BUILTIN_VCMPEQUH:
> +case ALTIVEC_BUILTIN_VCMPEQUW:
> +case P8V_BUILTIN_VCMPEQUD:
> +  {
> +   fold_compare_helper (gsi, EQ_EXPR, stmt);
> +   

Re: [PATCH][GCC][ARM] Add Armv8.3-a to AArch32.

2017-11-15 Thread Kyrill Tkachov

Hi Tamar,

On 14/11/17 15:54, Tamar Christina wrote:

Hi All,

This patch adds Armv8.3-a as an architecture to the compiler with
the feature set inherited from Armv8.2-a.

Bootstrapped regtested on arm-none-linux-gnueabihf and no issues.



This is ok with a couple of ChangeLog nits.


gcc/
2017-11-14  Tamar Christina  

* config/arm/arm-cpus.in (armv8_3, ARMv8_3a, armv8.3-a): New
* config/arm/arm-tables.opt (armv8.3-a): New.


The convention is to say "Regenerated" for the whole file as it is not 
manually updated.



* doc/invoke.texi (ARM Options): Add armv8.3-a


Full stop at the end of sentence.

Thanks,
Kyrill

P.S. Can you please create an entry for this in the changes.html page [1]?
You can have a look at other similar entries for previous GCC releases 
for the format [2]


[1] https://gcc.gnu.org/about.html
[2] https://gcc.gnu.org/gcc-7/changes.html



Ok for trunk?

Thanks,
Tamar.

--




Re: [PATCH][GCC][ARM] Restrict TARGET_DOTPROD to baseline Armv8.2-a.

2017-11-15 Thread Kyrill Tkachov

Hi Tamar,

On 14/11/17 15:53, Tamar Christina wrote:

Hi All,

Dot Product is intended to only be available for Armv8.2-a and newer.
While this restriction is reflected in the intrinsics, the patterns
themselves were missing the Armv8.2-a bit.

While GCC would prevent invalid options e.g. `-march=armv8.1-a+dotprod`
we should prevent the pattern from being able to expand at all.

Regtested on arm-none-eabi and no issues.

Ok for trunk?



Ok.
Thanks,
Kyrill


Thanks,
Tamar

gcc/
2017-11-14  Tamar Christina  

* config/arm/arm.h (TARGET_DOTPROD): Add arm_arch8_2.

--




Re: [PATCH][GCC][ARM][AArch64] Testsuite framework changes and execution tests [Patch (8/8)]

2017-11-15 Thread Kyrill Tkachov

Hi Tamar,

On 06/10/17 13:45, Tamar Christina wrote:

Hi All,

this is a minor respin of the patch with the comments addressed. Note 
this patch is now 7/8 in the series.



Regtested on arm-none-eabi, armeb-none-eabi,
aarch64-none-elf and aarch64_be-none-elf with no issues found.

Ok for trunk?



This looks ok to me from an arm perspective.

Kyrill


gcc/testsuite
2017-10-06  Tamar Christina  

* lib/target-supports.exp
(check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache): New.
(check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
(add_options_for_arm_v8_2a_dotprod_neon): New.
(check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
(check_effective_target_vect_sdot_qi): New.
(check_effective_target_vect_udot_qi): New.
* gcc.target/arm/simd/vdot-exec.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
* gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.

From: Tamar Christina
Sent: Monday, September 4, 2017 2:01:40 PM
To: Christophe Lyon
Cc: gcc-patches@gcc.gnu.org; nd; James Greenhalgh; Richard Earnshaw; 
Marcus Shawcroft
Subject: RE: [PATCH][GCC][ARM][AArch64] Testsuite framework changes 
and execution tests [Patch (8/8)]


Hi Christophe,

> >
> > gcc/testsuite
> > 2017-09-01  Tamar Christina 
> >
> > * lib/target-supports.exp
> > (check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
> New.
> > (check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
> > (add_options_for_arm_v8_2a_dotprod_neon): New.
> > (check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
> > (check_effective_target_vect_sdot_qi): New.
> > (check_effective_target_vect_udot_qi): New.
> > * gcc.target/arm/simd/vdot-exec.c: New.
>
> Aren't you defining twice P() and ARR() in vdot-exec.c ?
> I'd expect a preprocessor error, did I read too quickly?
>

Yes they are defined twice but they're not redefined, all the definitions
are exactly the same so the pre-processor doesn't care. I can leave only
one if this is confusing.
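
To illustrate the point about repeated definitions: C allows a macro to be defined again as long as the parameter list and replacement list are identical. A minimal sketch follows; the macro bodies here are invented for the example and are not the ones in vdot-exec.c.

```c
/* First definition (hypothetical bodies, for illustration only).  */
#define P(n) ((n) + 1)
#define ARR(name, len) int name[len]

/* Identical redefinition: same parameters and same replacement list,
   so the preprocessor accepts it without a diagnostic.  */
#define P(n) ((n) + 1)
#define ARR(name, len) int name[len]

/* Use both macros so the definitions are exercised.  */
static int
use_macros (void)
{
  ARR (buf, 4);
  buf[0] = 41;
  return P (buf[0]);
}
```

Changing either replacement list in the second pair (for example to `((n) + 2)`) would make it an incompatible redefinition, which is the case a compiler diagnoses.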

>
> Thanks,
>
> Christophe
>
> > * gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
> > * gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.
> >
> > --




Re: Add __builtin_tgmath for better tgmath.h implementation (bug 81156)

2017-11-15 Thread Richard Biener
On Wed, Nov 15, 2017 at 2:54 AM, Joseph Myers  wrote:
> Various implementations of C99/C11 <tgmath.h> have the property that
> their macro expansions contain many copies of the macro arguments, so
> resulting in exponential blowup of the size of macro expansions where
> a call to such a macro contains other such calls in the macro
> arguments.
>
> This patch adds a (C-only) language feature __builtin_tgmath designed
> to avoid this problem by implementing the <tgmath.h> function
> selection rules directly in the compiler.  The effect is that
> type-generic macros can be defined simply as
>
> #define pow(a, b) __builtin_tgmath (powf, pow, powl, \
> cpowf, cpow, cpowl, a, b)
>
> as in the example added to the manual, with each macro argument
> expanded exactly once.  The details of __builtin_tgmath are as
> described in the manual.  This is C-only since C++ uses function
> overloading and just defines  to include  and
> .
>
> __builtin_tgmath handles C99/C11 type-generic macros, and _FloatN,
> _FloatNx and decimal floating-point types (following the proposed
> resolution to the floating-point TS DR#9 that makes the rules for
> finding a common type from arguments to a type-generic macro follow
> the usual arithmetic conversions after adjustment of integer arguments
> to _Decimal64 or double - or to _Complex double in the case of GNU
> complex integer arguments).
>
> Type-generic macros for functions from TS 18661 that round their
> results to a narrower type are handled, but there are still some
> unresolved questions regarding such macros so further changes in that
> regard may be needed in future.  The current implementation follows an
> older version of the DR#13 resolution (allowing a function for a
> wide-enough argument type to be selected if no exactly-matching
> function is available), but with appropriate calls to __builtin_tgmath
> is still fully compatible with the latest version of the resolution
> (not yet in the DR log), and allowing such not-exactly-matching
> argument types to be chosen in that case avoids needing another
> special case to treat integers as _Float64 instead of double in
> certain cases.
>
> Regarding other possible language/library features, not currently
> implemented in GCC:
>
> * Imaginary types could be naturally supported by allowing cases where
>   the type-generic type is an imaginary type T and arguments or return
>   types may be T (as at present), or the corresponding real type to T
>   (as at present), or (new) the corresponding real type if T is real
>   or imaginary but T if T is complex.  (tgmath.h would need a series
>   of functions such as
>
>   static inline _Imaginary double
>   __sin_imag (_Imaginary double __x)
>   {
> return _Imaginary_I * sinh (__imag__ __x);
>   }
>
>   to be used in __builtin_tgmath calls.)
>
> * __builtin_tgmath would use the constant rounding direction in the
>   presence of support for the FENV_ROUND / FENV_DEC_ROUND pragmas.
>   Support for those would also require a new __builtin_ to
>   cause a non-type-generic call to use the constant rounding
>   direction (it seems cleaner to add a new __builtin_ when
>   required than to make __builtin_tgmath handle a non-type-generic
>   case with only one function argument).
>
> * TS 18661-5 __STDC_TGMATH_OPERATOR_EVALUATION__ would require new
>   __builtin_ that evaluates with excess range and precision
>   like arithmetic operators do.
>
> * The proposed C bindings for IEEE 754-2018 augmented arithmetic
>   operations involve struct return types.  As currently implemented
>   __builtin_tgmath does not handle those, but support could be added.
>
> There are many error cases that the implementation diagnoses.  I've
> tried to ensure reasonable error messages for erroneous uses of
> __builtin_tgmath, but the errors for erroneous uses of the resulting
> type-generic macros (that is, when the non-function arguments have
> inappropriate types) are more important as they are more likely to be
> seen by users.
>
> GCC's own tgmath.h, as used for some targets, is updated in this
> patch.  I've tested those changes minimally, via adjusting
> gcc.dg/c99-tgmath-* locally to use that tgmath.h version.  I've also
> run the glibc testsuite (which has much more thorough tests of
> correctness of tgmath.h function selection) with a glibc patch to use
> __builtin_tgmath in glibc's tgmath.h.
>
> Bootstrapped with no regressions on x86_64-pc-linux-gnu.  Applied to
> mainline.

Thanks - I suppose we can't avoid the repeated expansion by sth like

#define exp(Val) ({ __typeof__ Val tem = Val; __TGMATH_UNARY_REAL_IMAG
(tem, exp, cexp); })

?

Richard.
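
A small sketch of the statement-expression technique referred to above, using the GNU C extensions `({ ... })` and `__typeof__`. The macro and helper names are hypothetical; this is not the tgmath.h code.

```c
/* GNU C: bind the argument to a temporary so it is evaluated once.  */
#define TWICE_ONCE(val) ({ __typeof__ (val) tem = (val); tem + tem; })

/* Plain C: the argument is evaluated twice.  */
#define TWICE_NAIVE(val) ((val) + (val))

/* Counter used to observe how often the argument gets evaluated.  */
static int calls;

static int
next_value (void)
{
  return ++calls;
}
```

With these definitions, `TWICE_ONCE (next_value ())` calls the function once while `TWICE_NAIVE (next_value ())` calls it twice. Note that the argument text still appears twice in the expansion of `TWICE_ONCE` (once inside the unevaluated `__typeof__`), so this form limits evaluation count rather than the textual size of nested expansions.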

> gcc:
> 2017-11-15  Joseph Myers  
>
> PR c/81156
> * doc/extend.texi (Other Builtins): Document __builtin_tgmath.
> * ginclude/tgmath.h (__tg_cplx, __tg_ldbl, __tg_dbl, __tg_choose)
> (__tg_choose_2, __tg_choose_3, __TGMATH_REAL_1_2)
> (__TGMATH_REAL_2_3): Remove macros.
> (__TGMATH_CPLX, __T

Re: [PATCH 02/14] Support for adding and stripping location_t wrapper nodes

2017-11-15 Thread Richard Biener
On Wed, Nov 15, 2017 at 7:17 AM, Trevor Saunders  wrote:
> On Fri, Nov 10, 2017 at 04:45:17PM -0500, David Malcolm wrote:
>> This patch provides a mechanism in tree.c for adding a wrapper node
>> for expressing a location_t, for those nodes for which
>> !CAN_HAVE_LOCATION_P, along with a new method of cp_expr.
>>
>> It's called in later patches in the kit via that new method.
>>
>> In this version of the patch, I use NON_LVALUE_EXPR for wrapping
>> constants, and VIEW_CONVERT_EXPR for other nodes.
>>
>> I also turned off wrapper nodes for EXCEPTIONAL_CLASS_P, for the sake
>> of keeping the patch kit more minimal.
>>
>> The patch also adds a STRIP_ANY_LOCATION_WRAPPER macro for stripping
>> such nodes, used later on in the patch kit.
>
> I happened to start reading this series near the end and was rather
> confused by this macro since it changes variables in a rather unhygienic
> way.  Did you consider just defining an inline function to return the
> actual decl?  It seems like it's not used that often so the slight extra
> syntax shouldn't be that big a deal compared to the explicitness.

Existing practice  (STRIP_NOPS & friends).  I'm fine either way,
the patch looks good.
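
The trade-off between the two styles can be sketched with a toy example. The struct, macro and function below are hypothetical and unrelated to GCC's tree representation; they only show that a STRIP_NOPS-style macro rewrites the variable it is given, while an inline function leaves the caller's variable alone and returns the stripped value.

```c
#include <stddef.h>

/* Toy node: a wrapper points at the node it wraps.  */
struct node
{
  int is_wrapper;
  struct node *inner;
};

/* Macro style: modifies its argument in place.  */
#define STRIP_WRAPPER(n) \
  do { while ((n)->is_wrapper) (n) = (n)->inner; } while (0)

/* Function style: the caller decides what to do with the result.  */
static inline struct node *
strip_wrapper (struct node *n)
{
  while (n->is_wrapper)
    n = n->inner;
  return n;
}
```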

Eventually you can simplify things by doing less checking in
location_wrapper_p, like only checking

+inline bool location_wrapper_p (const_tree exp)
+{
+  if (TREE_CODE (exp) == NON_LVALUE_EXPR
+      || (TREE_CODE (exp) == VIEW_CONVERT_EXPR
+	  && (TREE_TYPE (exp)
+	      == TREE_TYPE (TREE_OPERAND (exp, 0)))))
+    return true;
+  return false;
+}

and renaming to maybe_location_wrapper_p.  After all you can't really
distinguish location wrappers from non-location wrappers?  (and why
would you want to?)

Thanks,
Richard.

> Other than that the series seems reasonable, and I look forward to
> having wrappers in more places.  I seem to remember something I wanted
> to warn about they would make much easier.
>
> Thanks
>
> Trev
>


Re: [PATCH] [PR82155] Fix crash in dwarf2out_abstract_function

2017-11-15 Thread Richard Biener
On Wed, Nov 15, 2017 at 10:11 AM, Pierre-Marie de Rodat
 wrote:
> Hello Richard,
>
> On 09/25/2017 01:54 PM, Richard Biener wrote:
>>
>> Ok for trunk and gcc-7 branch after a while.
>
> Is it still okay to commit to gcc-7, now?

Yes.

Richard.

> --
> Pierre-Marie de Rodat


Re: [AARCH64] implements neon vld1_*_x2 intrinsics

2017-11-15 Thread James Greenhalgh
On Wed, Nov 15, 2017 at 09:58:28AM +, Kyrill Tkachov wrote:
> Hi Kugan,
> 
> On 07/11/17 04:10, Kugan Vivekanandarajah wrote:
> > Hi,
> >
> > Attached patch implements the  vld1_*_x2 intrinsics as defined by the
> > neon document.
> >
> > Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is
> > this OK for trunk if no regressions?
> >
> 
> This looks mostly ok to me (though I cannot approve) modulo a couple of 
> minor type issues below.

Thanks for the review Kyrill!

I'm happy to trust Kyrill's knowledge of the back-end here, so the patch
is OK with the changes Kyrill requested.

Thanks for the patch!

James

> > gcc/ChangeLog:
> >
> > 2017-11-06  Kugan Vivekanandarajah 
> >
> > * config/aarch64/aarch64-simd.md (aarch64_ld1x2): New.
> > (aarch64_ld1x2): Likewise.
> > (aarch64_simd_ld1_x2): Likewise.
> > (aarch64_simd_ld1_x2): Likewise.
> > * config/aarch64/arm_neon.h (vld1_u8_x2): New.
> > (vld1_s8_x2): Likewise.
> > (vld1_u16_x2): Likewise.
> > (vld1_s16_x2): Likewise.
> > (vld1_u32_x2): Likewise.
> > (vld1_s32_x2): Likewise.
> > (vld1_u64_x2): Likewise.
> > (vld1_s64_x2): Likewise.
> > (vld1_f16_x2): Likewise.
> > (vld1_f32_x2): Likewise.
> > (vld1_f64_x2): Likewise.
> > (vld1_p8_x2): Likewise.
> > (vld1_p16_x2): Likewise.
> > (vld1_p64_x2): Likewise.
> > (vld1q_u8_x2): Likewise.
> > (vld1q_s8_x2): Likewise.
> > (vld1q_u16_x2): Likewise.
> > (vld1q_s16_x2): Likewise.
> > (vld1q_u32_x2): Likewise.
> > (vld1q_s32_x2): Likewise.
> > (vld1q_u64_x2): Likewise.
> > (vld1q_s64_x2): Likewise.
> > (vld1q_f16_x2): Likewise.
> > (vld1q_f32_x2): Likewise.
> > (vld1q_f64_x2): Likewise.
> > (vld1q_p8_x2): Likewise.
> > (vld1q_p16_x2): Likewise.
> > (vld1q_p64_x2): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 2017-11-06  Kugan Vivekanandarajah 
> >
> > * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: New test.
> 
> +__extension__ extern __inline int8x8x2_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vld1_s8_x2 (const uint8_t *__a)
> 
> This should be "const int8_t *"
> 
> +{
> +  int8x8x2_t ret;
> +  __builtin_aarch64_simd_oi __o;
> +  __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) __a);
> +  ret.val[0] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0);
> +  ret.val[1] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1);
> +  return ret;
> +}
> 
> ...
> 
> +__extension__ extern __inline int32x2x2_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vld1_s32_x2 (const uint32_t *__a)
> 
> Likewise, this should be "const int32_t *"
> 
> +{
> +  int32x2x2_t ret;
> +  __builtin_aarch64_simd_oi __o;
> +  __o = __builtin_aarch64_ld1x2v2si ((const __builtin_aarch64_simd_si *) __a);
> +  ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0);
> +  ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1);
> +  return ret;
> +}
> +
> 
> 
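
For readers unfamiliar with the x2 loads, the sketch below models what vld1_s8_x2 is expected to return: two consecutive 8-lane vectors read through one pointer. It is a portable stand-in, not the arm_neon.h implementation; the names int8x8x2_model and vld1_s8_x2_model are made up for illustration. It also shows why the parameter wants to be "const int8_t *", as Kyrill notes: with that signature a signed buffer can be passed without a cast.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for int8x8x2_t: two vectors of eight signed bytes.
struct int8x8x2_model {
  std::int8_t val[2][8];
};

// Portable model of the load: val[0] receives bytes 0..7 and val[1]
// receives bytes 8..15.  The parameter is const int8_t *, matching the
// signed element type of the result.
inline int8x8x2_model vld1_s8_x2_model(const std::int8_t *a) {
  assert(a != nullptr);
  int8x8x2_model ret{};
  for (std::size_t v = 0; v < 2; ++v)
    for (std::size_t lane = 0; lane < 8; ++lane)
      ret.val[v][lane] = a[v * 8 + lane];
  return ret;
}
```

Calling vld1_s8_x2_model on a 16-element int8_t array yields the first eight elements in val[0] and the next eight in val[1].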


RE: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread Tamar Christina


> -Original Message-
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Wednesday, November 15, 2017 08:24
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> i...@airs.com
> Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target
> supports unaligned access [Patch (1/2)]
> 
> On Tue, 14 Nov 2017, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This patch allows larger bitsizes to be used as copy size when the
> > target does not have SLOW_UNALIGNED_ACCESS.
> >
> > fun3:
> >   adrp    x2, .LANCHOR0
> >   add     x2, x2, :lo12:.LANCHOR0
> >   mov     x0, 0
> >   sub     sp, sp, #16
> >   ldrh    w1, [x2, 16]
> >   ldrb    w2, [x2, 18]
> >   add     sp, sp, 16
> >   bfi     x0, x1, 0, 8
> >   ubfx    x1, x1, 8, 8
> >   bfi     x0, x1, 8, 8
> >   bfi     x0, x2, 16, 8
> >   ret
> >
> > is turned into
> >
> > fun3:
> >   adrp    x0, .LANCHOR0
> >   add     x0, x0, :lo12:.LANCHOR0
> >   sub     sp, sp, #16
> >   ldrh    w1, [x0, 16]
> >   ldrb    w0, [x0, 18]
> >   strh    w1, [sp, 8]
> >   strb    w0, [sp, 10]
> >   ldr     w0, [sp, 8]
> >   add     sp, sp, 16
> >   ret
> >
> > which avoids the bfi's for a simple 3 byte struct copy.
> >
> > Regression tested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu and
> no regressions.
> >
> > This patch is just splitting off from the previous combined patch with
> > AArch64 and adding a testcase.
> >
> > I assume Jeff's ACK from
> > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still valid as
> the code did not change.
> 
> Given your no_slow_unalign isn't mode specific can't you use the existing
> non_strict_align?

No, because non_strict_align checks whether the target supports unaligned
access at all.

no_slow_unalign corresponds instead to the target's slow_unaligned_access,
which checks whether the access you want to make costs more than an
aligned access. ARM, for instance, always returns 1 (the value of
STRICT_ALIGNMENT) for slow_unaligned_access, while for non_strict_align it
may return 0 or 1 depending on the options provided to the compiler.

The problem is that I have no way to test STRICT_ALIGNMENT or
slow_unaligned_access, so I had to hardcode some targets that I know it
works on.

Thanks,
Tamar
> 
> Otherwise the expr.c change looks ok.
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> >
> >
> > gcc/
> > 2017-11-14  Tamar Christina  
> >
> > * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
> > with fast unaligned access.
> > * doc/sourcebuild.texi (no_slow_unalign): New.
> >
> > gcc/testsuite/
> > 2017-11-14  Tamar Christina  
> >
> > * gcc.dg/struct-simple.c: New.
> > * lib/target-supports.exp
> > (check_effective_target_no_slow_unalign): New.
> >
> >
> 
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)
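
For context, the kind of source that leads to code like the fun3 listings quoted above is a small struct returned by value. The sketch below is an assumed reconstruction, not the actual gcc.dg/struct-simple.c testcase; the names three_bytes, global3 and fun3_like are illustrative only.

```cpp
#include <cassert>

// A 3-byte aggregate: smaller than a word and not a power-of-two size.
struct three_bytes {
  unsigned char a, b, c;
};
static_assert(sizeof(three_bytes) == 3, "expected a packed 3-byte layout");

// Hypothetical global that the function copies from.
static three_bytes global3 = {1, 2, 3};

// Returning the struct by value makes the compiler copy a block-mode
// object into the return register.  How wide each piece of that copy may
// be is what the patch changes for targets without slow unaligned access.
three_bytes fun3_like() { return global3; }
```

Compiling a function like this for AArch64 at -O2 before and after the patch is one way to compare against the two listings in the quoted message; the exact output will depend on the compiler revision.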


Re: [Patch, fortran] PR78990 [5/6/7 Regression] ICE when assigning polymorphic array function result

2017-11-15 Thread Dominique d'Humières
Hi Paul,

Your patch fixes the ICE and passes the tests. However I see

At line 22 of file pr78990.f90
Fortran runtime error: Attempting to allocate already allocated variable 'return_t1'

for the original tests (with mold or source). This runtime error depends on the options:

% gfc pr78990.f90
% a.out
At line 22 of file pr78990.f90
Fortran runtime error: Attempting to allocate already allocated variable 'return_t1'

Error termination. Backtrace:
…
% gfc pr78990.f90 -fno-backtrace
% a.out
   0   0   0
% gfc pr78990.f90 -m32
% a.out
   0   0   0
% gfc pr78990.f90 -O
% a.out
   0   0   0

The problem seems related to the line

  print*,v2%i

Cheers,

Dominique




[PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Tamar Christina
Hi All,

This patch updates the documentation for AArch64 and ARM, correcting the
architecture names by adding the -A suffix in appropriate places.

Build done on aarch64-none-elf and arm-none-eabi and no issues.

Ok for trunk?

Thanks,
Tamar

gcc/
2017-11-15  Tamar Christina  

* doc/extend.texi: Add -A suffix (ARMv8*-A, ARMv7-A).
* doc/invoke.texi: Add -A suffix (ARMv8*-A, ARMv7-A).
* doc/sourcebuild.texi: Add -A suffix (ARMv8*-A, ARMv7-A).

-- 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 63b58c0681e856da7ecc8c57c5d2f43613389a1d..a7a1ffcb852749b4e39facb434b2feda3534e77b 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1045,7 +1045,7 @@ expressions are automatically promoted to @code{float}.
 
 The ARM target provides hardware support for conversions between
 @code{__fp16} and @code{float} values
-as an extension to VFP and NEON (Advanced SIMD), and from ARMv8 provides
+as an extension to VFP and NEON (Advanced SIMD), and from ARMv8-A provides
 hardware support for conversions between @code{__fp16} and @code{double}
 values.  GCC generates code using these hardware instructions if you
 compile with options to select an FPU that provides them;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e897d93070ae320f741aeba4d2490f8366843935..b2f044cf5fb75c44a180b2231284882728248952 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15504,8 +15504,8 @@ entirely disabled by the @samp{+nofp} option that follows it.
 Most extension names are generically named, but have an effect that is
 dependent upon the architecture to which it is applied.  For example,
 the @samp{+simd} option can be applied to both @samp{armv7-a} and
-@samp{armv8-a} architectures, but will enable the original ARMv7
-Advanced SIMD (Neon) extensions for @samp{armv7-a} and the ARMv8-a
+@samp{armv8-a} architectures, but will enable the original ARMv7-A
+Advanced SIMD (Neon) extensions for @samp{armv7-a} and the ARMv8-A
 variant for @samp{armv8-a}.
 
 The table below lists the supported extensions for each architecture.
@@ -15646,7 +15646,7 @@ Disable the floating-point and Advanced SIMD instructions.
 @item +crc
 The Cyclic Redundancy Check (CRC) instructions.
 @item +simd
-The ARMv8 Advanced SIMD and floating-point instructions.
+The ARMv8-A Advanced SIMD and floating-point instructions.
 @item +crypto
 The cryptographic instructions.
 @item +nocrypto
@@ -15658,7 +15658,7 @@ Disable the floating-point, Advanced SIMD and cryptographic instructions.
 @item armv8.1-a
 @table @samp
 @item +simd
-The ARMv8.1 Advanced SIMD and floating-point instructions.
+The ARMv8.1-A Advanced SIMD and floating-point instructions.
 
 @item +crypto
 The cryptographic instructions.  This also enables the Advanced SIMD and
@@ -15678,7 +15678,7 @@ The half-precision floating-point data processing instructions.
 This also enables the Advanced SIMD and floating-point instructions.
 
 @item +simd
-The ARMv8.1 Advanced SIMD and floating-point instructions.
+The ARMv8.1-A Advanced SIMD and floating-point instructions.
 
 @item +crypto
 The cryptographic instructions.  This also enables the Advanced SIMD and
@@ -15754,7 +15754,7 @@ The Cyclic Redundancy Check (CRC) instructions.
 @item +fp.sp
 The single-precision FPv5 floating-point instructions.
 @item +simd
-The ARMv8 Advanced SIMD and floating-point instructions.
+The ARMv8-A Advanced SIMD and floating-point instructions.
 @item +crypto
 The cryptographic instructions.
 @item +nocrypto
@@ -16173,9 +16173,9 @@ Divided syntax should be considered deprecated.
 
 @item -mrestrict-it
 @opindex mrestrict-it
-Restricts generation of IT blocks to conform to the rules of ARMv8.
+Restricts generation of IT blocks to conform to the rules of ARMv8-A.
 IT blocks can only contain a single 16-bit instruction from a select
-set of instructions. This option is on by default for ARMv8 Thumb mode.
+set of instructions. This option is on by default for ARMv8-A Thumb mode.
 
 @item -mprint-tune-info
 @opindex mprint-tune-info
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index d5a90e518d67fb289c8caf2e8f2237970b6649ea..9bb14da1a6f6ec76de72a0927a17909c4d2f0ad5 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1714,11 +1714,11 @@ Some multilibs may be incompatible with these options.
 
 @item arm_v8_1a_neon_ok
 @anchor{arm_v8_1a_neon_ok}
-ARM target supports options to generate ARMv8.1 Adv.SIMD instructions.
+ARM target supports options to generate ARMv8.1-A Adv.SIMD instructions.
 Some multilibs may be incompatible with these options.
 
 @item arm_v8_1a_neon_hw
-ARM target supports executing ARMv8.1 Adv.SIMD instructions.  Some
+ARM target supports executing ARMv8.1-A Adv.SIMD instructions.  Some
 multilibs may be incompatible with the options needed.  Implies
 arm_v8_1a_neon_ok.
 
@@ -1727,34 +1727,34 @@ ARM target supports acquire-release instructions.
 
 @item arm_v8_2a_fp16_scalar_ok
 @anchor{arm_v8_2a_fp16_scalar

Re: [PATCH] [PR82155] Fix crash in dwarf2out_abstract_function

2017-11-15 Thread Pierre-Marie de Rodat

On 11/15/2017 12:16 PM, Richard Biener wrote:
>> Is it still okay to commit to gcc-7, now?
>
> Yes.

Done. Thank you!

--
Pierre-Marie de Rodat


Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-11-15 Thread Martin Liška
On 11/15/2017 11:04 AM, Jakub Jelinek wrote:
> On Wed, Nov 15, 2017 at 10:54:23AM +0100, Martin Liška wrote:
>> gcc/c/ChangeLog:
>>
>> 2017-11-15  Martin Liska  
>>
>>  * c-decl.c (grokdeclarator):
>>  Compare warn_return_type for greater than zero.
>>  (start_function): Likewise.
>>  (finish_function): Likewise.
>>  * c-typeck.c (c_finish_return): Likewise.
>>
>> gcc/cp/ChangeLog:
>>
>> 2017-11-15  Martin Liska  
>>
>>  * decl.c (finish_function):
>>  Compare warn_return_type for greater than zero.
>>  * semantics.c (finish_return_stmt): Likewise.
> 
> The c/cp changes aren't really needed, are they?  Because
> in that case you guarantee in the post options handling it is
> 0 or 1.

Yep, you're right!

> 
> The rest looks good (except for Ada that Eric doesn't want to change).
> 
>   Jakub
> 


Done that and I'm going to install the patch.

Martin
>From c0934d0be85d40762d4bafbf9991b167b711736e Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 15 Nov 2017 09:16:23 +0100
Subject: [PATCH] Disable -Wreturn-type by default in all languages other from
 C++.

gcc/ChangeLog:

2017-11-15  Martin Liska  

	* tree-cfg.c (pass_warn_function_return::execute):
	Compare warn_return_type for greater than zero.

gcc/fortran/ChangeLog:

2017-11-15  Martin Liska  

	* options.c (gfc_post_options):
	Do not set default value of warn_return_type.
	* trans-decl.c (gfc_trans_deferred_vars):
	Compare warn_return_type for greater than zero.
	(generate_local_decl): Likewise
	(gfc_generate_function_code): Likewise.
---
 gcc/fortran/options.c| 3 ---
 gcc/fortran/trans-decl.c | 8 
 gcc/tree-cfg.c   | 2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
index c584a19e559..0ee6b7808d9 100644
--- a/gcc/fortran/options.c
+++ b/gcc/fortran/options.c
@@ -435,9 +435,6 @@ gfc_post_options (const char **pfilename)
 gfc_fatal_error ("Maximum subrecord length cannot exceed %d",
 		 MAX_SUBRECORD_LENGTH);
 
-  if (warn_return_type == -1)
-warn_return_type = 0;
-
   gfc_cpp_post_options ();
 
   if (gfc_option.allow_std & GFC_STD_F2008)
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 8efaae79ebc..60e7d8f79ee 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -4198,7 +4198,7 @@ gfc_trans_deferred_vars (gfc_symbol * proc_sym, gfc_wrapped_block * block)
 		  break;
 	}
 	  /* TODO: move to the appropriate place in resolve.c.  */
-	  if (warn_return_type && el == NULL)
+	  if (warn_return_type > 0 && el == NULL)
 	gfc_warning (OPT_Wreturn_type,
 			 "Return value of function %qs at %L not set",
 			 proc_sym->name, &proc_sym->declared_at);
@@ -5619,7 +5619,7 @@ generate_local_decl (gfc_symbol * sym)
   else if (sym->attr.flavor == FL_PROCEDURE)
 {
   /* TODO: move to the appropriate place in resolve.c.  */
-  if (warn_return_type
+  if (warn_return_type > 0
 	  && sym->attr.function
 	  && sym->result
 	  && sym != sym->result
@@ -6494,11 +6494,11 @@ gfc_generate_function_code (gfc_namespace * ns)
   if (result == NULL_TREE || artificial_result_decl)
 	{
 	  /* TODO: move to the appropriate place in resolve.c.  */
-	  if (warn_return_type && sym == sym->result)
+	  if (warn_return_type > 0 && sym == sym->result)
 	gfc_warning (OPT_Wreturn_type,
 			 "Return value of function %qs at %L not set",
 			 sym->name, &sym->declared_at);
-	  if (warn_return_type)
+	  if (warn_return_type > 0)
 	TREE_NO_WARNING(sym->backend_decl) = 1;
 	}
   if (result != NULL_TREE)
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 9a2fa1d98ca..f08a0547f0f 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -9071,7 +9071,7 @@ pass_warn_function_return::execute (function *fun)
 
   /* If we see "return;" in some basic block, then we do reach the end
  without returning a value.  */
-  else if (warn_return_type
+  else if (warn_return_type > 0
 	   && !TREE_NO_WARNING (fun->decl)
 	   && EDGE_COUNT (EXIT_BLOCK_PTR_FOR_FN (fun)->preds) > 0
 	   && !VOID_TYPE_P (TREE_TYPE (TREE_TYPE (fun->decl
-- 
2.14.3
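
To illustrate why the middle-end check moves from a plain truth test to "> 0", here is a simplified model of the option state discussed in this thread. It assumes, as the thread describes, that warn_return_type starts at -1 and that only some front ends resolve it to 0 or 1. The struct and function names are hypothetical and are not GCC's.

```cpp
#include <cassert>

// Simplified model: -1 means "not decided yet", 0 is off, 1 is on.
struct options_model {
  int warn_return_type = -1;
};

// Hypothetical C-family post-options step: resolve -1 to a language
// default (on for C++, off for C).  An explicit user setting is kept.
inline void c_family_post_options(options_model &opts, bool is_cxx) {
  assert(opts.warn_return_type >= -1 && opts.warn_return_type <= 1);
  if (opts.warn_return_type == -1)
    opts.warn_return_type = is_cxx ? 1 : 0;
}

// Check in the style of the patched tree-cfg.c: an unresolved -1 is
// treated as "do not warn".
inline bool should_warn_missing_return(const options_model &opts) {
  return opts.warn_return_type > 0;
}

// The pre-patch style of check, shown for contrast: -1 counts as true.
inline bool should_warn_truthy(const options_model &opts) {
  return opts.warn_return_type != 0;
}
```

A front end that never touches the option leaves it at -1. With the "> 0" test it gets no warning, whereas the truth test would enable the warning by accident, which is the concern raised for front ends added in the future.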



[PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Martin Liška
On 11/06/2017 07:29 PM, Martin Sebor wrote:
> Sorry for being late with my comment.  I just spotted this minor
> formatting issue.  Even though GCC isn't (yet) consistent about
> it the keyword "constexpr" should be quoted in the error message
> below (and, eventually, in all diagnostic messages).  Since the
> patch has been committed by now this is just a reminder for us
> to try to keep this in mind in the future.

Hi.

I've prepared a patch for that. If desired, I can fix the test-suite in a follow-up.
Do we want to change it also for error messages like:
"call to non-constexpr function"
"constexpr call flows off the end of the function"

Thanks,
Martin
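
As background on the quoting being requested: GCC diagnostic format strings mark quoted text with the %< and %> directives, so a keyword is written as %<constexpr%> in the message. The toy function below only illustrates the effect of those two directives. It is not GCC's pretty-printer, which handles many more directives and may emit locale-specific quotes; expand_quotes is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy expansion of the %< and %> quoting directives: each one is replaced
// by a plain apostrophe.  Every other character is copied unchanged.
inline std::string expand_quotes(const std::string &fmt) {
  std::string out;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] == '%' && i + 1 < fmt.size()
        && (fmt[i + 1] == '<' || fmt[i + 1] == '>')) {
      out += '\'';
      ++i;  // skip the '<' or '>' that follows the '%'
    } else {
      out += fmt[i];
    }
  }
  return out;
}
```

With this model, a format such as "inherited constructor is not %<constexpr%>" renders with the keyword in quotes, which is the visible change the patch is after.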
>From eb554d8778be239a2edb06d21f98bda7e5153765 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 15 Nov 2017 08:41:12 +0100
Subject: [PATCH] Add quotes for constexpr keyword.

gcc/cp/ChangeLog:

2017-11-15  Martin Liska  

	* class.c (finalize_literal_type_property): Add quotes for
	constexpr keyword.
	(explain_non_literal_class): Likewise.
	* constexpr.c (ensure_literal_type_for_constexpr_object): Likewise.
	(is_valid_constexpr_fn): Likewise.
	(check_constexpr_ctor_body): Likewise.
	(register_constexpr_fundef): Likewise.
	(explain_invalid_constexpr_fn): Likewise.
	(cxx_eval_builtin_function_call): Likewise.
	(cxx_eval_call_expression): Likewise.
	(cxx_eval_loop_expr): Likewise.
	(potential_constant_expression_1): Likewise.
	* decl.c (check_previous_goto_1): Likewise.
	(check_goto): Likewise.
	(grokfndecl): Likewise.
	(grokdeclarator): Likewise.
	* error.c (maybe_print_constexpr_context): Likewise.
	* method.c (process_subob_fn): Likewise.
	(defaulted_late_check): Likewise.
	* parser.c (cp_parser_compound_statement): Likewise.
---
 gcc/cp/class.c |  4 ++--
 gcc/cp/constexpr.c | 35 ++-
 gcc/cp/decl.c  | 12 ++--
 gcc/cp/error.c |  4 ++--
 gcc/cp/method.c|  6 +++---
 gcc/cp/parser.c|  2 +-
 6 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 586a32c436f..529f37f24ee 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -5368,7 +5368,7 @@ finalize_literal_type_property (tree t)
 	  DECL_DECLARED_CONSTEXPR_P (fn) = false;
 	  if (!DECL_GENERATED_P (fn)
 	  && pedwarn (DECL_SOURCE_LOCATION (fn), OPT_Wpedantic,
-			  "enclosing class of constexpr non-static member "
+			  "enclosing class of %<constexpr%> non-static member "
 			  "function %q+#D is not a literal type", fn))
 	explain_non_literal_class (t);
 	}
@@ -5406,7 +5406,7 @@ explain_non_literal_class (tree t)
 {
   inform (UNKNOWN_LOCATION,
 	  "  %q+T is not an aggregate, does not have a trivial "
-	  "default constructor, and has no constexpr constructor that "
+	  "default constructor, and has no %<constexpr%> constructor that "
 	  "is not a copy or move constructor", t);
   if (type_has_non_user_provided_default_constructor (t))
 	/* Note that we can't simply call locate_ctor because when the
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index d6b6843e804..e0a4133d89b 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -94,8 +94,8 @@ ensure_literal_type_for_constexpr_object (tree decl)
 	{
 	  if (DECL_DECLARED_CONSTEXPR_P (decl))
 	{
-	  error ("the type %qT of constexpr variable %qD is not literal",
-		 type, decl);
+	  error ("the type %qT of %<constexpr%> variable %qD "
+		 "is not literal", type, decl);
 	  explain_non_literal_class (type);
 	}
 	  else
@@ -177,7 +177,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 {
   ret = false;
   if (complain)
-	error ("inherited constructor %qD is not constexpr",
+	error ("inherited constructor %qD is not %<constexpr%>",
 	   DECL_INHERITED_CTOR (fun));
 }
   else
@@ -189,7 +189,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 	ret = false;
 	if (complain)
 	  {
-		error ("invalid type for parameter %d of constexpr "
+		error ("invalid type for parameter %d of %<constexpr%> "
 		   "function %q+#D", DECL_PARM_INDEX (parm), fun);
 		explain_non_literal_class (TREE_TYPE (parm));
 	  }
@@ -201,7 +201,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
   ret = false;
   if (complain)
 	inform (DECL_SOURCE_LOCATION (fun),
-		"lambdas are implicitly constexpr only in C++17 and later");
+		"lambdas are implicitly %<constexpr%> only in C++17 and later");
 }
   else if (!DECL_CONSTRUCTOR_P (fun))
 {
@@ -211,7 +211,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 	  ret = false;
 	  if (complain)
 	{
-	  error ("invalid return type %qT of constexpr function %q+D",
+	  error ("invalid return type %qT of %<constexpr%> function %q+D",
 		 rettype, fun);
 	  explain_non_literal_class (rettype);
 	}
@@ -225,7 +225,7 @@ is_valid_constexpr_fn (tree fun, bool complain)
 	  ret = false;
 	  if (complain
 	  && pedwarn (DECL_SOURCE_LOCATION (fun), OPT_Wpedantic,
-			  "enclosing class of constexpr non-static member "
+			  "enclosing class of %<constexpr%> non-static member "
 			  "funct

[PATCH] Fix PR82985

2017-11-15 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to branch,
testcase also to trunk.

Richard.

2017-11-15  Richard Biener  

PR tree-optimization/82985
Backport from mainline
2017-08-15  Richard Biener  

PR tree-optimization/81790
* tree-ssa-sccvn.c (vn_lookup_simplify_result): Handle both
CONSTRUCTORs from simplifying and VN.

* gcc.dg/torture/pr81790.c: New testcase.
* g++.dg/torture/pr82985.C: Likewise.

Index: gcc/testsuite/gcc.dg/torture/pr81790.c
===
--- gcc/testsuite/gcc.dg/torture/pr81790.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr81790.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--param sccvn-max-scc-size=10" } */
+
+typedef int a __attribute__ ((__vector_size__ (16)));
+typedef struct
+{
+  a b;
+} c;
+
+int d, e;
+
+void foo (c *ptr);
+
+void bar ()
+{
+  double b = 1842.9028;
+  c g, h;
+  if (d)
+b = 77.7998;
+  for (; e;)
+{
+  g.b = g.b = g.b + g.b;
+  h.b = (a){b};
+  h.b = h.b + h.b;
+}
+  foo (&g);
+  foo (&h);
+}
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 254492)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -1643,13 +1643,25 @@ static vn_nary_op_t vn_nary_op_insert_st
 /* Hook for maybe_push_res_to_seq, lookup the expression in the VN tables.  */
 
 static tree
-vn_lookup_simplify_result (code_helper rcode, tree type, tree *ops)
+vn_lookup_simplify_result (code_helper rcode, tree type, tree *ops_)
 {
   if (!rcode.is_tree_code ())
 return NULL_TREE;
+  tree *ops = ops_;
+  unsigned int length = TREE_CODE_LENGTH ((tree_code) rcode);
+  if (rcode == CONSTRUCTOR
+  /* ???  We're arriving here with SCCVNs view, decomposed CONSTRUCTOR
+and GIMPLEs / match-and-simplifies, CONSTRUCTOR as GENERIC tree.  */
+  && TREE_CODE (ops_[0]) == CONSTRUCTOR)
+{
+  length = CONSTRUCTOR_NELTS (ops_[0]);
+  ops = XALLOCAVEC (tree, length);
+  for (unsigned i = 0; i < length; ++i)
+   ops[i] = CONSTRUCTOR_ELT (ops_[0], i)->value;
+}
   vn_nary_op_t vnresult = NULL;
-  return vn_nary_op_lookup_pieces (TREE_CODE_LENGTH ((tree_code) rcode),
-  (tree_code) rcode, type, ops, &vnresult);
+  return vn_nary_op_lookup_pieces (length, (tree_code) rcode,
+  type, ops, &vnresult);
 }
 
 /* Return a value-number for RCODE OPS... either by looking up an existing
Index: gcc/testsuite/g++.dg/torture/pr82985.C
===
--- gcc/testsuite/g++.dg/torture/pr82985.C  (nonexistent)
+++ gcc/testsuite/g++.dg/torture/pr82985.C  (working copy)
@@ -0,0 +1,458 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-w" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
+
+namespace std {
+template < typename _Default > struct __detector { using type = _Default; };
+template < typename _Default, template < typename > class >
+using __detected_or = __detector< _Default >;
+template < typename _Default, template < typename > class _Op >
+using __detected_or_t = typename __detected_or< _Default, _Op >::type;
+template < typename > struct iterator_traits;
+template < typename _Tp > struct iterator_traits< _Tp * > {
+  typedef _Tp reference;
+};
+} // std
+using std::iterator_traits;
+template < typename _Iterator, typename > struct __normal_iterator {
+  typename iterator_traits< _Iterator >::reference operator*();
+  void operator++();
+};
+template < typename _IteratorL, typename _IteratorR, typename _Container >
+int operator!=(__normal_iterator< _IteratorL, _Container >,
+   __normal_iterator< _IteratorR, _Container >);
+namespace std {
+template < typename _Tp > struct allocator { typedef _Tp value_type; };
+struct __allocator_traits_base {
+  template < typename _Tp > using __pointer = typename _Tp::pointer;
+};
+template < typename _Alloc > struct allocator_traits : __allocator_traits_base {
+  using pointer = __detected_or_t< typename _Alloc::value_type *, __pointer >;
+};
+} // std
+typedef double __m128d __attribute__((__vector_size__(16)));
+typedef double __m256d __attribute__((__vector_size__(32)));
+enum { InnerVectorizedTraversal, LinearVectorizedTraversal };
+enum { ReadOnlyAccessors };
+template < int, typename Then, typename > struct conditional {
+  typedef Then type;
+};
+template < typename Then, typename Else > struct conditional< 0, Then, Else > {
+  typedef Else type;
+};
+template < typename, typename > struct is_same {
+  enum { value };
+};
+template < typename T > struct is_same< T, T > {
+  enum { value = 1 };
+};
+template < typename > struct traits;
+struct accessors_level {
+  enum { has_direct_access, has_write_access, value };
+};
+template < typename > struct EigenBase;
+t

RE: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread Richard Biener
On Wed, 15 Nov 2017, Tamar Christina wrote:

> 
> 
> > -Original Message-
> > From: Richard Biener [mailto:rguent...@suse.de]
> > Sent: Wednesday, November 15, 2017 08:24
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> > i...@airs.com
> > Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target
> > supports unaligned access [Patch (1/2)]
> > 
> > On Tue, 14 Nov 2017, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This patch allows larger bitsizes to be used as copy size when the
> > > target does not have SLOW_UNALIGNED_ACCESS.
> > >
> > > fun3:
> > >   adrp    x2, .LANCHOR0
> > >   add     x2, x2, :lo12:.LANCHOR0
> > >   mov     x0, 0
> > >   sub     sp, sp, #16
> > >   ldrh    w1, [x2, 16]
> > >   ldrb    w2, [x2, 18]
> > >   add     sp, sp, 16
> > >   bfi     x0, x1, 0, 8
> > >   ubfx    x1, x1, 8, 8
> > >   bfi     x0, x1, 8, 8
> > >   bfi     x0, x2, 16, 8
> > >   ret
> > >
> > > is turned into
> > >
> > > fun3:
> > >   adrp    x0, .LANCHOR0
> > >   add     x0, x0, :lo12:.LANCHOR0
> > >   sub     sp, sp, #16
> > >   ldrh    w1, [x0, 16]
> > >   ldrb    w0, [x0, 18]
> > >   strh    w1, [sp, 8]
> > >   strb    w0, [sp, 10]
> > >   ldr     w0, [sp, 8]
> > >   add     sp, sp, 16
> > >   ret
> > >
> > > which avoids the bfi's for a simple 3 byte struct copy.
> > >
> > > Regression tested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu and
> > no regressions.
> > >
> > > This patch is just splitting off from the previous combined patch with
> > > AArch64 and adding a testcase.
> > >
> > > I assume Jeff's ACK from
> > > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still valid as
> > the code did not change.
> > 
> > Given your no_slow_unalign isn't mode specific can't you use the existing
> > non_strict_align?
> 
> No, because non_strict_align checks whether the target supports unaligned
> access at all.
> 
> no_slow_unalign corresponds instead to the target's slow_unaligned_access,
> which checks whether the access you want to make costs more than an
> aligned access. ARM, for instance, always returns 1 (the value of
> STRICT_ALIGNMENT) for slow_unaligned_access, while for non_strict_align it
> may return 0 or 1 depending on the options provided to the compiler.
> 
> The problem is that I have no way to test STRICT_ALIGNMENT or
> slow_unaligned_access, so I had to hardcode some targets that I know it
> works on.

I see.  But then the slow_unaligned_access implementation should somehow
use non_strict_align as its default, since SLOW_UNALIGNED_ACCESS defaults
to STRICT_ALIGNMENT.

Given that SLOW_UNALIGNED_ACCESS has different values for different modes,
it would also make sense to be more specific for the testcase in question,
e.g. word_mode_slow_unaligned_access, to say that this only applies to
word_mode?

Thanks,
Richard.

> Thanks,
> Tamar
> > 
> > Otherwise the expr.c change looks ok.
> > 
> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > >
> > >
> > > gcc/
> > > 2017-11-14  Tamar Christina  
> > >
> > >   * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
> > >   with fast unaligned access.
> > >   * doc/sourcebuild.texi (no_slow_unalign): New.
> > >
> > > gcc/testsuite/
> > > 2017-11-14  Tamar Christina  
> > >
> > >   * gcc.dg/struct-simple.c: New.
> > >   * lib/target-supports.exp
> > >   (check_effective_target_no_slow_unalign): New.
> > >
> > >
> > 
> > --
> > Richard Biener 
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


lambda-switch regression

2017-11-15 Thread Nathan Sidwell
g++.dg/lambda/lambda-switch.C has recently regressed.  It appears the
location of a warning message has moved.


  l = []()  // { dg-warning "statement will never be executed" }
{
case 3: // { dg-error "case" }
  break;// { dg-error "break" }
};  <--- warning now here

We seem to be diagnosing the last line of the statement, not the first.
That does not seem useful.


I've not investigated which patch may have caused this, on the off chance
that someone already knows.


nathan
--
Nathan Sidwell


Re: [patch] backwards threader cleanups

2017-11-15 Thread Pedro Alves
On 11/15/2017 07:34 AM, Aldy Hernandez wrote:
> 
> 
> On 11/14/2017 02:38 PM, David Malcolm wrote:
>> On Tue, 2017-11-14 at 14:08 -0500, Aldy Hernandez wrote:
> 
>>https://gcc.gnu.org/codingconventions.html#Class_Form
>> says that:
>>
>> "When defining a class, first [...]
>> declare all public member functions,
>> [...]
>> then declare all non-public member functions, and
>> then declare all non-public member variables."
> 
> Wow, I did not expect that order.  Fixed.

...

>> (Is this a self-assign from this->speed_p? should the "speed_p" param
>> be renamed, e.g. to "speed_p_")
> 
> Yes.  Fixed.

The convention also says:

"When structs and/or classes have member functions, prefer to name
data members with a leading m_".

So in this case, the preference would be to rename this->speed_p to
m_speed_p instead.

Thanks,
Pedro Alves
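
A small sketch of the point about member naming, under the assumption that the code in question assigned a constructor parameter named like the member. The class names are hypothetical and only the speed_p / m_speed_p names come from the thread. Both classes follow the declaration order quoted from the coding conventions: public member functions first, non-public data last.

```cpp
#include <cassert>

// Parameter and member share a name, so the unqualified assignment in the
// constructor body assigns the parameter to itself and the member keeps
// its default value.
class threader_shadowed {
public:
  explicit threader_shadowed(bool speed_p) { speed_p = speed_p; }
  bool speed() const { return this->speed_p; }

private:
  bool speed_p = false;
};

// With the m_ prefix the two names cannot be confused.
class threader_prefixed {
public:
  explicit threader_prefixed(bool speed_p) : m_speed_p(speed_p) {}
  bool speed() const { return m_speed_p; }

private:
  bool m_speed_p;
};
```

Constructing threader_shadowed with true still reports false from speed(), while threader_prefixed reports the value it was given.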



Re: [PATCH] Fix test-suite fallout of default -Wreturn-type.

2017-11-15 Thread Jonathan Wakely

On 06/11/17 15:12 +0100, Martin Liška wrote:
> On 11/06/2017 02:58 PM, Paolo Carlini wrote:
>> Hi,
>>
>> On 06/11/2017 14:37, Martin Liška wrote:
>>> Thank you for the patch.
>>> I'm going to install the remaining part that will fix x86_64 fallout. All changes are
>>> quite obvious, so hope it's fine to install it.
>> I think so. Thanks.
>>
>> Note that the 3 additional libstdc++-v3 changes aren't really necessary, but
>> those testcases are failing, seg faulting, at run time for unrelated reasons. I
>> don't know if Jonathan is already on that...
>>
>> Paolo.

Right, adding "return 0;" to main() is just noise, it does nothing.

> You're right, it started right when it was introduced in r254008.
>
> I see:
>
> g++ libstdc++-v3/testsuite/27_io/basic_ifstream/cons/char/path.cc -std=gnu++17 -I. -lstdc++fs && ./a.out
> libstdc++-v3/testsuite/27_io/basic_ifstream/cons/char/path.cc:33: void test01(): Assertion 'f.is_open()' failed.
> Aborted (core dumped)

I think that was PR libstdc++/82917 so should be fixed.



[PR c++/81574] lambda capture of function reference

2017-11-15 Thread Nathan Sidwell
This patch fixes 81574.  Even when the capture default is '=', a
reference to a function is captured by reference.  The init-capture case
captures a pointer, via the auto deduction machinery.  AFAICT that's the
correct behaviour.


applying to trunk.

nathan
--
Nathan Sidwell
2017-11-15  Nathan Sidwell  

	PR c++/81574
	* lambda.c (lambda_capture_field_type): Function references are
	always captured by reference.

	PR c++/81574
	* g++.dg/cpp1y/pr81574.C: New.

Index: cp/lambda.c
===
--- cp/lambda.c	(revision 254740)
+++ cp/lambda.c	(working copy)
@@ -245,7 +245,8 @@ lambda_capture_field_type (tree expr, bo
 {
   type = non_reference (unlowered_expr_type (expr));
 
-  if (!is_this && by_reference_p)
+  if (!is_this
+	  && (by_reference_p || TREE_CODE (type) == FUNCTION_TYPE))
 	type = build_reference_type (type);
 }
 
Index: testsuite/g++.dg/cpp1y/pr81574.C
===
--- testsuite/g++.dg/cpp1y/pr81574.C	(revision 0)
+++ testsuite/g++.dg/cpp1y/pr81574.C	(working copy)
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++14 } }
+// PR c++/81574 references to functions are captured by reference.
+
+// 8.1.5.2/10
+// For each entity captured by copy, ... an lvalue reference to the
+// referenced function type if the entity is a reference to a function
+
+void f (void (&b)())
+{
+  [=] {  b; } ();
+  [=, b(f)] { b; } ();
+  [=, b(b)] { b; } ();
+}


Re: [PATCH, rs6000] (v2) GIMPLE folding for vector compares

2017-11-15 Thread Segher Boessenkool
Hi Will,

On Tue, Nov 14, 2017 at 04:11:34PM -0600, Will Schmidt wrote:
>   Add support for gimple folding of vec_cmp_{eq,ge,gt,le,ne}
> for the integer data types.

The code looks fine, just some typographical stuff:

>   * config/rs6000/vsx.md (vcmpneb, vcmpneh, vcmpnew): Update to specify 
>   the not+eq combination.

Trailing space.

> +/*  Helper function to handle the gimple folding of a vector compare
> +operation.  This sets up true/false vectors, and uses the
> +VEC_COND_EXPR operation.
> +'code' indicates which comparison is to be made. (EQ, GT, ...).
> +'type' indicates the type of the result.  */

One space less in the comment indent.  Names of parameters are written in
CAPS, no quotes.

> +static void
> +fold_compare_helper (gimple_stmt_iterator *gsi, tree_code code, gimple *stmt)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = gimple_call_arg (stmt, 1);
> +  tree lhs = gimple_call_lhs (stmt);
> +  gimple *g = gimple_build_assign (lhs,
> + fold_build_vec_cmp (code, TREE_TYPE (lhs), arg0, arg1));

That's not the standard indenting.  Maybe break the statement to make
it easier?  I.e.

  tree cmp = fold_build_vec_cmp (code, TREE_TYPE (lhs), arg0, arg1);
  gimple *g = gimple_build_assign (lhs, cmp);

> @@ -16701,10 +16731,67 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator 
> *gsi)
> gimple_set_location (g, gimple_location (stmt));
> gsi_replace (gsi, g, true);
> return true;
>}
>  
> +/* Vector compares; EQ, NE, GE, GT, LE.  */
> +case ALTIVEC_BUILTIN_VCMPEQUB:
> +case ALTIVEC_BUILTIN_VCMPEQUH:
> +case ALTIVEC_BUILTIN_VCMPEQUW:
> +case P8V_BUILTIN_VCMPEQUD:
> +  {
> + fold_compare_helper (gsi, EQ_EXPR, stmt);
> + return true;
> +  }

There's no need to make a block here (a bunch more of this later).

> @@ -18260,10 +18347,27 @@ builtin_function_type (machine_mode mode_ret, 
> machine_mode mode_arg0,
>  case MISC_BUILTIN_UNPACK_TD:
>  case MISC_BUILTIN_UNPACK_V1TI:
>h.uns_p[0] = 1;
>break;
>  
> +  /* unsigned arguments, bool return (compares).  */
> +case ALTIVEC_BUILTIN_VCMPEQUB:

The comment indent is wrong.

>/* unsigned arguments for 128-bit pack instructions.  */
>  case MISC_BUILTIN_PACK_TD:

Here too, but that is existing code :-)

Okay for trunk with those trivialities cleaned up.  Thanks!


Segher


[PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread H.J. Lu
-mvzeroupper tells GCC to generate the vzeroupper instruction.  If neither
it nor -mno-vzeroupper is given, the default should depend on
!TARGET_AVX512ER.  Users can always use -mvzeroupper or -mno-vzeroupper to
override it.
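
The intended rule can be modelled roughly as below. This is only a sketch: `VzeroupperOptions` and `vzeroupper_enabled` are hypothetical names, not GCC data structures, and the other conditions in the pass gate (TARGET_AVX, flag_expensive_optimizations, !optimize_size) are left out.

```cpp
// Hypothetical model of the option state: only the three inputs that
// matter for the default are represented.
struct VzeroupperOptions {
    bool set_explicitly;  // user passed -mvzeroupper or -mno-vzeroupper
    bool explicit_value;  // which of the two was passed
    bool avx512er;        // AVX512ER is enabled (e.g. via -march=knl)
};

// An explicit option always wins; otherwise the default is "on" unless
// AVX512ER is enabled.
bool vzeroupper_enabled(const VzeroupperOptions &o) {
    if (o.set_explicitly)
        return o.explicit_value;
    return !o.avx512er;
}
```

The four new tests correspond to the explicit-option rows of this table, with and without AVX512ER.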

Sebastian, can you run the full test with it?

OK for trunk if there is no regression?

Thanks.

H.J.
---
gcc/

PR target/82990
* config/i386/i386.c (pass_insert_vzeroupper::gate): Remove
TARGET_AVX512ER check.
(ix86_option_override_internal): Set MASK_VZEROUPPER if
neither -mvzeroupper nor -mno-vzeroupper is used and AVX512ER is
disabled.

gcc/testsuite/

PR target/82990
* gcc.target/i386/pr82990-1.c: New test.
* gcc.target/i386/pr82990-2.c: Likewise.
* gcc.target/i386/pr82990-3.c: Likewise.
* gcc.target/i386/pr82990-4.c: Likewise.
---
 gcc/config/i386/i386.c|  5 +++--
 gcc/testsuite/gcc.target/i386/pr82990-1.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr82990-2.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-3.c |  6 ++
 gcc/testsuite/gcc.target/i386/pr82990-4.c |  6 ++
 5 files changed, 35 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-4.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c5e84a09954..2c729236a29 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2497,7 +2497,7 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-  return TARGET_AVX && !TARGET_AVX512ER
+  return TARGET_AVX
 && TARGET_VZEROUPPER && flag_expensive_optimizations
 && !optimize_size;
 }
@@ -4666,7 +4666,8 @@ ix86_option_override_internal (bool main_args_p,
   if (TARGET_SEH && TARGET_CALL_MS2SYSV_XLOGUES)
 sorry ("-mcall-ms2sysv-xlogues isn%'t currently supported with SEH");
 
-  if (!(opts_set->x_target_flags & MASK_VZEROUPPER))
+  if (!(opts_set->x_target_flags & MASK_VZEROUPPER)
+  && !TARGET_AVX512ER_P (opts->x_ix86_isa_flags))
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-1.c 
b/gcc/testsuite/gcc.target/i386/pr82990-1.c
new file mode 100644
index 000..ff1d6d40eb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=knl -mvzeroupper" } */
+
+#include <immintrin.h>
+
+extern __m512d y, z;
+
+void
+pr82941 ()
+{
+  z = y;
+}
+
+/* { dg-final { scan-assembler-times "vzeroupper" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-2.c 
b/gcc/testsuite/gcc.target/i386/pr82990-2.c
new file mode 100644
index 000..0d3cb2333dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-2.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=skylake-avx512 -mno-vzeroupper" } */
+
+#include "pr82941-1.c"
+
+/* { dg-final { scan-assembler-not "vzeroupper" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-3.c 
b/gcc/testsuite/gcc.target/i386/pr82990-3.c
new file mode 100644
index 000..201fa98d8d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-3.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -mavx512er -mvzeroupper -O2" } */
+
+#include "pr82941-1.c"
+
+/* { dg-final { scan-assembler-times "vzeroupper" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr82990-4.c 
b/gcc/testsuite/gcc.target/i386/pr82990-4.c
new file mode 100644
index 000..09f161c7291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr82990-4.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512f -mno-avx512er -mno-vzeroupper -O2" } */
+
+#include "pr82941-1.c"
+
+/* { dg-final { scan-assembler-not "vzeroupper" } } */
-- 
2.13.6



Add libgomp.oacc-c-c++-common/f-asyncwait-{1,2,3}.c

2017-11-15 Thread Tom de Vries

Hi,

I noticed that there is only one asyncwait testcase for C on trunk.

I've rewritten asyncwait-{1,2,3}.f90 into C (and changed the float math 
into int math to keep things as simple as possible).
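
For reference, the integer expressions used by the kernels in f-asyncwait-1.c can be checked on the host without any OpenACC directives. A small sketch (the struct and function names are made up for this note):

```cpp
// Host-only restatement of the integer expressions from the
// f-asyncwait-1.c kernels; no offloading is involved here.
struct Expected {
    int b, c, d, e;
};

Expected expected_for(int a) {
    Expected r;
    r.b = (a * a * a) / a;        // a squared
    r.c = (a * 4) / a;            // always 4
    r.d = ((a * a + a) / a) - a;  // always 1
    r.e = a + r.b + r.c + r.d;
    return r;
}
```

With a = 3 this gives b = 9, c = 4, d = 1, and with a = 2 it gives b = 4, c = 4, d = 1, e = 11, which are the values the testcase compares against.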


Tested on top of trunk for host.

Tested on top of trunk, gcc-7-branch, openacc-gcc-7-branch, 
gomp-4-branch for nvptx.


On trunk for nvptx, I'm seeing execution failures at -O2. I've verified 
that I see the same failures with all the async and wait clauses 
removed. Also, it's not the only failure at -O2 for trunk, so that's 
probably some orthogonal issue.


Committed as obvious.

Thanks,
- Tom
Add libgomp.oacc-c-c++-common/f-asyncwait-{1,2,3}.c

2017-11-15  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: New test, copied
	from asyncwait-1.f90.  Rewrite into C.  Rewrite from float to int.
	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-2.c: New test, copied
	from asyncwait-2.f90.  Rewrite into C.  Rewrite from float to int.
	* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-3.c: New test, copied
	from asyncwait-3.f90.  Rewrite into C.  Rewrite from float to int.

---
 .../libgomp.oacc-c-c++-common/f-asyncwait-1.c  | 297 +
 .../libgomp.oacc-c-c++-common/f-asyncwait-2.c  |  61 +
 .../libgomp.oacc-c-c++-common/f-asyncwait-3.c  |  63 +
 3 files changed, 421 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c
new file mode 100644
index 000..cf85170
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c
@@ -0,0 +1,297 @@
+/* { dg-do run } */
+
+/* Based on asyncwait-1.f90.  */
+
+#include <stdlib.h>
+
+#define N 64
+
+int
+main (void)
+{
+  int *a, *b, *c, *d, *e;
+
+  a = (int*)malloc (N * sizeof (*a));
+  b = (int*)malloc (N * sizeof (*b));
+  c = (int*)malloc (N * sizeof (*c));
+  d = (int*)malloc (N * sizeof (*d));
+  e = (int*)malloc (N * sizeof (*e));
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 3;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+
+#pragma acc parallel async
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  b[i] = a[i];
+
+#pragma acc wait
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 3)
+	abort ();
+  if (b[i] != 3)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 2;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  b[i] = a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 2) abort ();
+  if (b[i] != 2) abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 3;
+  b[i] = 0;
+  c[i] = 0;
+  d[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N]) copy (c[0:N]) copy (d[0:N])
+  {
+
+#pragma acc parallel async (1)
+for (int i = 0; i < N; ++i)
+  b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (1)
+for (int i = 0; i < N; ++i)
+  c[i] = (a[i] * 4) / a[i];
+
+
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 3)
+	abort ();
+  if (b[i] != 9)
+	abort ();
+  if (c[i] != 4)
+	abort ();
+  if (d[i] != 1)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 2;
+  b[i] = 0;
+  c[i] = 0;
+  d[i] = 0;
+  e[i] = 0;
+}
+
+#pragma acc data copy (a[0:N], b[0:N], c[0:N], d[0:N], e[0:N])
+  {
+
+#pragma acc parallel async (1)
+for (int i = 0; i < N; ++i)
+  b[i] = (a[i] * a[i] * a[i]) / a[i];
+
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  c[i] = (a[i] * 4) / a[i];
+
+#pragma acc parallel async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
+
+
+#pragma acc parallel wait (1) async (1)
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  e[i] = a[i] + b[i] + c[i] + d[i];
+
+#pragma acc wait (1)
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 2)
+	abort ();
+  if (b[i] != 4)
+	abort ();
+  if (c[i] != 4)
+	abort ();
+  if (d[i] != 1)
+	abort ();
+  if (e[i] != 11)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 3;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+
+#pragma acc kernels async
+#pragma acc loop
+for (int i = 0; i < N; ++i)
+  b[i] = a[i];
+
+#pragma acc wait
+  }
+
+  for (int i = 0; i < N; ++i)
+{
+  if (a[i] != 3)
+	abort ();
+  if (b[i] != 3)
+	abort ();
+}
+
+  for (int i = 0; i < N; ++i)
+{
+  a[i] = 2;
+  b[i] = 0;
+}
+
+#pragma acc data copy (a[0:N]) copy (b[0:N])
+  {
+#pragma acc kernels async (1)
+#pragma acc loop
+for (in

Re: Add __builtin_tgmath for better tgmath.h implementation (bug 81156)

2017-11-15 Thread Joseph Myers
On Wed, 15 Nov 2017, Richard Biener wrote:

> Thanks - I suppose we can't avoid the repeated expansion by sth like
> 
> #define exp(Val) ({ __typeof__ Val tem = Val; __TGMATH_UNARY_REAL_IMAG
> (tem, exp, cexp); })

Well, that still expands its argument twice.  You'd need to use 
__auto_type to avoid the double expansion.  And then you'd still have 
extremely complicated expansions (that are correspondingly unfriendly if a 
user makes a mistake with a call, e.g. an argument of unsupported type), 
and complications around getting the right semantics when decimal floating 
point is involved.  And use of ({ }) doesn't work in sizeof outside 
functions.  And that wouldn't help with cases such as 
__STDC_TGMATH_OPERATOR_EVALUATION__, whereas it would actually be easy to 
add __builtin_tgmath_operator that's handled the same as __builtin_tgmath 
but ends up calling a function based on evaluation formats and producing 
an EXCESS_PRECISION_EXPR.
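
To make the expansion cost concrete, here is a small sketch with a hypothetical macro `TWICE` that, like the macros under discussion, mentions its argument twice. It is not the tgmath.h implementation; it only counts how many copies of the innermost operand end up in the preprocessed text when calls are nested.

```cpp
#include <cstddef>

// Hypothetical macro that mentions its argument twice in its expansion.
#define TWICE(x) ((x) + (x))

// Two-level stringification so the argument is macro-expanded first.
#define STR_(x) #x
#define STR(x) STR_(x)

// Counts how often the character c occurs in s.
std::size_t count_char(const char *s, char c) {
    std::size_t n = 0;
    for (; *s; ++s)
        if (*s == c)
            ++n;
    return n;
}

// Copies of the innermost operand at nesting depths 1, 2 and 3.
std::size_t copies_depth1() { return count_char(STR(TWICE(1)), '1'); }
std::size_t copies_depth2() { return count_char(STR(TWICE(TWICE(1))), '1'); }
std::size_t copies_depth3() {
    return count_char(STR(TWICE(TWICE(TWICE(1)))), '1');
}
```

Each extra level of nesting doubles the number of copies, which is the growth a single built-in call avoids.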

(Clang overloadable functions in C don't avoid the multiple expansion 
either, or at least Clang's tgmath.h doesn't.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix pr81706 tests on darwin

2017-11-15 Thread Dominique d'Humières
Committed as revision r254770.

Thanks for the review.

Dominique

> On 13 Nov 2017, at 18:26, Mike Stump wrote:
> 
> On Nov 12, 2017, at 6:05 AM, Dominique d'Humières  wrote:
>> 
>> The following patch fixes pr81706 tests on darwin
>> 
>> --- ../_clean/gcc/testsuite/gcc.target/i386/pr81706.c2017-10-26 
>> 07:16:18.0 +0200
>> +++ gcc/testsuite/gcc.target/i386/pr81706.c  2017-11-11 16:02:36.0 
>> +0100
>> @@ -1,8 +1,8 @@
>> /* PR libstdc++/81706 */
>> /* { dg-do compile } */
>> /* { dg-options "-O3 -mavx2 -mno-avx512f" } */
>> -/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } } */
>> -/* { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } } */
>> +/* { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_cos" } } */
>> +/* { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_sin" } } */
>> 
>> #ifdef __cplusplus
>> extern "C" {
>> --- ../_clean/gcc/testsuite/g++.dg/ext/pr81706.C 2017-10-26 
>> 07:16:21.0 +0200
>> +++ gcc/testsuite/g++.dg/ext/pr81706.C   2017-11-09 21:41:36.0 
>> +0100
>> @@ -1,8 +1,8 @@
>> // PR libstdc++/81706
>> // { dg-do compile { target i?86-*-* x86_64-*-* } }
>> // { dg-options "-O3 -mavx2 -mno-avx512f" }
>> -// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_cos" } }
>> -// { dg-final { scan-assembler "call\[^\n\r]_ZGVdN4v_sin" } }
>> +// { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_cos" } }
>> +// { dg-final { scan-assembler "call\[^\n\r]__?ZGVdN4v_sin" } }
>> 
>> #ifdef __cplusplus
>> extern "C" {
>> 
>> Is it OK?
> 
> Ok.



[PATCH] Improve -Wmaybe-uninitialized documentation

2017-11-15 Thread Jonathan Wakely

The docs for -Wmaybe-uninitialized have some issues:

- That first sentence is looong.
- Apparently some C++ programmers think "automatic variable" means one
 declared with C++11 `auto`, rather than simply a local variable.
- The sentence about only warning when optimizing is stuck in between
 two chunks talking about longjmp, which could be inferred to mean
 only the setjmp/longjmp part of the warning depends on optimization.

This attempts to make it easier to parse and understand.
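
For readers unsure what "automatic variable" means here, a minimal sketch (the function name `classify` is made up, and the code is modelled on the kind of switch the manual uses): `x` is an ordinary local, and the `default` case is what removes the path on which it would be read without a value.

```cpp
// `x` is an automatic variable in the C sense: a plain local, no `auto`.
int classify(int y) {
    int x;
    switch (y) {
    case 1: x = 1; break;
    case 2: x = 4; break;
    case 3: x = 5; break;
    default:
        // Without this early return there would be a path on which x is
        // read without ever having been assigned.
        return -1;
    }
    return x;
}
```

Dropping the `default` case is the situation the warning is about: GCC cannot know whether `y` is always 1, 2 or 3.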

OK for trunk?

commit a923e297acfd7c0ca3d3820463450f38230ab4ea
Author: Jonathan Wakely 
Date:   Wed Nov 15 14:25:09 2017 +

Improve -Wmaybe-uninitialized documentation

* doc/invoke.texi (-Wmaybe-uninitialized): Rephrase more accurately.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 44273284483..fac4122fe3e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -4970,14 +4970,17 @@ void store (int *i)
 @item -Wmaybe-uninitialized
 @opindex Wmaybe-uninitialized
 @opindex Wno-maybe-uninitialized
-For an automatic variable, if there exists a path from the function
-entry to a use of the variable that is initialized, but there exist
-some other paths for which the variable is not initialized, the compiler
-emits a warning if it cannot prove the uninitialized paths are not
-executed at run time. These warnings are made optional because GCC is
-not smart enough to see all the reasons why the code might be correct
-in spite of appearing to have an error.  Here is one example of how
-this can happen:
+Warn if there exists a path from entry to a function to a use of an automatic
+(i.e.@ local) variable, for which the variable is not initialized, and the
+compiler cannot prove that the uninitialized path will not be executed at run
+time.
+
+These warnings are only possible in optimizing compilation, because otherwise
+GCC does not keep track of the state of variables.
+
+These warnings are optional because GCC is not smart enough to see all the
+reasons why the code might be correct in spite of appearing to have an error.
+Here is one example of how this can happen:
 
 @smallexample
 @group
@@ -5003,19 +5006,15 @@ warning, you need to provide a default case with 
assert(0) or
 similar code.
 
 @cindex @code{longjmp} warnings
-This option also warns when a non-volatile automatic variable might be
-changed by a call to @code{longjmp}.  These warnings as well are possible
-only in optimizing compilation.
-
-The compiler sees only the calls to @code{setjmp}.  It cannot know
-where @code{longjmp} will be called; in fact, a signal handler could
-call it at any point in the code.  As a result, you may get a warning
-even when there is in fact no problem because @code{longjmp} cannot
-in fact be called at the place that would cause a problem.
+This option also warns when a non-volatile automatic variable might be changed
+by a call to @code{longjmp}.  The compiler sees only the calls to
+@code{setjmp}.  It cannot know where @code{longjmp} will be called; in fact, a
+signal handler could call it at any point in the code.  As a result, you may
+get a warning even when there is in fact no problem because @code{longjmp}
+cannot in fact be called at the place that would cause a problem.
 
 Some spurious warnings can be avoided if you declare all the functions
-you use that never return as @code{noreturn}.  @xref{Function
-Attributes}.
+you use that never return as @code{noreturn}.  @xref{Function Attributes}.
 
 This warning is enabled by @option{-Wall} or @option{-Wextra}.
 


Re: [PATCH] Canonicalize constant multiplies in division

2017-11-15 Thread Wilco Dijkstra
Richard Biener wrote:
> On Tue, Oct 17, 2017 at 6:32 PM, Wilco Dijkstra  
> wrote:

>>  (if (flag_reciprocal_math)
>> - /* Convert (A/B)/C to A/(B*C)  */
>> + /* Convert (A/B)/C to A/(B*C). */
>>   (simplify
>>    (rdiv (rdiv:s @0 @1) @2)
>> -   (rdiv @0 (mult @1 @2)))
>> +  (rdiv @0 (mult @1 @2)))
>> +
>> + /* Canonicalize x / (C1 * y) to (x * C2) / y.  */
>> + (if (optimize)
>
> why if (optimize) here?  The pattern you removed has no
> such check.  As discussed this may undo CSE of C1 * y
> so please check for a single-use on the mult with :s

I think that came from an earlier version of this patch. I've removed it
and added a single use check.

>> +  (simplify
>> +   (rdiv @0 (mult @1 REAL_CST@2))
>> +   (if (!real_zerop (@1))
>
> why this check?  The pattern below didn't have it.

Presumably to avoid the change when dividing by zero. I've removed it, here is
the updated version. This passes bootstrap and regress:


ChangeLog
2017-11-15  Wilco Dijkstra
Jackson Woodruff  

gcc/
PR tree-optimization/71026
* match.pd: Canonicalize constant multiplies in division.

gcc/testsuite/
PR tree-optimization/71026
* gcc.dg/cse_recip.c: New test.
--

diff --git a/gcc/match.pd b/gcc/match.pd
index 
b5042b783c0830a2da08c44bed39842a17911844..ea7d90ed977cfff991d74bee54e91ecb209b6030
 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -344,10 +344,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (negate @0)))
 
 (if (flag_reciprocal_math)
- /* Convert (A/B)/C to A/(B*C)  */
+ /* Convert (A/B)/C to A/(B*C). */
  (simplify
   (rdiv (rdiv:s @0 @1) @2)
-   (rdiv @0 (mult @1 @2)))
+  (rdiv @0 (mult @1 @2)))
+
+ /* Canonicalize x / (C1 * y) to (x * C2) / y.  */
+ (simplify
+  (rdiv @0 (mult:s @1 REAL_CST@2))
+  (with
+   { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @2); }
+   (if (tem)
+(rdiv (mult @0 { tem; } ) @1))))
 
  /* Convert A/(B/C) to (A/B)*C  */
  (simplify
@@ -646,15 +654,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (tem)
  (rdiv { tem; } @1)
 
-/* Convert C1/(X*C2) into (C1/C2)/X  */
-(simplify
- (rdiv REAL_CST@0 (mult @1 REAL_CST@2))
-  (if (flag_reciprocal_math)
-   (with
-{ tree tem = const_binop (RDIV_EXPR, type, @0, @2); }
-(if (tem)
- (rdiv { tem; } @1)))))
-
 /* Simplify ~X & X as zero.  */
 (simplify
  (bit_and:c (convert? @0) (convert? (bit_not @0)))
diff --git a/gcc/testsuite/gcc.dg/cse_recip.c b/gcc/testsuite/gcc.dg/cse_recip.c
new file mode 100644
index 
..88cba9930c0eb1fdee22a797eff110cd9a14fcda
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cse_recip.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-optimized-raw" } */
+
+void
+cse_recip (float x, float y, float *a)
+{
+  a[0] = y / (5 * x);
+  a[1] = y / (3 * x);
+  a[2] = y / x;
+}
+
+/* { dg-final { scan-tree-dump-times "rdiv_expr" 1 "optimized" } } */
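
The effect of the canonicalization on the testcase can be sketched by hand as follows. This is an illustration only, with made-up function names; GCC performs the rewrite on its internal representation, and the results are not guaranteed to be bit-identical, which is why the pattern sits under flag_reciprocal_math.

```cpp
#include <cmath>

// The three divisions from the testcase, as written.
void recip_original(float x, float y, float *a) {
    a[0] = y / (5 * x);
    a[1] = y / (3 * x);
    a[2] = y / x;
}

// Hand-applied canonical form: y / (C1 * x) becomes (y * C2) / x with
// C2 = 1 / C1, after which all three results can share one division by x.
void recip_canonical(float x, float y, float *a) {
    float inv_x = 1.0f / x;           // the one remaining division
    a[0] = (y * (1.0f / 5)) * inv_x;  // C2 = 1/5
    a[1] = (y * (1.0f / 3)) * inv_x;  // C2 = 1/3
    a[2] = y * inv_x;
}

// Relative comparison, since the two forms may differ in the last bits.
bool nearly_equal(float p, float q) {
    return std::fabs(p - q) <= 1e-5f * std::fabs(q);
}
```

Calling both functions with the same inputs and comparing element by element with `nearly_equal` shows that the rewritten form agrees with the original up to rounding.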





[PATCH 1/3][middle-end]PR78809 (Inline strcmp with small constant strings)

2017-11-15 Thread Qing Zhao
Hi,

This is the first of three patches for PR78809:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
inline strcmp with small constant strings

The design doc is at:
https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html

this patch is for the first part of change:

A. for strncmp (s1, s2, n)
 if one of "s1" or "s2" is a constant string, "n" is a constant, and
larger than the length of the constant string:
 change strncmp (s1, s2, n) to strcmp (s1, s2);
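
The reasoning is that when n exceeds the length of the constant string, the comparison stops at the constant's terminating NUL before the limit n is reached, so the limit has no effect. A small sketch of that condition (the helper names are made up and this is not the code added to gimple-fold.c):

```cpp
#include <cstddef>
#include <cstring>

// Reduces a comparison result to -1, 0 or 1.
int sign(int v) { return (v > 0) - (v < 0); }

// The condition from part A: n is larger than the constant's length.
bool can_use_strcmp(const char *constant, std::size_t n) {
    return n > std::strlen(constant);
}

// True when strncmp with limit n and strcmp agree in sign.
bool same_result(const char *s, const char *constant, std::size_t n) {
    return sign(std::strncmp(s, constant, n)) ==
           sign(std::strcmp(s, constant));
}
```

With the constant "fis" and n = 4 the two calls agree for any input; with n = 3 the condition fails, and an input such as "fish" shows why (strncmp reports equality, strcmp does not).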

adding test case strcmpopt_1.c into gcc.dg

Bootstrapped and tested on both x86 and aarch64; no regressions.

Okay for commit?

thanks.

Qing

==

gcc/ChangeLog

2017-11-15  Qing Zhao  

   * gimple-fold.c (gimple_fold_builtin_string_compare): Add handling
   of replacing call to strncmp with corresponding call to strcmp when
   meeting conditions.

gcc/testsuite/ChangeLog

2017-11-15  Qing Zhao  

   PR middle-end/78809
   * gcc.dg/strcmpopt_1.c: New test.

---
 gcc/gimple-fold.c  | 15 +++
 gcc/testsuite/gcc.dg/strcmpopt_1.c | 28 
 2 files changed, 43 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_1.c

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index adb6f3b..1ed6383 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -2258,6 +2258,21 @@ gimple_fold_builtin_string_compare (gimple_stmt_iterator 
*gsi)
   return true;
 }
 
+  /* If length is larger than the length of one constant string, 
+ replace strncmp with corresponding strcmp */ 
+  if (fcode == BUILT_IN_STRNCMP 
+  && length > 0
+  && ((p2 && (size_t) length > strlen (p2)) 
+  || (p1 && (size_t) length > strlen (p1))))
+{
+  tree fn = builtin_decl_implicit (BUILT_IN_STRCMP);
+  if (!fn)
+return false;
+  gimple *repl = gimple_build_call (fn, 2, str1, str2);
+  replace_call_with_call_and_fold (gsi, repl);
+  return true;
+}
+
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.dg/strcmpopt_1.c 
b/gcc/testsuite/gcc.dg/strcmpopt_1.c
new file mode 100644
index 000..40596a2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strcmpopt_1.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-fdump-tree-gimple" } */
+
+#include <string.h>
+#include <stdlib.h>
+
+int cmp1 (char *p)
+{
+  return strncmp (p, "fis", 4);
+}
+int cmp2 (char *q)
+{
+  return strncmp ("fis", q, 4);
+}
+
+int main ()
+{
+
+  char *p = "fish";
+  char *q = "fis\0";
+
+  if (cmp1 (p) == 0 || cmp2 (q) != 0)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "strcmp \\(" 2 "gimple" } } */
-- 
1.9.1



[testsuite, committed] Compile strncpy-fix-1.c with -Wno-stringop-truncation

2017-11-15 Thread Tom de Vries
[ Re: [PATCH 3/4] enhance overflow and truncation detection in strncpy 
and strncat (PR 81117) ]


On 08/06/2017 10:07 PM, Martin Sebor wrote:

Part 3 of the series contains the meat of the patch: the new
-Wstringop-truncation option, and enhancements to -Wstringop-
overflow, and -Wpointer-sizeof-memaccess to detect misuses of
strncpy and strncat.

Martin

gcc-81117-3.diff


PR c/81117 - Improve buffer overflow checking in strncpy




gcc/testsuite/ChangeLog:

PR c/81117
* c-c++-common/Wsizeof-pointer-memaccess3.c: New test.
* c-c++-common/Wstringop-overflow.c: Same.
* c-c++-common/Wstringop-truncation.c: Same.
* c-c++-common/Wsizeof-pointer-memaccess2.c: Adjust.
* c-c++-common/attr-nonstring-2.c: New test.
* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Adjust.
* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
* gcc.dg/torture/pr63554.c: Same.
* gcc.dg/Walloca-1.c: Disable macro tracking.



Hi,

this also caused a regression in strncpy-fix-1.c. I noticed it for nvptx 
 (but I also saw it in other test results, f.i. for 
x86_64-unknown-freebsd12.0 at 
https://gcc.gnu.org/ml/gcc-testresults/2017-11/msg01276.html ).


On linux you don't see this unless you add -Wsystem-headers:
...
$ gcc src/gcc/testsuite/gcc.dg/strncpy-fix-1.c 
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -Wall 
-Wsystem-headers -S -o strncpy-fix-1.s

In file included from /usr/include/string.h:630,
 from src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:6:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c: In function ‘f’:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:10:3: warning: 
‘__builtin_strncpy’ output truncated before terminating nul copying 2 
bytes from a string of the same length [-Wstringop-truncation]

...

Fixed by adding -Wno-stringop-truncation.

Committed as obvious.

Thanks,
- Tom
Compile strncpy-fix-1.c with -Wno-stringop-truncation

2017-11-15  Tom de Vries  

	* gcc.dg/strncpy-fix-1.c: Add -Wno-stringop-truncation to dg-options.

---
 gcc/testsuite/gcc.dg/strncpy-fix-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/strncpy-fix-1.c b/gcc/testsuite/gcc.dg/strncpy-fix-1.c
index b8bc916..b4fd4aa 100644
--- a/gcc/testsuite/gcc.dg/strncpy-fix-1.c
+++ b/gcc/testsuite/gcc.dg/strncpy-fix-1.c
@@ -1,7 +1,7 @@
 /* Test that use of strncpy does not result in a "value computed is
not used" warning.  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wall" } */
+/* { dg-options "-O2 -Wall -Wno-stringop-truncation" } */
 
 #include <string.h>
 void


Re: [PATCH][ARM] Fix more -Wreturn-type fallout

2017-11-15 Thread Kyrill Tkachov

Hi Sudi,

On 10/11/17 17:06, Sudi Das wrote:


Hi

This patch fixes a couple more tests that give warnings with
-Wreturn-type:

- g++.dg/ext/pr57735.C
- gcc.target/arm/pr54300.C



Thank you for the patch.
I've committed it on your behalf with r254773.

Kyrill


*** gcc/testsuite/ChangeLog ***

2017-11-10  Sudakshina Das  

* g++.dg/ext/pr57735.C: Add -Wno-return-type for test.
* gcc.target/arm/pr54300.C (main): Add return type and
return a value.




Re: [PATCH] Fix use-after-free in the strlen pass (PR tree-optimization/82977)

2017-11-15 Thread Martin Sebor

On 11/15/2017 01:28 AM, Richard Biener wrote:
> On Tue, 14 Nov 2017, Jeff Law wrote:
>> On 11/14/2017 02:30 PM, Jakub Jelinek wrote:
>>> On Tue, Nov 14, 2017 at 02:24:28PM -0700, Martin Sebor wrote:
>>>> On 11/14/2017 02:04 PM, Jakub Jelinek wrote:
>>>>> Hi!
>>>>>
>>>>> strlen_to_stridx.get (rhs1) returns an address into the hash_map, and
>>>>> strlen_to_stridx.put (lhs, *ps); (in order to be efficient) doesn't make a
>>>>> copy of the argument just in case, first inserts the slot into it which
>>>>> may cause reallocation, and only afterwards runs the copy ctor to assign
>>>>> the value into the new slot.  So, passing it a reference to something
>>>>> in the hash_map is wrong.  Fixed thusly, bootstrapped/regtested on
>>>>> x86_64-linux and i686-linux, ok for trunk?
>>>>
>>>> This seems like an unnecessary gotcha that should be possible to
>>>> avoid in the hash_map.  The corresponding standard containers
>>>> require it to work and so it's surprising when it doesn't in GCC.
>>>>
>>>> I've been looking at how this is implemented and it seems to me
>>>> that a fix should be doable by having the hash_map check to see if
>>>> the underlying table needs to expand and if so, create a temporary
>>>> copy of the element before reallocating it.
>>>
>>> That would IMHO just slow down and enlarge the hash_map for all users,
>>> even when most of them don't really need it.
>>> While it is reasonable for STL containers to make sure it works, we
>>> aren't using STL containers and can pose additional restrictions.
>>
>> But when we make our containers behave differently than the STL it makes
>> it much easier for someone to make a mistake such as this one.
>>
>> IMHO this kind of difference in behavior is silly and long term just
>> makes our jobs harder.
>>
>> I'd vote for fixing our containers.
>
> I'd argue that this is simply a programming error and I doubt the
> libstdc++ variant works by design/specification.


It's by design.  You can find the discussion of this very issue
in C++ standard library issue 526:

  http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-closed.html#526
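
To illustrate the difference with standard containers only (this sketch does not use GCC's hash_map, and the helper names are made up): passing a container its own element is expected to work for std::vector, while the copy-first pattern is the defensive form for a container that gives no such guarantee.

```cpp
#include <string>
#include <utility>
#include <vector>

// Standard containers must cope with an argument that aliases one of
// their own elements, even when the insertion reallocates.
// Precondition: v is not empty.
std::string push_back_alias(std::vector<std::string> &v) {
    v.push_back(v.front());
    return v.back();
}

// Defensive pattern for a container without that guarantee: take the
// copy before calling anything that might reallocate.
template <class Container>
void put_copy_first(Container &c,
                    const typename Container::value_type &elem) {
    typename Container::value_type tmp(elem);  // copy while elem is valid
    c.push_back(std::move(tmp));
}
```

The second helper shows the caller-side alternative to changing the container: the temporary is made before the insertion, so a reallocation cannot invalidate the source.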

Martin


Re: [PATCH PR82726/PR70754][2/2]New fix by finding correct root reference in combined chains

2017-11-15 Thread Bin.Cheng
On Mon, Nov 13, 2017 at 1:20 PM, Richard Biener
 wrote:
> On Sat, Nov 11, 2017 at 11:19 AM, Bernhard Reutner-Fischer
>  wrote:
>> On Fri, Nov 10, 2017 at 02:14:25PM +, Bin.Cheng wrote:
>>> Hmm, the patch...
>>
>> +  /* Setup UID for all statements in dominance order.  */
>> +  basic_block *bbs = get_loop_body (loop);
>> +  for (i = 0; i < loop->num_nodes; i++)
>> +{
>> +  unsigned uid = 0;
>> +  basic_block bb = bbs[i];
>> +
>> +  for (gimple_stmt_iterator bsi = gsi_start_phis (bb); !gsi_end_p (bsi);
>> +  gsi_next (&bsi))
>> +   {
>> + gimple *stmt = gsi_stmt (bsi);
>> + if (!virtual_operand_p (gimple_phi_result (as_a <gphi *> (stmt))))
>> +   gimple_set_uid (stmt, uid);
>> +   }
>> +
>> +  for (gimple_stmt_iterator bsi = gsi_start_bb (bb); !gsi_end_p (bsi);
>> +  gsi_next (&bsi))
>> +   {
>> + gimple *stmt = gsi_stmt (bsi);
>> + if (gimple_code (stmt) != GIMPLE_LABEL && !is_gimple_debug (stmt))
>> +   gimple_set_uid (stmt, ++uid);
>> +   }
>>
>>   for (gimple_stmt_iterator bsi = gsi_start_nondebug_after_labels_bb 
>> (bb);
>>!gsi_end_p (bsi);
>>gsi_next_nondebug (&bsi))
>>  gimple_set_uid (gsi_stmt (bsi), ++uid);
>
> Or even better instead of the whole loop
>
> renumber_gimple_stmt_uids_in_blocks (bbs, loop->num_nodes);
>
> Ok with that change.
Right, here is the updated patch.  Will commit it later.

Thanks,
bin
>
> Thanks,
> Richard.
>
>> thanks,
>>
>> +}
>> +  free (bbs);
>>
From 28a21f4a86ed4e1b5a174b004c45bd4b8ede944f Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 1 Nov 2017 17:43:55 +
Subject: [PATCH 2/2] pr82726-2017.txt

---
 gcc/testsuite/gcc.dg/tree-ssa/pr82726.c |  26 ++
 gcc/tree-predcom.c  | 138 
 2 files changed, 148 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr82726.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c b/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c
new file mode 100644
index 000..22bc59d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr82726.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 --param tree-reassoc-width=4" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
+
+#define N 40
+#define M 128
+unsigned int in[N+M];
+unsigned short out[N];
+
+/* Outer-loop vectorization. */
+
+void
+foo (){
+  int i,j;
+  unsigned int diff;
+
+  for (i = 0; i < N; i++) {
+diff = 0;
+for (j = 0; j < M; j+=8) {
+  diff += in[j+i];
+}
+out[i]=(unsigned short)diff;
+  }
+
+  return;
+}
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index 24d7c9c..28dac82 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -1020,6 +1020,17 @@ order_drefs (const void *a, const void *b)
   return (*da)->pos - (*db)->pos;
 }
 
+/* Compares two drefs A and B by their position.  Callback for qsort.  */
+
+static int
+order_drefs_by_pos (const void *a, const void *b)
+{
+  const dref *const da = (const dref *) a;
+  const dref *const db = (const dref *) b;
+
+  return (*da)->pos - (*db)->pos;
+}
+
 /* Returns root of the CHAIN.  */
 
 static inline dref
@@ -2633,7 +2644,6 @@ combine_chains (chain_p ch1, chain_p ch2)
   bool swap = false;
   chain_p new_chain;
   unsigned i;
-  gimple *root_stmt;
   tree rslt_type = NULL_TREE;
 
   if (ch1 == ch2)
@@ -2675,31 +2685,55 @@ combine_chains (chain_p ch1, chain_p ch2)
   new_chain->refs.safe_push (nw);
 }
 
-  new_chain->has_max_use_after = false;
-  root_stmt = get_chain_root (new_chain)->stmt;
-  for (i = 1; new_chain->refs.iterate (i, &nw); i++)
-{
-  if (nw->distance == new_chain->length
-	  && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
-	{
-	  new_chain->has_max_use_after = true;
-	  break;
-	}
-}
-
   ch1->combined = true;
   ch2->combined = true;
   return new_chain;
 }
 
-/* Try to combine the CHAINS.  */
+/* Recursively update position information of all offspring chains to ROOT
+   chain's position information.  */
+
+static void
+update_pos_for_combined_chains (chain_p root)
+{
+  chain_p ch1 = root->ch1, ch2 = root->ch2;
+  dref ref, ref1, ref2;
+  for (unsigned j = 0; (root->refs.iterate (j, &ref)
+			&& ch1->refs.iterate (j, &ref1)
+			&& ch2->refs.iterate (j, &ref2)); ++j)
+ref1->pos = ref2->pos = ref->pos;
+
+  if (ch1->type == CT_COMBINATION)
+update_pos_for_combined_chains (ch1);
+  if (ch2->type == CT_COMBINATION)
+update_pos_for_combined_chains (ch2);
+}
+
+/* Returns true if statement S1 dominates statement S2.  */
+
+static bool
+pcom_stmt_dominates_stmt_p (gimple *s1, gimple *s2)
+{
+  basic_block bb1 = gimple_bb (s1), bb2 = gimple_bb (s2);
+
+  if (!bb1 || s1 == s2)
+return true;
+
+  if (bb1 == bb2)
+return gimple_uid (s1) < gimple_uid (s2);
+
+  return dominated_by_p (CDI_DOMINATORS, bb2, bb1);
+}
+
+/* Try to combine the CHAINS in LOOP.  */
 
 static void
-try_combine_chai

Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-11-15 Thread Christophe Lyon
Hi Kyrill,


On 8 November 2017 at 19:34, Kyrill  Tkachov
 wrote:
>
> On 06/06/17 14:17, James Greenhalgh wrote:
>>
>> On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:
>>>
>>> Hi all,
>>>
>>> On top of the previous vec_merge simplifications [1] we can add this
>>> pattern to perform
>>> a store of a vec_concat of two 64-bit values in distinct registers as an
>>> STP.
>>> This avoids constructing such a vector explicitly in a register and
>>> storing it as
>>> a Q register.
>>> This way for the code in the testcase we can generate:
>>>
>>> construct_lane_1:
>>>  ldp d1, d0, [x0]
>>>  fmov d3, 1.0e+0
>>>  fmov d2, 2.0e+0
>>>  fadd d4, d1, d3
>>>  fadd d5, d0, d2
>>>  stp d4, d5, [x1, 32]
>>>  ret
>>>
>>> construct_lane_2:
>>>  ldp x2, x0, [x0]
>>>  add x3, x2, 1
>>>  add x4, x0, 2
>>>  stp x3, x4, [x1, 32]
>>>  ret
>>>
>>> instead of the current:
>>> construct_lane_1:
>>>  ldp d0, d1, [x0]
>>>  fmov d3, 1.0e+0
>>>  fmov d2, 2.0e+0
>>>  fadd d0, d0, d3
>>>  fadd d1, d1, d2
>>>  dup v0.2d, v0.d[0]
>>>  ins v0.d[1], v1.d[0]
>>>  str q0, [x1, 32]
>>>  ret
>>>
>>> construct_lane_2:
>>>  ldp x2, x3, [x0]
>>>  add x0, x2, 1
>>>  add x2, x3, 2
>>>  dup v0.2d, x0
>>>  ins v0.d[1], x2
>>>  str q0, [x1, 32]
>>>  ret
>>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>> Ok for GCC 8?
>>
>> OK.
>
>
> Thanks, I've committed this and the other patches in this series after
> rebasing and rebootstrapping and testing on aarch64-none-linux-gnu.
> The only conflict from updating the patch was that I had to use the store_16
> attribute rather than
> the old store2 for the new define_insn. This is what I've committed with
> r254551.
>
> Sorry for the delay in committing.
>

I've noticed that the new tests fail when testing with -mabi=ilp32:
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-not ins\t
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\td[0-9]+, d[0-9]+ 1 (found 0 times)
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\tx[0-9]+, x[0-9]+ 1 (found 0 times)

Sorry if this has been reported before.

Christophe

> Kyrill
>
>
>> Thanks,
>> James
>>
>>> 2017-06-06  Kyrylo Tkachov  
>>>
>>>  * config/aarch64/aarch64-simd.md (store_pair_lanes):
>>>  New pattern.
>>>  * config/aarch64/constraints.md (Uml): New constraint.
>>>  * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New
>>>  predicate.
>>>
>>> 2017-06-06  Kyrylo Tkachov  
>>>
>>>  * gcc.target/aarch64/store_v2vec_lanes.c: New test.
>>
>>
>


Re: [PATCH 02/14] Support for adding and stripping location_t wrapper nodes

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 12:11 +0100, Richard Biener wrote:
> On Wed, Nov 15, 2017 at 7:17 AM, Trevor Saunders wrote:
> > On Fri, Nov 10, 2017 at 04:45:17PM -0500, David Malcolm wrote:
> > > This patch provides a mechanism in tree.c for adding a wrapper
> > > node
> > > for expressing a location_t, for those nodes for which
> > > !CAN_HAVE_LOCATION_P, along with a new method of cp_expr.
> > > 
> > > It's called in later patches in the kit via that new method.
> > > 
> > > In this version of the patch, I use NON_LVALUE_EXPR for wrapping
> > > constants, and VIEW_CONVERT_EXPR for other nodes.
> > > 
> > > I also turned off wrapper nodes for EXCEPTIONAL_CLASS_P, for the
> > > sake
> > > of keeping the patch kit more minimal.
> > > 
> > > The patch also adds a STRIP_ANY_LOCATION_WRAPPER macro for
> > > stripping
> > > such nodes, used later on in the patch kit.
> > 
> > I happened to start reading this series near the end and was rather
> > confused by this macro since it changes variables in a rather
> > unhygienic
> > way.  Did you consider just defining an inline function to return
> > the
> > actual decl?  It seems like it's not used that often so the slight
> > extra
> > syntax shouldn't be that big a deal compared to the explicitness.
> 
> Existing practice  (STRIP_NOPS & friends).  I'm fine either way,
> the patch looks good.
> 
> Eventually you can simplify things by doing less checking in
> location_wrapper_p, like only checking
> 
> +inline bool location_wrapper_p (const_tree exp)
> +{
> +  if ((TREE_CODE (exp) == NON_LVALUE_EXPR
> +   || (TREE_CODE (exp) == VIEW_CONVERT_EXPR
> +  && (TREE_TYPE (exp)
> + == TREE_TYPE (TREE_OPERAND (exp, 0))))))
> +return true;
> +  return false;
> +}
> 
> and renaming to maybe_location_wrapper_p.  After all you can't really
> distinguish location wrappers from non-location wrappers?  (and why
> would you want to?)

That's the implementation I originally tried.

As noted in an earlier thread about this, the problem I ran into was
(in g++.dg/conversion/reinterpret1.C):

  // PR c++/15076

  struct Y { Y(int &); };

  int v;
  Y y1(reinterpret_cast<int>(v));  // { dg-error "" }

where the "reinterpret_cast<int>" has the same type as the VAR_DECL v,
and hence the argument to y1 is a NON_LVALUE_EXPR around a VAR_DECL,
where both have the same type, and hence location_wrapper_p () on the
cast would return true.

Compare with:

  Y y1(v);

where the argument "v" with a location wrapper is a VIEW_CONVERT_EXPR
around a VAR_DECL.

With the simpler conditions you suggested above, both are treated as
location wrappers (leading to the dg-error in the test failing),
whereas with the condition in the patch, only the latter is treated as
a location wrapper, and an error is correctly emitted for the dg-error.
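
To make the difference concrete, here is a minimal toy model in C. It is only a sketch: the `toy_*` names, the integer "type" field and the exact shape of the stricter predicate are assumptions made for illustration, based on the description above (NON_LVALUE_EXPR for wrapping constants, VIEW_CONVERT_EXPR for other nodes). It is not the code from the patch and does not use GCC's tree API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for tree codes and nodes; hypothetical, not GCC's API.  */
enum toy_code
{
  TOY_VAR_DECL,
  TOY_INTEGER_CST,
  TOY_NON_LVALUE_EXPR,
  TOY_VIEW_CONVERT_EXPR
};

struct toy_node
{
  enum toy_code code;
  int type;                     /* Type identity, compared by value.  */
  const struct toy_node *op0;   /* Wrapped operand, or NULL.  */
};

/* True for a NON_LVALUE_EXPR or VIEW_CONVERT_EXPR whose type equals
   the type of its operand.  */
static bool
toy_same_type_wrapper_p (const struct toy_node *exp)
{
  assert (exp != NULL);
  return ((exp->code == TOY_NON_LVALUE_EXPR
           || exp->code == TOY_VIEW_CONVERT_EXPR)
          && exp->op0 != NULL
          && exp->type == exp->op0->type);
}

/* The simplified check: any same-type wrapper counts.  */
bool
toy_maybe_location_wrapper_p (const struct toy_node *exp)
{
  return toy_same_type_wrapper_p (exp);
}

/* Assumed stricter check: NON_LVALUE_EXPR may only wrap a constant,
   VIEW_CONVERT_EXPR may only wrap a non-constant.  */
bool
toy_location_wrapper_p (const struct toy_node *exp)
{
  if (!toy_same_type_wrapper_p (exp))
    return false;
  bool wraps_constant = exp->op0->code == TOY_INTEGER_CST;
  return exp->code == TOY_NON_LVALUE_EXPR ? wraps_constant : !wraps_constant;
}
```

In this model a NON_LVALUE_EXPR around a same-typed VAR_DECL (the cast case) passes the simplified check but fails the stricter one, while a VIEW_CONVERT_EXPR around the same VAR_DECL passes both.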

Hope this sounds sane.  Maybe the function needs a more detailed
comment explaining this?

Thanks
Dave


> Thanks,
> Richard.
> 
> > Other than that the series seems reasonable, and I look forward to
> > having wrappers in more places.  I seem to remember something I
> > wanted
> > to warn about they would make much easier.
> > 
> > Thanks
> > 
> > Trev
> > 


Re: [PATCH] Simplify floating point comparisons

2017-11-15 Thread Wilco Dijkstra
Richard Biener wrote:
> On Tue, Oct 17, 2017 at 6:28 PM, Wilco Dijkstra  
> wrote:

>> +(if (flag_unsafe_math_optimizations)
>> +  /* Simplify (C / x op 0.0) to x op 0.0 for C > 0.  */
>> +  (for op (lt le gt ge)
>> +   neg_op (gt ge lt le)
>> +    (simplify
>> +  (op (rdiv REAL_CST@0 @1) real_zerop@2)
>> +  (switch
>> +   (if (real_less (&dconst0, TREE_REAL_CST_PTR (@0)))
>
> Note that real_less (0., +Inf) so I think you either need to check C is
> 'normal' or ! HONOR_INFINITIES.

Yes, it was missing an explicit check for infinity, now added.

> There's also the underflow issue I guess this is what
> -funsafe-math-optimizations is for.  I think ignoring underflows is
> dangerous though.

We could change C / x > 0 to x >= 0 so the underflow case is included.
However that still means x == 0.0 would behave differently - so the question is
what exactly does -funsafe-math-optimizations allow?
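
For reference, a small C sketch of the corner cases in question. The function names are made up for illustration, and it assumes IEEE single precision with default rounding and no fast-math flags:

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Original form: C / x > 0 for a positive constant C, rounded to float.  */
bool
div_gt_zero (float c, float x)
{
  assert (c > 0.0f);
  volatile float q = c / x;     /* Store forces rounding to float.  */
  return q > 0.0f;
}

/* Candidate replacements that no longer mention C.  */
bool
x_gt_zero (float x)
{
  return x > 0.0f;
}

bool
x_ge_zero (float x)
{
  return x >= 0.0f;
}
```

With these definitions, `div_gt_zero (1e-30f, FLT_MAX)` is false because the quotient underflows to zero, while `x_gt_zero (FLT_MAX)` is true. At x == 0.0 the division gives +Inf, so `div_gt_zero (5.0f, 0.0f)` is true but `x_gt_zero (0.0f)` is false. At x == -0.0 the division gives -Inf, so `div_gt_zero` is false while `x_ge_zero (-0.0f)` is true.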


>> + (for cmp (lt le gt ge)
>> +  neg_cmp (gt ge lt le)
>> +  /* Simplify (x * C1) cmp C2 -> x cmp (C2 / C1), where C1 != 0.  */
>> +  (simplify
>> +   (cmp (mult @0 REAL_CST@1) REAL_CST@2)
>> +   (with
>> +    { tree tem = const_binop (RDIV_EXPR, type, @2, @1); }
>> +    (if (tem)
>> + (switch
>> +  (if (real_less (&dconst0, TREE_REAL_CST_PTR (@1)))
>> +   (cmp @0 { tem; }))
>> +  (if (real_less (TREE_REAL_CST_PTR (@1), &dconst0))
>> +   (neg_cmp @0 { tem; })))
>
>
> Drops possible overflow/underflow in x * C1 and may create underflow
> or overflow with C2/C1 which you should detect here at least.

I've added checks for this, however I thought -funsafe-math-optimizations is
allowed to insert/remove underflow/overflow, like in these cases:

(x * 1e20f) * 1e20f and (x * 1e40f) * 1e-30f.
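
As a rough illustration of the kind of guard involved, here is a standalone C sketch of folding (x * C1) cmp C2 into x cmp (C2 / C1). The struct and function names are hypothetical, and the rejection rule (quotient is infinite, or is zero while C2 is not) is my own simplification for this sketch rather than the exact condition in the patch:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

struct folded_cmp
{
  bool ok;      /* False if the fold is rejected.  */
  bool flip;    /* True if the comparison must be inverted (C1 < 0).  */
  float bound;  /* C2 / C1 when ok.  */
};

/* Decide whether (x * c1) cmp c2 may become x cmp (c2 / c1).  */
struct folded_cmp
fold_mul_cmp (float c1, float c2)
{
  struct folded_cmp r = { false, false, 0.0f };

  if (isnan (c1) || isnan (c2) || isinf (c1) || c1 == 0.0f)
    return r;

  volatile float vq = c2 / c1;  /* Store forces rounding to float.  */
  float q = vq;

  /* Reject a quotient that overflowed to infinity or underflowed to 0.  */
  if (isinf (q) || (q == 0.0f && c2 != 0.0f))
    return r;

  r.ok = true;
  r.flip = c1 < 0.0f;
  r.bound = q;
  return r;
}
```

For example, `fold_mul_cmp (-5.0f, 100.0f)` accepts the fold with bound -20 and an inverted comparison, while `fold_mul_cmp (1e-30f, 1e20f)` (overflow) and `fold_mul_cmp (1e30f, 1e-30f)` (underflow) are both rejected.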

> Existing overflows may be guarded against with a HONOR_INFINITIES check.

Not sure what you mean with this?

> When overflow/underflow can be disregarded is there any reason remaining to
> make this guarded by flag_unsafe_math_optimizations?  Are there any cases
> where rounding issues can flip the comparison result?

I think it needs to remain under -funsafe-math-optimizations. Here is the
updated version:


ChangeLog
2017-11-15  Wilco Dijkstra
Jackson Woodruff  

gcc/
PR 71026/tree-optimization
* match.pd: Simplify floating point comparisons.

gcc/testsuite/
PR 71026/tree-optimization
* gcc.dg/associate_comparison_1.c: New.
--

diff --git a/gcc/match.pd b/gcc/match.pd
index 4d56847d6889923938625beb579b7bbb0cbbad91..967dbf8946fd12a161330f4c8b58dada5d9cb871 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -359,6 +359,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (rdiv @0 (negate @1))
  (rdiv (negate @0) @1))
 
+(if (flag_unsafe_math_optimizations)
+  /* Simplify (C / x op 0.0) to x op 0.0 for C > 0.  */
+  (for op (lt le gt ge)
+   neg_op (gt ge lt le)
+(simplify
+  (op (rdiv REAL_CST@0 @1) real_zerop@2)
+  (if (!REAL_VALUE_ISINF (TREE_REAL_CST (@0)))
+   (switch
+   (if (real_less (&dconst0, TREE_REAL_CST_PTR (@0)))
+(op @1 @2))
+   /* For C < 0, use the inverted operator.  */
+   (if (real_less (TREE_REAL_CST_PTR (@0), &dconst0))
+(neg_op @1 @2)))))))
+
 /* Optimize (X & (-A)) / A where A is a power of 2, to X >> log2(A) */
 (for div (trunc_div ceil_div floor_div round_div exact_div)
  (simplify
@@ -3703,6 +3717,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(rdiv @2 @1))
(rdiv (op @0 @2) @1)))
 
+ (for cmp (lt le gt ge)
+  neg_cmp (gt ge lt le)
+  /* Simplify (x * C1) cmp C2 -> x cmp (C2 / C1), where C1 != 0.  */
+  (simplify
+   (cmp (mult @0 REAL_CST@1) REAL_CST@2)
+   (with
+{ tree tem = const_binop (RDIV_EXPR, type, @2, @1); }
+(if (tem
+&& !(REAL_VALUE_ISINF (TREE_REAL_CST (tem))
+ || (real_zerop (tem) && !real_zerop (@1))))
+ (switch
+  (if (real_less (&dconst0, TREE_REAL_CST_PTR (@1)))
+   (cmp @0 { tem; }))
+  (if (real_less (TREE_REAL_CST_PTR (@1), &dconst0))
+   (neg_cmp @0 { tem; })))))))
+
  /* Simplify sqrt(x) * sqrt(y) -> sqrt(x*y).  */
  (for root (SQRT CBRT)
   (simplify
diff --git a/gcc/testsuite/gcc.dg/associate_comparison_1.c b/gcc/testsuite/gcc.dg/associate_comparison_1.c
new file mode 100644
index ..ceaba334cce770eb1cbec9283ba8a0c64f725630
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/associate_comparison_1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -funsafe-math-optimizations -fdump-tree-optimized-raw" } */
+
+int
+cmp_mul_1 (float x)
+{
+  return x * 3 <= 100;
+}
+
+int
+cmp_mul_2 (float x)
+{
+  return x * -5 > 100;
+}
+
+int
+div_cmp_1 (float x, float y)
+{
+  return x / 3 <= y;
+}
+
+int
+div_cmp_2 (float x, float y)
+{
+  return x / 3 <= 1;
+}
+
+int
+inv_cmp (float x)
+{
+  return 5 / x >= 0;
+}
+
+/* { dg-final { scan-tree-dump-times "mult_expr" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "rdiv_expr" "optimized" } } */



[PATCH] make canonicalize_condition keep its promise

2017-11-15 Thread Aaron Sawdey
So, the story of this very small patch starts with me adding patterns
for ppc instructions bdz[tf] and bdnz[tf] such as this:

  [(set (pc)
(if_then_else
  (and
 (ne (match_operand:P 1 "register_operand" "c,*b,*b,*b")
 (const_int 1))
 (match_operator 3 "branch_comparison_operator"
  [(match_operand 4 "cc_reg_operand" "y,y,y,y")
   (const_int 0)]))
  (label_ref (match_operand 0))
  (pc)))
   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*wi*c*l")
(plus:P (match_dup 1)
(const_int -1)))
   (clobber (match_scratch:P 5 "=X,X,&r,r"))
   (clobber (match_scratch:CC 6 "=X,&y,&y,&y"))
   (clobber (match_scratch:CCEQ 7 "=X,&y,&y,&y"))]

However when this gets to the loop_doloop pass, we get an assert fail
in iv_number_of_iterations():

  gcc_assert (COMPARISON_P (condition));

This is happening because this branch insn tests two things ANDed
together so the and is at the top of the expression, not a comparison.

This condition is extracted from the insn by get_condition() which is
pretty straightforward, and which calls canonicalize_condition() before
returning it. Now, one could put a test for a jump condition that is
not a conditional test in here but the comment for
canonicalize_condition() says:

   (1) The code will always be a comparison operation (EQ, NE, GT, etc.).

So, this patch adds a test at the end that just returns 0 if the return
rtx is not a comparison. As it happens, doloop conversion is not needed
here because I'm already generating rtl for a branch-decrement counter
based loop.
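
To show the shape of the check outside of GCC, here is a tiny self-contained C model. The `toy_*` types and functions are invented for this sketch and only mimic the idea of COMPARISON_P; they are not the real rtl API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy rtx codes: a few comparisons plus two non-comparison codes.  */
enum toy_rtx_code { TOY_EQ, TOY_NE, TOY_GT, TOY_LT, TOY_AND, TOY_REG };

struct toy_rtx
{
  enum toy_rtx_code code;
  const struct toy_rtx *op0;
  const struct toy_rtx *op1;
};

/* Toy analogue of COMPARISON_P: is the top-level code a comparison?  */
bool
toy_comparison_p (const struct toy_rtx *x)
{
  switch (x->code)
    {
    case TOY_EQ:
    case TOY_NE:
    case TOY_GT:
    case TOY_LT:
      return true;
    default:
      return false;
    }
}

/* Return COND only if it is a comparison at the top level, else NULL,
   so that callers can rely on the documented promise.  */
const struct toy_rtx *
toy_canonicalize_condition (const struct toy_rtx *cond)
{
  assert (cond != NULL);
  return toy_comparison_p (cond) ? cond : NULL;
}
```

In this model an `and` of two tests is rejected with NULL, which is how a caller that asserts on a comparison would be protected.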

If there is a better way to go about this please let me know and I'll
revise/retest.

Bootstrap and regtest pass on ppc64le and x86_64. Ok for trunk?

Thanks,
Aaron


2017-11-15  Aaron Sawdey  

* rtlanal.c (canonicalize_condition): Return 0 if final rtx
does not have a conditional at the top.

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   (revision 254553)
+++ gcc/rtlanal.c   (working copy)
@@ -5623,7 +5623,11 @@
   if (CC0_P (op0))
 return 0;
 
-  return gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
+  /* We promised to return a comparison.  */
+  rtx ret = gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
+  if (COMPARISON_P (ret))
+return ret;
+  return 0;
 }
 
 /* Given a jump insn JUMP, return the condition that will cause it to branch


Re: [testsuite, committed] Compile strncpy-fix-1.c with -Wno-stringop-truncation

2017-11-15 Thread Martin Sebor

On 11/15/2017 08:12 AM, Tom de Vries wrote:

[ Re: [PATCH 3/4] enhance overflow and truncation detection in strncpy
and strncat (PR 81117) ]

On 08/06/2017 10:07 PM, Martin Sebor wrote:

Part 3 of the series contains the meat of the patch: the new
-Wstringop-truncation option, and enhancements to -Wstringop-
overflow, and -Wpointer-sizeof-memaccess to detect misuses of
strncpy and strncat.

Martin

gcc-81117-3.diff


PR c/81117 - Improve buffer overflow checking in strncpy




gcc/testsuite/ChangeLog:

PR c/81117
* c-c++-common/Wsizeof-pointer-memaccess3.c: New test.
* c-c++-common/Wstringop-overflow.c: Same.
* c-c++-common/Wstringop-truncation.c: Same.
* c-c++-common/Wsizeof-pointer-memaccess2.c: Adjust.
* c-c++-common/attr-nonstring-2.c: New test.
* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Adjust.
* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
* gcc.dg/torture/pr63554.c: Same.
* gcc.dg/Walloca-1.c: Disable macro tracking.



Hi,

this also caused a regression in strncpy-fix-1.c. I noticed it for nvptx
 (but I also saw it in other test results, f.i. for
x86_64-unknown-freebsd12.0 at
https://gcc.gnu.org/ml/gcc-testresults/2017-11/msg01276.html ).

On linux you don't see this unless you add -Wsystem-headers:


Yes, some Glibc versions (I think 2.24 and prior) define strncpy
as a macro.  The macro has been removed from newer versions, which
makes the warning show up inconsistently.  I test on Fedora 25 with
the older Glibc so I don't see all these warnings.

I'm tracking the problem in bug 82944.
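
For context, the warning quoted below fires when the bound passed to strncpy equals the length of the source string, so the terminating nul is not copied. A minimal sketch of a copy that always terminates its destination follows; `copy_terminated` is a hypothetical helper written for this illustration, not something from the testsuite or Glibc:

```c
#include <assert.h>
#include <string.h>

/* Copy SRC into DST, which has room for DSTSIZE > 0 bytes.  Truncates
   if needed and always writes a terminating nul.  */
char *
copy_terminated (char *dst, size_t dstsize, const char *src)
{
  assert (dst != NULL && src != NULL && dstsize > 0);

  size_t n = strlen (src);
  if (n > dstsize - 1)
    n = dstsize - 1;

  memcpy (dst, src, n);
  dst[n] = '\0';
  return dst;
}
```

With a 3-byte buffer, copying "abcdef" this way yields the string "ab", whereas a strncpy of 2 bytes into a 2-byte buffer would leave it unterminated.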


...
$ gcc src/gcc/testsuite/gcc.dg/strncpy-fix-1.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -Wall
-Wsystem-headers -S -o strncpy-fix-1.s
In file included from /usr/include/string.h:630,
 from src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:6:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c: In function ‘f’:
src/gcc/testsuite/gcc.dg/strncpy-fix-1.c:10:3: warning:
‘__builtin_strncpy’ output truncated before terminating nul copying 2
bytes from a string of the same length [-Wstringop-truncation]
...

Fixed by adding -Wno-stringop-truncation.

Committed as obvious.


Thanks
Martin


RE: [PATCH][GCC][mid-end] Allow larger copies when target supports unaligned access [Patch (1/2)]

2017-11-15 Thread tamar . christina
> -Original Message-
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Wednesday, November 15, 2017 12:50
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> i...@airs.com
> Subject: RE: [PATCH][GCC][mid-end] Allow larger copies when target
> supports unaligned access [Patch (1/2)]
> 
> On Wed, 15 Nov 2017, Tamar Christina wrote:
> 
> >
> >
> > > -Original Message-
> > > From: Richard Biener [mailto:rguent...@suse.de]
> > > Sent: Wednesday, November 15, 2017 08:24
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> > > i...@airs.com
> > > Subject: Re: [PATCH][GCC][mid-end] Allow larger copies when target
> > > supports unaligned access [Patch (1/2)]
> > >
> > > On Tue, 14 Nov 2017, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This patch allows larger bitsizes to be used as copy size when the
> > > > target does not have SLOW_UNALIGNED_ACCESS.
> > > >
> > > > fun3:
> > > > adrp x2, .LANCHOR0
> > > > add x2, x2, :lo12:.LANCHOR0
> > > > mov x0, 0
> > > > sub sp, sp, #16
> > > > ldrh w1, [x2, 16]
> > > > ldrb w2, [x2, 18]
> > > > add sp, sp, 16
> > > > bfi x0, x1, 0, 8
> > > > ubfx x1, x1, 8, 8
> > > > bfi x0, x1, 8, 8
> > > > bfi x0, x2, 16, 8
> > > > ret
> > > >
> > > > is turned into
> > > >
> > > > fun3:
> > > > adrp x0, .LANCHOR0
> > > > add x0, x0, :lo12:.LANCHOR0
> > > > sub sp, sp, #16
> > > > ldrh w1, [x0, 16]
> > > > ldrb w0, [x0, 18]
> > > > strh w1, [sp, 8]
> > > > strb w0, [sp, 10]
> > > > ldr w0, [sp, 8]
> > > > add sp, sp, 16
> > > > ret
> > > >
> > > > which avoids the bfi's for a simple 3 byte struct copy.
> > > >
> > > > Regression tested on aarch64-none-linux-gnu and
> > > > x86_64-pc-linux-gnu and no regressions.
> > > >
> > > > This patch is just splitting off from the previous combined patch
> > > > with AArch64 and adding a testcase.
> > > >
> > > > I assume Jeff's ACK from
> > > > https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01523.html is still
> > > > valid as the code did not change.
> > >
> > > Given your no_slow_unalign isn't mode specific can't you use the
> > > existing non_strict_align?
> >
> > No because non_strict_align checks if the target supports unaligned
> > access at all,
> >
> > This no_slow_unalign corresponds instead to the target
> > slow_unaligned_access which checks that the access you want to make
> > has a greater cost than doing an aligned access. ARM for instance
> > always returns 1 (value of STRICT_ALIGNMENT) for slow_unaligned_access
> > while for non_strict_align it may return 0 or 1 based on the options
> > provided to the compiler.
> >
> > The problem is I have no way to test STRICT_ALIGNMENT or
> > slow_unaligned_access So I had to hardcode some targets that I know it
> > does work on.
> 
> I see.  But then the slow_unaligned_access implementation should use
> non_strict_align as default somehow as SLOW_UNALIGNED_ACCESS is
> defaulted to STRICT_ALIGN.
> 
> Given that SLOW_UNALIGNED_ACCESS has different values for different
> modes it would also make sense to be more specific for the testcase in
> question, like word_mode_slow_unaligned_access to tell this only applies to
> word_mode?

Ah, that's fair enough. I've updated the patch and the new changelog is:


gcc/
2017-11-15  Tamar Christina  

* expr.c (copy_blkmode_to_reg): Fix bitsize for targets
with fast unaligned access.
* doc/sourcebuild.texi (word_mode_no_slow_unalign): New.

gcc/testsuite/
2017-11-15  Tamar Christina  

* gcc.dg/struct-simple.c: New.
* lib/target-supports.exp
(check_effective_target_word_mode_no_slow_unalign): New.
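
For readers without the testcase at hand, the kind of source that leads to a 3-byte struct copy like fun3 in the quoted description looks roughly like the sketch below. This is a hypothetical reconstruction for illustration only; the struct layout, the global and its values are assumptions, and it is not the contents of gcc.dg/struct-simple.c.

```c
#include <assert.h>

/* A struct of three single-byte members: 3 bytes, alignment 1.  */
struct three_bytes
{
  unsigned char a;
  unsigned char b;
  unsigned char c;
};

static_assert (sizeof (struct three_bytes) == 3, "expected a 3-byte struct");

static struct three_bytes global3 = { 1, 2, 3 };

/* Returning the struct by value makes the compiler copy 3 bytes from
   memory into the return register.  */
struct three_bytes
fun3 (void)
{
  return global3;
}
```

The interesting part is only how the by-value return is expanded, so the function body is deliberately trivial.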

Ok for trunk?

Thanks,
Tamar

> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Tamar
> > >
> > > Otherwise the expr.c change looks ok.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > >
> > > > gcc/
> > > > 2017-11-14  Tamar Christina  
> > > >
> > > > * expr.c (copy_blkmode_to_reg): Fix bitsize for targets
> > > > with fast unaligned access.
> > > > * doc/sourcebuild.texi (no_slow_unalign): New.
> > > >
> > > > gcc/testsuite/
> > > > 2017-11-14  Tamar Christina  
> > > >
> > > > * gcc.dg/struct-simple.c: New.
> > > > * lib/target-supports.exp
> > > > (check_effective_target_no_slow_unalign): New.
> > > >
> > > >
> > >
> > > --
> > > Richard Biener 
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > Norton, HRB 21284 (AG Nuernberg)
> >
> >
> 
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 1646d0a99911aa7b2e66762e5

Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-11-15 Thread Kyrill Tkachov

Hi Christophe,

On 15/11/17 15:31, Christophe Lyon wrote:

Hi Kyrill,


On 8 November 2017 at 19:34, Kyrill  Tkachov
 wrote:

On 06/06/17 14:17, James Greenhalgh wrote:

On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:

Hi all,

On top of the previous vec_merge simplifications [1] we can add this
pattern to perform
a store of a vec_concat of two 64-bit values in distinct registers as an
STP.
This avoids constructing such a vector explicitly in a register and
storing it as
a Q register.
This way for the code in the testcase we can generate:

construct_lane_1:
  ldp d1, d0, [x0]
  fmov d3, 1.0e+0
  fmov d2, 2.0e+0
  fadd d4, d1, d3
  fadd d5, d0, d2
  stp d4, d5, [x1, 32]
  ret

construct_lane_2:
  ldp x2, x0, [x0]
  add x3, x2, 1
  add x4, x0, 2
  stp x3, x4, [x1, 32]
  ret

instead of the current:
construct_lane_1:
  ldp d0, d1, [x0]
  fmov d3, 1.0e+0
  fmov d2, 2.0e+0
  fadd d0, d0, d3
  fadd d1, d1, d2
  dup v0.2d, v0.d[0]
  ins v0.d[1], v1.d[0]
  str q0, [x1, 32]
  ret

construct_lane_2:
  ldp x2, x3, [x0]
  add x0, x2, 1
  add x2, x3, 2
  dup v0.2d, x0
  ins v0.d[1], x2
  str q0, [x1, 32]
  ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

OK.


Thanks, I've committed this and the other patches in this series after
rebasing and rebootstrapping and testing on aarch64-none-linux-gnu.
The only conflict from updating the patch was that I had to use the store_16
attribute rather than
the old store2 for the new define_insn. This is what I've committed with
r254551.

Sorry for the delay in committing.


I've noticed that the new tests fail when testing with -mabi=ilp32:
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-not ins\t
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\td[0-9]+, d[0-9]+ 1 (found 0 times)
FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
stp\tx[0-9]+, x[0-9]+ 1 (found 0 times)

Sorry if this has been reported before.


Thank you for reporting this, I was not aware of it.
My patch does indeed fail to generate the optimised sequence for
-mabi=ilp32.

During combine it fails to match:
Failed to match this instruction:
(set (mem:V2DF (plus:DI (reg/v/f:DI 79 [ z ])
(const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
(vec_concat:V2DF (reg:DF 81 [ y0 ])
(reg:DF 84 [ y1 ])))


but without the -mabi=ilp32 it does successfully match the equivalent

(set (mem:V2DF (plus:DI (reg:DI 1 x1 [ z ])
(const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
(vec_concat:V2DF (reg:DF 81 [ y0 ])
(reg:DF 84 [ y1 ])))

The only difference is the index register being the hard reg x1.
There's probably some subtlety in aarch64_classify_address that I'll
need to dig into.

In any case, can you please open a bug report for this so we can track it?
To be clear, the failure is just suboptimal codegen for the -mabi=ilp32
case, not a wrong-code or ICE (though it should still be fixed, of course).

Thanks again,
Kyrill


Christophe


Kyrill



Thanks,
James


2017-06-06  Kyrylo Tkachov  

  * config/aarch64/aarch64-simd.md (store_pair_lanes):
  New pattern.
  * config/aarch64/constraints.md (Uml): New constraint.
  * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New
  predicate.

2017-06-06  Kyrylo Tkachov  

  * gcc.target/aarch64/store_v2vec_lanes.c: New test.






RE: [PATCH][GCC][ARM] Implement "arch" GCC pragma and "+" attributes [Patch (2/3)]

2017-11-15 Thread Tamar Christina


> -Original Message-
> From: Kyrill Tkachov [mailto:kyrylo.tkac...@foss.arm.com]
> Sent: Wednesday, November 15, 2017 10:11
> To: Tamar Christina ; Sandra Loosemore
> ; gcc-patches@gcc.gnu.org
> Cc: nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com
> Subject: Re: [PATCH][GCC][ARM] Implement "arch" GCC pragma and
> "+" attributes [Patch (2/3)]
> 
> Hi Tamar,
> 
> On 10/11/17 10:56, Tamar Christina wrote:
> > Hi Sandra,
> >
> > I've respun the patch with the docs changes you requested.
> >
> > Regards,
> > Tamar
> >
> > > -Original Message-
> > > From: Sandra Loosemore [mailto:san...@codesourcery.com]
> > > Sent: 07 November 2017 03:38
> > > To: Tamar Christina; gcc-patches@gcc.gnu.org
> > > Cc: nd; Ramana Radhakrishnan; Richard Earnshaw; ni...@redhat.com;
> > > Kyrylo Tkachov
> > > Subject: Re: [PATCH][GCC][ARM] Implement "arch" GCC pragma and
> > > "+" attributes [Patch (2/3)]
> > >
> > > On 11/06/2017 09:50 AM, Tamar Christina wrote:
> > > > Hi All,
> > > >
> > > > This patch adds support for setting the architecture and
> > > > extensions using the target GCC pragma.
> > > >
> > > > #pragma GCC target ("arch=armv8-a+crc")
> > > >
> > > > It also supports a short hand where an extension is just added to
> > > > the current architecture without changing it
> > > >
> > > > #pragma GCC target ("+crc")
> > > >
> > > > Popping and pushing options also correctly reconfigure the global
> > > > state as expected.
> > > >
> > > > Also supported is using the __attribute__((target("...")))
> > > > attributes on functions to change the architecture or extension.
> > > >
> > > > Regtested on arm-none-eabi and no regressions.
> 
> This will need a bootstrap and test run on arm-none-linux-gnueabihf (like all
> arm changes).
> Your changelog at
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00387.html mentions some
> arm-c.c changes but I don't see any included in this patch?
> 
> The other changes look good and in line with what I would expect, but can
> you please post the arm-c.c changes if there are any?

Hi Kyrill,

Sorry, I moved this change into another patch that was already committed and
forgot to remove it from this changelog.

I've also already bootstrapped this and the 3rd patch when you asked for the
bootstrap of the first one, but I didn't mail the updated status to the list.

The correct changelog is 

gcc/
2017-11-15  Tamar Christina  

PR target/82641
* config/arm/arm.c (arm_valid_target_attribute_rec):
Parse "arch=" and "+".
(arm_valid_target_attribute_tree): Re-init global options.
(arm_option_override): Make non-static.
(arm_options_perform_arch_sanity_checks): Make errors fatal.
* config/arm/arm_acle.h (__ARM_FEATURE_CRC32): Replace with pragma.
* doc/extend.texi (ARM Function Attributes): Add pragma and target.

gcc/testsuite/
2017-11-15  Tamar Christina  

PR target/82641
* gcc.target/arm/pragma_arch_attribute.c: New.
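
As a usage illustration, the function-attribute form could look like the sketch below. It is not taken from the new test: the `WITH_CRC` macro and `add_five` are made-up names, and the guard only exists so that the snippet still builds with compilers that are not GCC targeting ARM, where the attribute string would not be accepted.

```c
#include <assert.h>

/* Only apply the attribute when building with GCC for 32-bit ARM.
   On other targets the macro expands to nothing.  */
#if defined(__arm__) && defined(__GNUC__) && !defined(__clang__)
# define WITH_CRC __attribute__ ((target ("arch=armv8-a+crc")))
#else
# define WITH_CRC
#endif

/* Compiled for armv8-a with the CRC extension, regardless of the
   -march= given on the command line (assuming a compiler that has
   this patch).  */
WITH_CRC int
add_five (int a)
{
  return a + 5;
}
```

The pragma form from the patch description, `#pragma GCC target ("+crc")`, would apply the same kind of setting to all following functions rather than to a single one.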

Ok for trunk?

Thanks,
Tamar

> 
> Thanks,
> Kyrill
> 
> > > >
> > > > Ok for trunk?
> > > >
> > > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > > index 8aa443f87fb700f7a723d736bdbd53b6c839656d..18d0ffa6820326ce7badf33001b1c6a467c95883 100644
> > > > --- a/gcc/doc/extend.texi
> > > > +++ b/gcc/doc/extend.texi
> > > > @@ -3858,6 +3858,42 @@ Specifies the fpu for which to tune the performance of this function.
> > > >  The behavior and permissible arguments are the same as for the
> > > > @option{-mfpu=}  command-line option.
> > > >
> > > > +@item arch=
> > > > +@cindex @code{arch=} function attribute, ARM
> > > > +Specifies the architecture version and architectural extensions to use
> > > > +for this function.  The behavior and permissible arguments are the same
> > > > +as for the @option{-march=} command-line option.
> > > > +
> > > > +The above target attributes can be specified as follows:
> > > > +
> > > > +@smallexample
> > > > +__attribute__((target("@var{attr-string}")))
> > > > +int
> > > > +f (int a)
> > > > +@{
> > > > +  return a + 5;
> > > > +@}
> > > > +@end smallexample
> > > > +
> > > > +where @code{@var{attr-string}} is one of the attribute strings.
> > >
> > > This example doesn't illustrate anything useful, and in fact just
> > > confuses things by introducing @var{attr-string}.  Please use an
> > > actual valid attribute here, something like "arch=armv8-a" or
> > > whatever.
> > >
> > > Also, either kill the sentence fragment after the example, or be
> > > careful to add @noindent before it to indicate it's a continuation
> > > of the previous paragraph.
> > >
> > > > +
> > > > +Additionally, the architectural extension string may be specified
> > > > +on its own.  This can be used to turn on and off particular
> > > > +architectural extensions without having to specify a particular
> > > > +architecture version or core.  Example:
> > > > +
> > > > +@smallexample
> > > > +__attribute__((target

Re: [patches] Re: [PATCH] RISC-V: Add Jim Wilson as a maintainer

2017-11-15 Thread Palmer Dabbelt

On Tue, 07 Nov 2017 09:53:12 PST (-0800), Palmer Dabbelt wrote:

On Tue, 07 Nov 2017 09:47:37 PST (-0800), Jim Wilson wrote:

On Mon, Nov 6, 2017 at 6:39 PM, Palmer Dabbelt  wrote:


+riscv port Jim Wilson  



It is jimw not jim for the email address.  Please fix.


Sorry.  We're still pending approval, but

diff --git a/MAINTAINERS b/MAINTAINERS
index 9c3a56ea0941..222dad81f2bb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -93,8 +93,9 @@ pdp11 portPaul Koning 
 picochip port  Daniel Towner   
 powerpcspe portAndrew Jenner   

 riscv port Kito Cheng  
-riscv port Palmer Dabbelt  
+riscv port Palmer Dabbelt  
 riscv port Andrew Waterman 
+riscv port Jim Wilson  
 rl78 port  DJ Delorie  
 rs6000/powerpc portDavid Edelsohn  
 rs6000/powerpc portSegher Boessenkool  


Committed.


Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread Uros Bizjak
On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu  wrote:
> -mzeroupper is specified to generate vzeroupper instruction.  If it
> isn't used, the default should depend on !TARGET_AVX512ER.  Users can
> always use -mzeroupper or -mno-zeroupper to override it.
>
> Sebastian, can you run the full test with it?
>
> OK for trunk if there is no regression?

If we want to go this way, please add relevant tune flag (e.g.
X86_TUNE_EMIT_VZEROUPPER) and use it for ~m_KNL. This tune is the
property of the processor model, not ISA.
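
To illustrate what is meant, here is a toy C model of a tune flag keyed on the processor model with an explicit option overriding it. All `toy_*` and `TOY_*` names are invented for this sketch, and the table is a simplification of how tune flags are keyed on processor masks, not the actual i386 backend code:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy processor models and their bit masks.  */
enum toy_processor { TOY_GENERIC, TOY_HASWELL, TOY_KNL, TOY_PROCESSOR_MAX };

#define TOY_M(p)   (1u << (p))
#define TOY_M_KNL  TOY_M (TOY_KNL)

/* Toy tune flags; each has a mask of the models it is enabled for.  */
enum toy_tune { TOY_TUNE_EMIT_VZEROUPPER, TOY_TUNE_MAX };

static const unsigned toy_tune_masks[TOY_TUNE_MAX] = {
  [TOY_TUNE_EMIT_VZEROUPPER] = ~TOY_M_KNL,  /* Every model except KNL.  */
};

bool
toy_tune_enabled_p (enum toy_tune t, enum toy_processor p)
{
  assert (t < TOY_TUNE_MAX && p < TOY_PROCESSOR_MAX);
  return (toy_tune_masks[t] & TOY_M (p)) != 0;
}

/* EXPLICIT_SETTING is -1 if neither option was given, 0 for
   -mno-zeroupper and 1 for -mzeroupper.  */
bool
toy_emit_vzeroupper_p (int explicit_setting, enum toy_processor p)
{
  if (explicit_setting >= 0)
    return explicit_setting != 0;
  return toy_tune_enabled_p (TOY_TUNE_EMIT_VZEROUPPER, p);
}
```

In this model the default follows the processor being tuned for, and the user can still force either behaviour on any model.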

Uros.


Re: [PATCH][AArch64] Add STP pattern to store a vec_concat of two 64-bit registers

2017-11-15 Thread Christophe Lyon
On 15 November 2017 at 16:58, Kyrill  Tkachov
 wrote:
> Hi Christophe,
>
>
> On 15/11/17 15:31, Christophe Lyon wrote:
>>
>> Hi Kyrill,
>>
>>
>> On 8 November 2017 at 19:34, Kyrill  Tkachov
>>  wrote:
>>>
>>> On 06/06/17 14:17, James Greenhalgh wrote:

>>>> On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> On top of the previous vec_merge simplifications [1] we can add this
>>>>> pattern to perform
>>>>> a store of a vec_concat of two 64-bit values in distinct registers as
>>>>> an
>>>>> STP.
>>>>> This avoids constructing such a vector explicitly in a register and
>>>>> storing it as
>>>>> a Q register.
>>>>> This way for the code in the testcase we can generate:
>>>>>
>>>>> construct_lane_1:
>>>>>   ldp d1, d0, [x0]
>>>>>   fmov d3, 1.0e+0
>>>>>   fmov d2, 2.0e+0
>>>>>   fadd d4, d1, d3
>>>>>   fadd d5, d0, d2
>>>>>   stp d4, d5, [x1, 32]
>>>>>   ret
>>>>>
>>>>> construct_lane_2:
>>>>>   ldp x2, x0, [x0]
>>>>>   add x3, x2, 1
>>>>>   add x4, x0, 2
>>>>>   stp x3, x4, [x1, 32]
>>>>>   ret
>>>>>
>>>>> instead of the current:
>>>>> construct_lane_1:
>>>>>   ldp d0, d1, [x0]
>>>>>   fmov d3, 1.0e+0
>>>>>   fmov d2, 2.0e+0
>>>>>   fadd d0, d0, d3
>>>>>   fadd d1, d1, d2
>>>>>   dup v0.2d, v0.d[0]
>>>>>   ins v0.d[1], v1.d[0]
>>>>>   str q0, [x1, 32]
>>>>>   ret
>>>>>
>>>>> construct_lane_2:
>>>>>   ldp x2, x3, [x0]
>>>>>   add x0, x2, 1
>>>>>   add x2, x3, 2
>>>>>   dup v0.2d, x0
>>>>>   ins v0.d[1], x2
>>>>>   str q0, [x1, 32]
>>>>>   ret
>>>>>
>>>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>>>> Ok for GCC 8?
>>>>
>>>> OK.
>>>
>>>
>>> Thanks, I've committed this and the other patches in this series after
>>> rebasing and rebootstrapping and testing on aarch64-none-linux-gnu.
>>> The only conflict from updating the patch was that I had to use the
>>> store_16
>>> attribute rather than
>>> the old store2 for the new define_insn. This is what I've committed with
>>> r254551.
>>>
>>> Sorry for the delay in committing.
>>>
>> I've noticed that the new tests fail when testing with -mabi=ilp32:
>> FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-not ins\t
>> FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
>> stp\td[0-9]+, d[0-9]+ 1 (found 0 times)
>> FAIL:gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times
>> stp\tx[0-9]+, x[0-9]+ 1 (found 0 times)
>>
>> Sorry if this has been reported before.
>
>
> Thank you for reporting this, I was not aware of it.
> My patch does indeed fail to generate the optimised sequence for
> -mabi=ilp32.
> During combine it fails to match:
> Failed to match this instruction:
> (set (mem:V2DF (plus:DI (reg/v/f:DI 79 [ z ])
> (const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
> (vec_concat:V2DF (reg:DF 81 [ y0 ])
> (reg:DF 84 [ y1 ])))
>
>
> but without the -mabi=ilp32 it does successfully match the equivalent
>
> (set (mem:V2DF (plus:DI (reg:DI 1 x1 [ z ])
> (const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
> (vec_concat:V2DF (reg:DF 81 [ y0 ])
> (reg:DF 84 [ y1 ])))
>
> The only difference is the index register being the hard reg x1.
> There's probably some subtlety in aarch64_classify_address that I'll need to
> dig into.
> In any case, can you please open a bug report for this so we can track it?

Sure, that's: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83009


> To be clear, the failure is just suboptimal codegen for the -mabi=ilp32
> case, not a wrong-code or ICE
> (though it should still be fixed, of course).
>
> Thanks again,
> Kyrill
>
>
>> Christophe
>>
>>> Kyrill
>>>
>>>
>>>> Thanks,
>>>> James
>>>>
>>>>> 2017-06-06  Kyrylo Tkachov
>>>>>
>>>>>   * config/aarch64/aarch64-simd.md (store_pair_lanes):
>>>>>   New pattern.
>>>>>   * config/aarch64/constraints.md (Uml): New constraint.
>>>>>   * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand):
>>>>>   New predicate.
>>>>>
>>>>> 2017-06-06  Kyrylo Tkachov
>>>>>
>>>>>   * gcc.target/aarch64/store_v2vec_lanes.c: New test.


>
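
For reference, a minimal C sketch of the kind of source that would produce the construct_lane_1 and construct_lane_2 sequences quoted in this thread. It is reconstructed from the quoted assembly (two loads from the first pointer, additions of 1 and 2, a 16-byte store at offset 32) and from the names y0, y1 and z in the RTL dump; it is not copied from store_v2vec_lanes.c, and the typedef names are assumptions. It relies on the GNU vector_size extension.

```c
/* Sketch only: reconstructed from the quoted assembly, the real test
   file may differ.  Uses the GNU C vector extension.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

/* Build a two-lane double vector from two scalars held in separate
   registers and store it at z[2], i.e. byte offset 32 from z.  */
void
construct_lane_1 (double *y, v2df *z)
{
  double y0 = y[0] + 1;
  double y1 = y[1] + 2;
  v2df x = { y0, y1 };
  z[2] = x;
}

/* Same shape for 64-bit integers, so the values live in X registers.  */
void
construct_lane_2 (long long *y, v2di *z)
{
  long long y0 = y[0] + 1;
  long long y1 = y[1] + 2;
  v2di x = { y0, y1 };
  z[2] = x;
}
```

The vector is never needed as a whole in a register: both lanes are already available as scalars, which is what lets the store be expressed as a single STP of two 64-bit registers.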


Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Martin Sebor

On 11/15/2017 05:45 AM, Martin Liška wrote:
> On 11/06/2017 07:29 PM, Martin Sebor wrote:
>> Sorry for being late with my comment.  I just spotted this minor
>> formatting issue.  Even though GCC isn't (yet) consistent about
>> it, the keyword "constexpr" should be quoted in the error message
>> below (and, eventually, in all diagnostic messages).  Since the
>> patch has been committed by now this is just a reminder for us
>> to try to keep this in mind in the future.
>
> Hi.
>
> I've prepared a patch for that. If it's desired, I can fix the test-suite
> as a follow-up.
> Do we want to change it also for error messages like:
> "call to non-constexpr function"
> "constexpr call flows off the end of the function"

If GCC had support for italics for defined terms of the language
or the grammar, /constexpr function/ would be italicized because
it's a defined term.  Absent that, I think I would quote them all
for consistency.

Martin

PS I checked the C++ standard to see how it used the term and
the choices it makes seem pretty arbitrary.  There are even
sentences with two instances of the word, one in fixed width
font and the other in proportional.  So I don't think we can
use the spec as an example to follow.




Re: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Sandra Loosemore

On 11/15/2017 04:51 AM, Tamar Christina wrote:
> Hi All,
>
> This patch updates the documentation for AArch64 and ARM correcting the use
> of the architecture namings by adding the -A suffix in appropriate places.

Just to clarify, was the documentation previously using incorrect
terminology, or are there new non-A ARMv7 and ARMv8 architectures that
invalidate existing uses of those terms without the -A suffix?  And, are
the "appropriate places" all currently-unsuffixed uses, or just a subset
of incorrect uses?

The actual patch looks like search-and-replace to me and I have no
objection to it, but I'd like to understand the rationale so that I can
try to remember what the conventions are for future patch review.


-Sandra


Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Jonathan Wakely

On 15/11/17 09:30 -0700, Martin Sebor wrote:
> On 11/15/2017 05:45 AM, Martin Liška wrote:
>> On 11/06/2017 07:29 PM, Martin Sebor wrote:
>>> Sorry for being late with my comment.  I just spotted this minor
>>> formatting issue.  Even though GCC isn't (yet) consistent about
>>> it, the keyword "constexpr" should be quoted in the error message
>>> below (and, eventually, in all diagnostic messages).  Since the
>>> patch has been committed by now this is just a reminder for us
>>> to try to keep this in mind in the future.
>>
>> Hi.
>>
>> I've prepared a patch for that. If it's desired, I can fix the test-suite
>> as a follow-up.
>> Do we want to change it also for error messages like:
>> "call to non-constexpr function"
>> "constexpr call flows off the end of the function"
>
> If GCC had support for italics for defined terms of the language
> or the grammar, /constexpr function/ would be italicized because
> it's a defined term.  Absent that, I think I would quote them all
> for consistency.
>
> Martin
>
> PS I checked the C++ standard to see how it used the term and
> the choices it makes seem pretty arbitrary.  There are even
> sentences with two instances of the word, one in fixed width
> font and the other in proportional.  So I don't think we can
> use the spec as an example to follow.


Did you check the latest draft? That should have been fixed.

Defined terms should only be italicized when introduced, not when
used, e.g. in [dcl.constexpr] p2 "constexpr function" and "constexpr
constructor" are italicized, but are in normal font elsewhere. When
referring specifically to the keyword `constexpr` it should be in code
font.

Grammar productions are always italicized, but "constexpr function" is
not a grammar production.



[PATCH, GCC/ARM] Fix ICE in Armv8-M Security Extensions code

2017-11-15 Thread Thomas Preudhomme

Hi,

Commit r253825, which introduced some sanity checks for sbitmap, revealed
a bug in the conversion of cmse_nonsecure_entry_clear_before_return ()
to using the bitmap structure. bitmap_and expects that the two bitmaps have
the same length, yet the code in
cmse_nonsecure_entry_clear_before_return () uses different sizes for
to_clear_bitmap and to_clear_arg_regs_bitmap, on the assumption that
bitmap_and would behave as if the bits not allocated were in fact zero.
This commit makes sure both bitmaps are equally sized.
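
To illustrate the size requirement, here is a minimal sketch of a fixed-size bitmap whose AND operation refuses operands of different lengths. The simple_bitmap type and its functions are hypothetical stand-ins written for this example; they are not GCC's sbitmap API, and where GCC's checking build asserts, this sketch just returns false.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fixed-size bitmap: a bit count plus
   zero-initialised storage.  */
struct simple_bitmap
{
  unsigned n_bits;
  unsigned char *bytes;
};

struct simple_bitmap
simple_bitmap_alloc (unsigned n_bits)
{
  struct simple_bitmap b;
  b.n_bits = n_bits;
  b.bytes = calloc (n_bits / 8 + 1, 1);
  if (b.bytes == NULL)
    abort ();
  return b;
}

void
simple_bitmap_set (struct simple_bitmap *b, unsigned bit)
{
  if (bit < b->n_bits)
    b->bytes[bit / 8] |= (unsigned char) (1u << (bit % 8));
}

bool
simple_bitmap_test (const struct simple_bitmap *b, unsigned bit)
{
  return bit < b->n_bits && (b->bytes[bit / 8] & (1u << (bit % 8))) != 0;
}

/* dst = a & b.  Returns false and leaves DST untouched when the three
   bitmaps are not the same length, instead of silently treating the
   missing bits of the shorter operand as zero.  */
bool
simple_bitmap_and (struct simple_bitmap *dst, const struct simple_bitmap *a,
                   const struct simple_bitmap *b)
{
  if (dst->n_bits != a->n_bits || a->n_bits != b->n_bits)
    return false;
  for (unsigned i = 0; i < a->n_bits / 8 + 1; i++)
    dst->bytes[i] = a->bytes[i] & b->bytes[i];
  return true;
}
```

Allocating the second bitmap with the first one's bit count, as the patch does for to_clear_arg_regs_bitmap, is what keeps such a check satisfied.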

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-13  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_entry_clear_before_return): Allocate
to_clear_arg_regs_bitmap to the same size as to_clear_bitmap.

Testing: Bootstrapped GCC on arm-none-linux-gnueabihf target and
testsuite shows no regression. Running cmse.exp tests for Armv8-M
Baseline and Mainline shows FAIL->PASS for bitfield-1, bitfield-2,
bitfield-3 and struct-1 testcases.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index db99303f3fb7a2196f48358e74fa4d98f31f045e..106e3edce0d6f2518eb391c436c5213a78d1275b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25205,7 +25205,8 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (padding_bits_to_clear != 0)
 {
   rtx reg_rtx;
-  auto_sbitmap to_clear_arg_regs_bitmap (R0_REGNUM + NUM_ARG_REGS);
+  int to_clear_bitmap_size = SBITMAP_SIZE ((sbitmap) to_clear_bitmap);
+  auto_sbitmap to_clear_arg_regs_bitmap (to_clear_bitmap_size);
 
   /* Padding bits to clear is not 0 so we know we are dealing with
 	 returning a composite type, which only uses r0.  Let's make sure that


Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread H.J. Lu
On Wed, Nov 15, 2017 at 8:09 AM, Uros Bizjak  wrote:
> On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu  wrote:
>> -mzeroupper is specified to generate vzeroupper instruction.  If it
>> isn't used, the default should depend on !TARGET_AVX512ER.  Users can
>> always use -mzeroupper or -mno-zeroupper to override it.
>>
>> Sebastian, can you run the full test with it?
>>
>> OK for trunk if there is no regression?
>
> If we want to go this way, please add relevant tune flag (e.g.
> X86_TUNE_EMIT_VZEROUPPER) and use it for ~m_KNL. This tune is the
> property of the processor model, not ISA.

How about this?  OK for trunk if there are no regressions?
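
In outline, the patch that follows makes the default depend on the tuning target instead of the ISA, while an explicit option still wins. A minimal sketch of that selection logic, using hypothetical names (the struct and both functions are illustrative and do not exist in GCC):

```c
#include <stdbool.h>

/* Hypothetical model of the relevant option state; the field names are
   illustrative and do not match GCC's internal variables.  */
struct vzeroupper_state
{
  bool user_set;    /* -mzeroupper or -mno-zeroupper was given.  */
  bool user_value;  /* true for -mzeroupper, false for -mno-zeroupper.  */
  bool tune_emit;   /* Stands in for TARGET_EMIT_VZEROUPPER.  */
};

/* An explicit option overrides everything; otherwise the tuning flag
   decides whether vzeroupper insertion is on by default.  */
bool
vzeroupper_enabled (struct vzeroupper_state s)
{
  if (s.user_set)
    return s.user_value;
  return s.tune_emit;
}

/* Shape of the pass gate after the patch: AVX is still required, but
   there is no longer an AVX512ER check.  */
bool
vzeroupper_pass_gate (bool have_avx, bool vzeroupper,
                      bool expensive_opts, bool optimize_size)
{
  return have_avx && vzeroupper && expensive_opts && !optimize_size;
}
```

With tune_emit false (the KNL case in this model), -mzeroupper still turns insertion on and leaving it unspecified keeps it off.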


-- 
H.J.
From d9388c1b7f36e2310645aed4a4debefa65b5129e Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 14 Nov 2017 20:49:33 -0800
Subject: [PATCH] i386: Add X86_TUNE_EMIT_VZEROUPPER

Add X86_TUNE_EMIT_VZEROUPPER to indicate if vzeroupper instruction should
be inserted before a transfer of control flow out of the function.  It is
turned on by default unless we are tuning for KNL.  Users can always use
-mzeroupper or -mno-zeroupper to override X86_TUNE_EMIT_VZEROUPPER.

gcc/

	PR target/82990
	* config/i386/i386.c (pass_insert_vzeroupper::gate): Remove
	TARGET_AVX512ER check.
	(ix86_option_override_internal): Set MASK_VZEROUPPER if
	neither -mzeroupper nor -mno-zeroupper is used and
	TARGET_EMIT_VZEROUPPER is set.
	* config/i386/i386.h (TARGET_EMIT_VZEROUPPER): New.
	* config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER.

gcc/testsuite/

	PR target/82990
	* gcc.target/i386/pr82942-2.c: Add -mtune=knl.
	* gcc.target/i386/pr82990-1.c: New test.
	* gcc.target/i386/pr82990-2.c: Likewise.
	* gcc.target/i386/pr82990-3.c: Likewise.
	* gcc.target/i386/pr82990-4.c: Likewise.
	* gcc.target/i386/pr82990-5.c: Likewise.
	* gcc.target/i386/pr82990-6.c: Likewise.
	* gcc.target/i386/pr82990-7.c: Likewise.
---
 gcc/config/i386/i386.c                    |  5 +++--
 gcc/config/i386/i386.h                    |  2 ++
 gcc/config/i386/x86-tune.def              |  4 ++++
 gcc/testsuite/gcc.target/i386/pr82942-2.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr82990-1.c | 14 ++++++++++++++
 gcc/testsuite/gcc.target/i386/pr82990-2.c |  6 ++++++
 gcc/testsuite/gcc.target/i386/pr82990-3.c |  6 ++++++
 gcc/testsuite/gcc.target/i386/pr82990-4.c |  6 ++++++
 gcc/testsuite/gcc.target/i386/pr82990-5.c | 14 ++++++++++++++
 gcc/testsuite/gcc.target/i386/pr82990-6.c |  6 ++++++
 gcc/testsuite/gcc.target/i386/pr82990-7.c |  6 ++++++
 11 files changed, 68 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr82990-7.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c5e84a09954..c6ca0712755 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2497,7 +2497,7 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-  return TARGET_AVX && !TARGET_AVX512ER
+  return TARGET_AVX
 	 && TARGET_VZEROUPPER && flag_expensive_optimizations
 	 && !optimize_size;
 }
@@ -4666,7 +4666,8 @@ ix86_option_override_internal (bool main_args_p,
   if (TARGET_SEH && TARGET_CALL_MS2SYSV_XLOGUES)
 sorry ("-mcall-ms2sysv-xlogues isn%'t currently supported with SEH");
 
-  if (!(opts_set->x_target_flags & MASK_VZEROUPPER))
+  if (!(opts_set->x_target_flags & MASK_VZEROUPPER)
+  && TARGET_EMIT_VZEROUPPER)
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e3e55da4232..a45e2df5783 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -517,6 +517,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 	ix86_tune_features[X86_TUNE_AVOID_FALSE_DEP_FOR_BMI]
 #define TARGET_ONE_IF_CONV_INSN \
 	ix86_tune_features[X86_TUNE_ONE_IF_CONV_INSN]
+#define TARGET_EMIT_VZEROUPPER \
+	ix86_tune_features[X86_TUNE_EMIT_VZEROUPPER]
 
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices {
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 99282c88341..19fd2b52b30 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -543,3 +543,7 @@ DEF_TUNE (X86_TUNE_QIMODE_MATH, "qimode_math", ~0U)
arithmetic to 32bit via PROMOTE_MODE macro.  This code generation scheme
is usually used for RISC targets.  */
 DEF_TUNE (X86_TUNE_PROMOTE_QI_REGS, "promote_qi_regs", 0U)
+
+/* X86_TUNE_EMIT_VZEROUPPER: This enables vzeroupper instruction insertion
+   before a transfer of control flow out of the function.  */
+DEF_TUNE (X86_TUNE_EMIT_VZ

[PATCH, GCC/testsuite/ARM] Fix selection of effective target for cmse tests

2017-11-15 Thread Thomas Preudhomme

Hi,

Some of the tests in the gcc.target/arm/cmse directory (eg.
gcc.target/arm/cmse/mainline/bitfield-4.c) are failing when run without
an architecture specified in RUNTESTFLAGS due to them not adding the
option to select an Armv8-M architecture.

This patch fixes the issue by adding the right option from the exp file
so that no architecture fiddling is necessary in the individual tests.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-11-03  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse.exp: Add option to select Armv8-M Baseline
or Armv8-M Mainline when running the respective tests.
* gcc.target/arm/cmse/baseline/cmse-11.c: Remove architecture check and
selection.
* gcc.target/arm/cmse/baseline/cmse-13.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-2.c: Likewise.
* gcc.target/arm/cmse/baseline/cmse-6.c: Likewise.
* gcc.target/arm/cmse/baseline/softfp.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-8.c: Likewise.

Testing: Running cmse.exp for both Armv8-M Baseline and Mainline shows
no regression. Running it for a toolchain defaulting to Armv8-M Baseline
but with RUNTESTFLAGS unset sees some FAIL->PASS.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
index 795544fe11d9d7f24086be16916a5bfee89d7b44..230b255963f56a6c29b91d2501b43fed6eda2476 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (int);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
index 7208a2cedd2f4f8296b2801d6f5e5d7838b26551..7ab3219e860e993e2eca3bbee2e885f59b7b3cb4 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" } */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 #include "../cmse-13.x"
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
index fec7dc10484b14db5796f5f431a9306c3b2e307c..d5115ecf2bdb3e87dc6a92244cb204e753f25b07 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-2.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 extern float bar (void);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
index 43d45e7a63e56edfebc203c8f0e516dc13fbbd65..cae4f343621d1a19a8893ea4950d33e5e1842fb5 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c
@@ -1,7 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-mcmse" }  */
-/* { dg-require-effective-target arm_arch_v8m_base_ok } */
-/* { dg-add-options arm_arch_v8m_base } */
 
 int __attribute__ ((cmse_nonsecure_call)) (*bar) (double);
 
diff --git a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c b/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
index ca76e12cd9287fd12b7eb7add638973f5d314939..3d383ff6ee17677120e3e1e81726785c30f3b25c 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/baseline/softfp.c
+++ b/gcc/testsuite/gcc.target/arm/cm

RE: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Tamar Christina
Hi Sandra,

> -----Original Message-----
> From: Sandra Loosemore [mailto:san...@codesourcery.com]
> Sent: Wednesday, November 15, 2017 16:38
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; James Greenhalgh ;
> Richard Earnshaw ; Marcus Shawcroft
> 
> Subject: Re: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates
> adding -A extensions.
> 
> On 11/15/2017 04:51 AM, Tamar Christina wrote:
> > Hi All,
> >
> > This patch updates the documentation for AArch64 and ARM correcting
> > the use of the architecture namings by adding the -A suffix in appropriate
> places.
> 
> Just to clarify, was the documentation previously using incorrect terminology,
> or are there new non-A ARMv7 and ARMv8 architectures that invalidate
> existing uses of those terms without the -A suffix? 

Yes, there are the -M and -R suffixes/profiles. A lot of the documentation was
written before these existed. It is mainly a find and replace, but I tried to
determine for each change whether the instructions exist in the other profiles.
Hopefully they're all correct, but I'll leave that for the review.

> And, are the "appropriate
> places" all currently-unsuffixed uses, or just a subset of incorrect uses?
> 

It turned out I had to change all of them; for AArch64, for instance, we only
have the A profile, which is why all unsuffixed uses changed to -A. For AArch32
the explicitly different stuff already had the correct suffixes, so I changed
the rest to -A as well.

Tamar.

> The actual patch looks like search-and-replace to me and I have no objection
> to it, but I'd like to understand the rationale so that I can try to remember
> what the conventions are for future patch review
> 
> -Sandra


Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Martin Sebor

On 11/15/2017 09:38 AM, Jonathan Wakely wrote:
> On 15/11/17 09:30 -0700, Martin Sebor wrote:
>> On 11/15/2017 05:45 AM, Martin Liška wrote:
>>> On 11/06/2017 07:29 PM, Martin Sebor wrote:
>>>> Sorry for being late with my comment.  I just spotted this minor
>>>> formatting issue.  Even though GCC isn't (yet) consistent about
>>>> it, the keyword "constexpr" should be quoted in the error message
>>>> below (and, eventually, in all diagnostic messages).  Since the
>>>> patch has been committed by now this is just a reminder for us
>>>> to try to keep this in mind in the future.
>>>
>>> Hi.
>>>
>>> I've prepared a patch for that. If it's desired, I can fix the test-suite
>>> as a follow-up.
>>> Do we want to change it also for error messages like:
>>> "call to non-constexpr function"
>>> "constexpr call flows off the end of the function"
>>
>> If GCC had support for italics for defined terms of the language
>> or the grammar, /constexpr function/ would be italicized because
>> it's a defined term.  Absent that, I think I would quote them all
>> for consistency.
>>
>> Martin
>>
>> PS I checked the C++ standard to see how it used the term and
>> the choices it makes seem pretty arbitrary.  There are even
>> sentences with two instances of the word, one in fixed width
>> font and the other in proportional.  So I don't think we can
>> use the spec as an example to follow.
>
> Did you check the latest draft? That should have been fixed.
>
> Defined terms should only be italicized when introduced, not when
> used, e.g. in [dcl.constexpr] p2 "constexpr function" and "constexpr
> constructor" are italicized, but are in normal font elsewhere. When
> referring specifically to the keyword `constexpr` it should be in code
> font.
>
> Grammar productions are always italicized, but "constexpr function" is
> not a grammar production.


Right, /constexpr function/ is a defined term (as is /constexpr
constructor/ and /constexpr if/).  As you say, its defining
occurrence is italicized in the text, and the rest aren't.
In contrast, in terms like "constexpr specifier," "constexpr"
is the keyword and it's always in monospace.

The challenge in GCC as I see it is to know how to decide which
of the two it is.  The difference between constexpr the keyword
and constexpr as part of a defined term is too subtle for most
people who don't work with the standard for a living.  So we end
up with these minor inconsistencies in the diagnostics.  I think
the easiest way to achieve consistency (in diagnostics) is to
always quote keywords.  Having italics would be a nice touch but
it would probably not improve consistency.

Martin

PS I was looking at the February 2017 draft.  The October version
looks quite a bit better.



[PATCH, GCC/testsuite/ARM] Rework expectation for call to Armv8-M nonsecure function

2017-11-15 Thread Thomas Preudhomme

Hi,

Testcase gcc.target/arm/cmse/cmse-14.c checks whether bar is called via
__gnu_cmse_nonsecure_call libcall and not via a direct call. However the
pattern is a bit surprising in that it needs to explicitly allow "by"
due to allowing anything before the 'b'.

This patch rewrites the logic to look for b as the first non-whitespace
letter, followed by anything (to match bl and conditional branches),
followed by some spaces and then bar.
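
As a rough illustration of what the rewritten pattern accepts and rejects, here is a sketch that applies an equivalent POSIX extended regular expression to one assembly line at a time. It is an approximation under stated assumptions: the testsuite uses Tcl regular expressions over the whole assembly output, whereas this uses regcomp/regexec per line with [[:space:]] in place of \s, and the helper name branches_to_bar is made up for the example.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true when LINE looks like a branch to "bar": a mnemonic that
   starts with 'b' at the beginning of the line or right after
   whitespace, then whitespace, then "bar".  Hypothetical helper, a
   POSIX ERE translation of the pattern in the patch.  */
bool
branches_to_bar (const char *line)
{
  static const char pattern[]
    = "^(.*[[:space:]])?bl?[^[:space:]]*[[:space:]]+bar";
  regex_t re;
  bool found;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  found = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

Under these assumptions a line such as "\tbl\tbar" or "\tbne\tbar" matches, while "\tbl\t__gnu_cmse_nonsecure_call" and "\t.global\tbar" do not, because the 'b' must start a word and be followed by whitespace and then bar.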

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-11-01  Thomas Preud'homme  

* gcc.target/arm/cmse/cmse-14.c: Change logic to match branch
instruction to bar.

Testing: Test still passes for both Armv8-M Baseline and Mainline.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
index 701e9ee7e318a07278099548f9b7042a1fde1204..df1ea52bec533c36a738d7d3b2b2ff749b0f3713 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-14.c
@@ -10,4 +10,4 @@ int foo (void)
 }
 
 /* { dg-final { scan-assembler "bl\t__gnu_cmse_nonsecure_call" } } */
-/* { dg-final { scan-assembler-not "b\[^ y\n\]*\\s+bar" } } */
+/* { dg-final { scan-assembler-not "^(.*\\s)?bl?\[^\\s]*\\s+bar" } } */


Re: lambda-switch regression

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:
> g++.dg/lambda/lambda-switch.C Has recently regressed.  

g++.dg/cpp0x/lambda/lambda-switch.C

> It appears the 
> location of a warning message has moved.
> 
>   l = []()        // { dg-warning "statement will never be executed" }
>     {
>     case 3:       // { dg-error "case" }
>       break;      // { dg-error "break" }
>     };            <--- warning now here
> 
> We seem to be diagnosing the last line of the statement, not the
> first. 
> That seems not useful.
> 
> I've not investigated what patch may have caused this, on the chance 
> someone might already know?
> 
> nathan

The warning was added in r236597 (aka
1398da0f786e120bb0b407e84f412aa9fc6d80ee):

+2016-05-23  Marek Polacek  
+
+   PR c/49859
+   * common.opt (Wswitch-unreachable): New option.
+   * doc/invoke.texi: Document -Wswitch-unreachable.
+   * gimplify.c (gimplify_switch_expr): Implement the -Wswitch-unreachable
+   warning.

which had it there (23:7).

r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
warning from 23:7 up to the "[] ()" at 19:6 in:

+2017-01-20  Marek Polacek  
+
+   PR c/64279
[...snip...]
+   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.

I tried it with some working copies I have to hand:
- works for me with r254387 (2017-11-03)
- fails for me with r254700 (2017-11-13)

so hopefully that helps track it down.

Dave
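
For readers unfamiliar with the warning, a minimal C sketch of what -Wswitch-unreachable is about. This is a plain C stand-in, not the C++ lambda test under discussion, and the function name is made up: a statement placed between the opening brace of a switch and its first label can never run, because control jumps straight to a label. GCC is expected to report "statement will never be executed" for the marked line.

```c
/* Hypothetical example, not the testsuite file.  Returns 1 when x is 3
   and 0 otherwise; the assignment before the first label has no effect
   because it is never reached.  */
int
count_case_three (int x)
{
  int hits = 0;
  switch (x)
    {
      hits = 100;  /* Never executed: there is no label before it.  */
    case 3:
      hits += 1;
      break;
    default:
      break;
    }
  return hits;
}
```

The regression discussed above concerns only which source location the diagnostic is attached to when the unreachable statement spans several lines, not whether it is issued.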


[PATCH, GCC/ARM] Use bitmap to control cmse_nonsecure_call register clearing

2017-11-15 Thread Thomas Preudhomme

Hi,

As part of r253256, cmse_nonsecure_entry_clear_before_return has been
rewritten to use auto_sbitmap instead of an integer bitfield to control
which registers need to be cleared. This commit continues this work in
cmse_nonsecure_call_clear_caller_saved.
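
A minimal sketch of the bookkeeping pattern involved, using a plain bool array as a stand-in for auto_sbitmap. The function names are hypothetical and the array replaces GCC's bitmap type purely for illustration: a range of registers is first marked as needing clearing, then every register named in a 64-bit "not to clear" mask is removed from that set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mark registers FIRST .. FIRST + COUNT - 1 as needing clearing.
   TO_CLEAR is a hypothetical stand-in for a register bitmap and must
   have at least FIRST + COUNT elements.  */
void
mark_range (bool to_clear[], unsigned first, unsigned count)
{
  for (unsigned regno = first; regno < first + count; regno++)
    to_clear[regno] = true;
}

/* Remove from TO_CLEAR every register in 0 .. MAXREGNO whose bit is set
   in NOT_TO_CLEAR_MASK.  The mask can only describe registers 0 .. 63,
   so the loop stops there to keep the shift well defined.  */
void
exclude_masked (bool to_clear[], unsigned maxregno,
                uint64_t not_to_clear_mask)
{
  for (unsigned regno = 0; regno <= maxregno && regno < 64; regno++)
    if (not_to_clear_mask & (UINT64_C (1) << regno))
      to_clear[regno] = false;
}
```

The second function has the same shape as the loop the patch adds around compute_not_to_clear_mask, which still returns an integer mask.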

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-16  Thomas Preud'homme  

* config/arm/arm.c (cmse_nonsecure_call_clear_caller_saved): Use
auto_sbitmap instead of integer bitfield to control register needing
clearing.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9919f54242d9317125a104f9777d76a85de80e9b..7384b96fea0179334a6010b099df68c8e2a0fc32 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16990,10 +16990,11 @@ cmse_nonsecure_call_clear_caller_saved (void)
 
   FOR_BB_INSNS (bb, insn)
 	{
-	  uint64_t to_clear_mask, float_mask;
+	  unsigned address_regnum, regno, maxregno =
+	TARGET_HARD_FLOAT_ABI ? D7_VFP_REGNUM : NUM_ARG_REGS - 1;
+	  auto_sbitmap to_clear_bitmap (maxregno + 1);
 	  rtx_insn *seq;
 	  rtx pat, call, unspec, reg, cleared_reg, tmp;
-	  unsigned int regno, maxregno;
 	  rtx address;
 	  CUMULATIVE_ARGS args_so_far_v;
 	  cumulative_args_t args_so_far;
@@ -17024,18 +17025,21 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	continue;
 
 	  /* Determine the caller-saved registers we need to clear.  */
-	  to_clear_mask = (1LL << (NUM_ARG_REGS)) - 1;
-	  maxregno = NUM_ARG_REGS - 1;
+	  bitmap_clear (to_clear_bitmap);
+	  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+
 	  /* Only look at the caller-saved floating point registers in case of
 	 -mfloat-abi=hard.  For -mfloat-abi=softfp we will be using the
 	 lazy store and loads which clear both caller- and callee-saved
 	 registers.  */
 	  if (TARGET_HARD_FLOAT_ABI)
 	{
-	  float_mask = (1LL << (D7_VFP_REGNUM + 1)) - 1;
-	  float_mask &= ~((1LL << FIRST_VFP_REGNUM) - 1);
-	  to_clear_mask |= float_mask;
-	  maxregno = D7_VFP_REGNUM;
+	  auto_sbitmap float_bitmap (maxregno + 1);
+
+	  bitmap_clear (float_bitmap);
+	  bitmap_set_range (float_bitmap, FIRST_VFP_REGNUM,
+D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1);
+	  bitmap_ior (to_clear_bitmap, to_clear_bitmap, float_bitmap);
 	}
 
 	  /* Make sure the register used to hold the function address is not
@@ -17043,7 +17047,9 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  address = RTVEC_ELT (XVEC (unspec, 0), 0);
 	  gcc_assert (MEM_P (address));
 	  gcc_assert (REG_P (XEXP (address, 0)));
-	  to_clear_mask &= ~(1LL << REGNO (XEXP (address, 0)));
+	  address_regnum = REGNO (XEXP (address, 0));
+	  if (address_regnum < R0_REGNUM + NUM_ARG_REGS)
+	bitmap_clear_bit (to_clear_bitmap, address_regnum);
 
 	  /* Set basic block of call insn so that df rescan is performed on
 	 insns inserted here.  */
@@ -17064,6 +17070,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter)
 	{
 	  rtx arg_rtx;
+	  uint64_t to_clear_args_mask;
 	  machine_mode arg_mode = TYPE_MODE (arg_type);
 
 	  if (VOID_TYPE_P (arg_type))
@@ -17076,10 +17083,18 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type,
 	  true);
 	  gcc_assert (REG_P (arg_rtx));
-	  to_clear_mask
-		&= ~compute_not_to_clear_mask (arg_type, arg_rtx,
-	   REGNO (arg_rtx),
-	   padding_bits_to_clear_ptr);
+	  to_clear_args_mask
+		= compute_not_to_clear_mask (arg_type, arg_rtx,
+	 REGNO (arg_rtx),
+	 padding_bits_to_clear_ptr);
+	  if (to_clear_args_mask)
+		{
+		  for (regno = R0_REGNUM; regno <= maxregno; regno++)
+		{
+		  if (to_clear_args_mask & (1ULL << regno))
+			bitmap_clear_bit (to_clear_bitmap, regno);
+		}
+		}
 
 	  first_param = false;
 	}
@@ -17138,7 +17153,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	 call.  */
 	  for (regno = R0_REGNUM; regno <= maxregno; regno++)
 	{
-	  if (!(to_clear_mask & (1LL << regno)))
+	  if (!bitmap_bit_p (to_clear_bitmap, regno))
 		continue;
 
 	  /* If regno is an even vfp register and its successor is also to
@@ -17147,7 +17162,7 @@ cmse_nonsecure_call_clear_caller_saved (void)
 		{
 		  if (TARGET_VFP_DOUBLE
 		  && VFP_REGNO_OK_FOR_DOUBLE (regno)
-		  && to_clear_mask & (1LL << (regno + 1)))
+		  && bitmap_bit_p (to_clear_bitmap, (regno + 1)))
 		emit_move_insn (gen_rtx_REG (DFmode, regno++),
 CONST0_RTX (DFmode));
 		  else
@@ -17161,7 +17176,6 @@ cmse_nonsecure_call_clear_caller_saved (void)
 	  seq = get_insns ();
 	  end_sequence ();
 	  emit_insn_before (seq, insn);
-
 	}
 }
 }
@@ -25188,7 +25202,7 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (padding_bits_to_clear != 0)
 {
   rtx

Re: [PATCH][RFC] Add quotes for constexpr keyword.

2017-11-15 Thread Jonathan Wakely

On 15/11/17 10:04 -0700, Martin Sebor wrote:

On 11/15/2017 09:38 AM, Jonathan Wakely wrote:

On 15/11/17 09:30 -0700, Martin Sebor wrote:

On 11/15/2017 05:45 AM, Martin Liška wrote:

On 11/06/2017 07:29 PM, Martin Sebor wrote:

Sorry for being late with my comment.  I just spotted this minor
formatting issue.  Even though GCC isn't (yet) consistent about
it the keyword "constexpr" should be quoted in the error message
below (and, eventually, in all diagnostic messages).  Since the
patch has been committed by now this is just a reminder for us
to try to keep this in mind in the future.


Hi.

I've prepared patch for that. If it's desired, I can fix test-suite
follow-up.
Do we want to change it also for error messages like:
"call to non-constexpr function"
"constexpr call flows off the end of the function"


If GCC had support for italics for defined terms of the language
or the grammar /constexpr function/ would be italicized because
it's a defined term.  Absent that, I think I would quote them all
for consistency.

Martin

PS I checked the C++ standard to see how it used the term and
the choices it makes seem pretty arbitrary.  There are even
sentences with two instances of two word, one in fixed width
font and the other in proportional.  So I don't think we can
use the spec as an example to follow.


Did you check the latest draft? That should have been fixed.

Defined terms should only be italicized when introduced, not when
used, e.g. in [dcl.constexpr] p2 "constexpr function" and "constexpr
constructor" are italicized, but are in normal font elsewhere. When
referring specifically to the keyword `constexpr` it should be in code
font.

Grammar productions are always italicized, but "constexpr function" is
not a grammar production.


Right, /constexpr function/ is a defined term (as is /constexpr
cosntructor/ and /constexpr if/).  As you say, its defining
occurrence is italicized in the text, and the rest aren't.
In contrast, in terms like "constexpr specifier," "constexpr"
is the keyword and it's always in monospace.

The challenge in GCC as I see it is to know how to decide which
of the two it is.  The difference between constexpr the keyword
and constexpr as part of a defined term is too subtle for most
people who don't work with the standard for a living.  So we end
up with these minor inconsistencies in the diagnostics.  I think
the easiest way to achieve consistency (in diagnostics) is to
always quote keywords.


Agreed. GCC also doesn't need to distinguish between the definition
and use of a standard term, it is always using the term, so can always
format it the same way.


Having italics would be a nice touch but
it would probably not improve consistency.



PS I was looking at the February 2017 draft.  The October version
looks quite a bit better.


These were the relevant fixes:
https://github.com/cplusplus/draft/issues/559
https://github.com/cplusplus/draft/issues/825
https://github.com/cplusplus/draft/pull/1153
https://github.com/cplusplus/draft/pull/1484

We've been trying to be more consistent about these kinds of formatting
issues in the standard, as it was a bit of a mess.
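
The "always quote keywords" convention discussed above lends itself to a mechanical check. What follows is only a minimal sketch, not GCC code: it assumes the `%<` and `%>` quoting directives that GCC diagnostic format strings use, and the helper name `keyword_always_quoted` is hypothetical. It reports whether every occurrence of a keyword in a format string is wrapped in those directives.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of GCC): returns true when every occurrence
// of KEYWORD in the diagnostic format string FMT is wrapped as %<KEYWORD%>.
constexpr bool
keyword_always_quoted (std::string_view fmt, std::string_view keyword)
{
  constexpr std::string_view open = "%<";
  constexpr std::string_view close = "%>";

  // An empty keyword has nothing to quote; also avoids an endless loop.
  if (keyword.empty ())
    return true;

  std::size_t pos = 0;
  while ((pos = fmt.find (keyword, pos)) != std::string_view::npos)
    {
      const std::size_t end = pos + keyword.size ();
      // The opening directive must sit immediately before the keyword.
      const bool has_open
        = pos >= open.size ()
          && fmt.substr (pos - open.size (), open.size ()) == open;
      // The closing directive must sit immediately after it.
      const bool has_close = fmt.substr (end, close.size ()) == close;
      if (!has_open || !has_close)
        return false;
      pos = end;
    }
  return true;
}
```

Under these assumptions, "call to non-%<constexpr%> function" passes the check, while the unquoted "constexpr call flows off the end of the function" does not.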



[PATCH, GCC/ARM] Factor out CMSE register clearing code

2017-11-15 Thread Thomas Preudhomme

Hi,

Functions cmse_nonsecure_call_clear_caller_saved and
cmse_nonsecure_entry_clear_before_return both contain very similar code
to clear registers. What's worse, they differ slightly at times, so if a
bug is found in one, careful thought is needed to decide whether the
other function needs fixing too.

This commit addresses the situation by factoring the two pieces of code
into a new function. In doing so the code generated to clear VFP
registers in cmse_nonsecure_call now uses the same sequence as
cmse_nonsecure_entry functions. Test expectations are thus updated
accordingly.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.c (cmse_clear_registers): New function.
(cmse_nonsecure_call_clear_caller_saved): Replace register clearing
code by call to cmse_clear_registers.
(cmse_nonsecure_entry_clear_before_return): Likewise.

*** gcc/testsuite/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Adapt expectations
to vmov instructions now generated.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and no regression in the
testsuite.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9b494e9529a4470c18192a4561e03d2f80e90797..22c9add0722974902b2a89b2b0a75759ff8ba37c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -16991,6 +16991,128 @@ compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
   return not_to_clear_mask;
 }
 
+/* Clear registers secret before doing a cmse_nonsecure_call or returning from
+   a cmse_nonsecure_entry function.  TO_CLEAR_BITMAP indicates which registers
+   are to be fully cleared, using the value in register CLEARING_REG if more
+   efficient.  The PADDING_BITS_LEN entries array PADDING_BITS_TO_CLEAR gives
+   the bits that needs to be cleared in caller-saved core registers, with
+   SCRATCH_REG used as a scratch register for that clearing.
+
+   NOTE: one of three following assertions must hold:
+   - SCRATCH_REG is a low register
+   - CLEARING_REG is in the set of registers fully cleared (ie. its bit is set
+ in TO_CLEAR_BITMAP)
+   - CLEARING_REG is a low register.  */
+
+static void
+cmse_clear_registers (sbitmap to_clear_bitmap, uint32_t *padding_bits_to_clear,
+		  int padding_bits_len, rtx scratch_reg, rtx clearing_reg)
+{
+  bool saved_clearing = false;
+  rtx saved_clearing_reg = NULL_RTX;
+  int i, regno, clearing_regno, minregno = R0_REGNUM, maxregno = minregno - 1;
+
+  gcc_assert (arm_arch_cmse);
+
+  if (!bitmap_empty_p (to_clear_bitmap))
+{
+  minregno = bitmap_first_set_bit (to_clear_bitmap);
+  maxregno = bitmap_last_set_bit (to_clear_bitmap);
+}
+  clearing_regno = REGNO (clearing_reg);
+
+  /* Clear padding bits.  */
+  gcc_assert (padding_bits_len <= NUM_ARG_REGS);
+  for (i = 0, regno = R0_REGNUM; i < padding_bits_len; i++, regno++)
+{
+  uint64_t mask;
+  rtx rtx16, dest, cleared_reg = gen_rtx_REG (SImode, regno);
+
+  if (padding_bits_to_clear[i] == 0)
+	continue;
+
+  /* If this is a Thumb-1 target and SCRATCH_REG is not a low register, use
+	 CLEARING_REG as scratch.  */
+  if (TARGET_THUMB1
+	  && REGNO (scratch_reg) > LAST_LO_REGNUM)
+	{
+	  /* clearing_reg is not to be cleared, copy its value into scratch_reg
+	 such that we can use clearing_reg to clear the unused bits in the
+	 arguments.  */
+	  if ((clearing_regno > maxregno
+	   || !bitmap_bit_p (to_clear_bitmap, clearing_regno))
+	  && !saved_clearing)
+	{
+	  gcc_assert (clearing_regno <= LAST_LO_REGNUM);
+	  emit_move_insn (scratch_reg, clearing_reg);
+	  saved_clearing = true;
+	  saved_clearing_reg = scratch_reg;
+	}
+	  scratch_reg = clearing_reg;
+	}
+
+  /* Fill the lower half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) & 0xFFFF;
+  emit_move_insn (scratch_reg, gen_int_mode (mask, SImode));
+
+  /* Fill the top half of the negated padding_bits_to_clear[i].  */
+  mask = (~padding_bits_to_clear[i]) >> 16;
+  rtx16 = gen_int_mode (16, SImode);
+  dest = gen_rtx_ZERO_EXTRACT (SImode, scratch_reg, rtx16, rtx16);
+  if (mask)
+	emit_insn (gen_rtx_SET (dest, gen_int_mode (mask, SImode)));
+
+  emit_insn (gen_andsi3 (cleared_reg, cleared_reg, scratch_reg));
+}
+  if (saved_clearing)
+emit_move_insn (clearing_reg, saved_clearing_reg);
+
+
+  /* Clear full registers.  */
+
+  /* If not marked for clearing, clearing_reg already does not contain
+ any secret.  */
+  if (clearing_regno <= ma
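
The padding-bit clearing in the hunk above builds an AND mask in two 16-bit halves: a move of the low half, then an insert of the high half into bits 16..31 when it is nonzero. The arithmetic can be modelled outside the compiler. This is a minimal sketch under stated assumptions, not the patch's code: it assumes the low-half constant is a 16-bit mask, and the names `MaskHalves`, `split_keep_mask` and `clear_padding` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical model of the two halves of the "bits to keep" mask.
struct MaskHalves
{
  std::uint32_t low;   // bits 0..15 of the negated padding mask
  std::uint32_t high;  // bits 16..31 of the negated padding mask
};

// Negate the padding mask (1 = bit to clear) and split it into halves,
// mirroring the "& 0xFFFF" and ">> 16" steps of the patch.
constexpr MaskHalves
split_keep_mask (std::uint32_t padding_bits)
{
  const std::uint32_t keep = static_cast<std::uint32_t> (~padding_bits);
  return { keep & 0xFFFFu, keep >> 16 };
}

// Model the emitted sequence: load the low half into a scratch value,
// insert the high half into bits 16..31 only when it is nonzero, then
// AND the scratch value into the register.
constexpr std::uint32_t
clear_padding (std::uint32_t reg, std::uint32_t padding_bits)
{
  const MaskHalves m = split_keep_mask (padding_bits);
  std::uint32_t scratch = m.low;   // top half is zero after this move
  if (m.high)
    scratch |= m.high << 16;       // stands in for the ZERO_EXTRACT set
  return reg & scratch;
}
```

Skipping the second step when the high half is zero is safe in this model because the first move already leaves bits 16..31 clear.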

[PATCH, GCC/ARM] Do not clobber r4 in Armv8-M nonsecure call

2017-11-15 Thread Thomas Preudhomme

Hi,

Expanders for Armv8-M nonsecure call unnecessarily clobber r4 despite
the libcall they perform not writing to r4.  Furthermore, the
requirement for the branch target address to be in r4 as expected by
the libcall is modeled in a convoluted way in the define_insn patterns:
the address is a register match_operand constrained by the match_dup
for the clobber which is guaranteed to be r4 due to the expander.

This patch simplifies all this by simply requiring the address to be in
r4 and removing the clobbers. Expanders are left alone because
cmse_nonsecure_call_clear_caller_saved relies on branch target memory
attributes which would be lost if expanding to reg:SI R4_REGNUM.

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2017-10-24  Thomas Preud'homme  

* config/arm/arm.md (R4_REGNUM): Define constant.
(nonsecure_call_internal): Remove r4 clobber.
(nonsecure_call_value_internal): Likewise.
* config/arm/thumb1.md (nonsecure_call_reg_thumb1_v5): Remove second
clobber and resequence match_operands.
(nonsecure_call_value_reg_thumb1_v5): Likewise.
* config/arm/thumb2.md (nonsecure_call_reg_thumb2): Likewise.
(nonsecure_call_value_reg_thumb2): Likewise.

Testing: Bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ddb9d8f359007c1d86d497aef0ff5fc0e4061813..6b0794ede9fbc5a4f41e1f4a92acb9b649a277bc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -30,6 +30,7 @@
 (define_constants
   [(R0_REGNUM 0)	; First CORE register
(R1_REGNUM	  1)	; Second CORE register
+   (R4_REGNUM	  4)	; Fifth CORE register
(IP_REGNUM	 12)	; Scratch register
(SP_REGNUM	 13)	; Stack pointer
(LR_REGNUM	 14)	; Return address register
@@ -8118,14 +8119,13 @@
 			   UNSPEC_NONSECURE_MEM)
 		(match_operand 1 "general_operand" ""))
 	  (use (match_operand 2 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[0], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[0] = replace_equiv_address (operands[0], tmp);
@@ -8210,14 +8210,13 @@
 UNSPEC_NONSECURE_MEM)
 			 (match_operand 2 "general_operand" "")))
 	  (use (match_operand 3 "" ""))
-	  (clobber (reg:SI LR_REGNUM))
-	  (clobber (reg:SI 4))])]
+	  (clobber (reg:SI LR_REGNUM))])]
   "use_cmse"
   "
   {
 rtx tmp;
 tmp = copy_to_suggested_reg (XEXP (operands[1], 0),
- gen_rtx_REG (SImode, 4),
+ gen_rtx_REG (SImode, R4_REGNUM),
  SImode);
 
 operands[1] = replace_equiv_address (operands[1], tmp);
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 5d196a673355a7acf7d0ed30f21b997b815913f5..f91659386bf240172bd9a3076722683c8a50dff4 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1732,12 +1732,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb1_v5"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "register_operand" "l*r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse && !SIBLING_CALL_P (insn)"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
@@ -1779,12 +1778,11 @@
 (define_insn "*nonsecure_call_value_reg_thumb1_v5"
   [(set (match_operand 0 "" "")
 	(call (unspec:SI
-	   [(mem:SI (match_operand:SI 1 "register_operand" "l*r"))]
+	   [(mem:SI (reg:SI R4_REGNUM))]
 	   UNSPEC_NONSECURE_MEM)
-	  (match_operand 2 "" "")))
-   (use (match_operand 3 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 1))]
+	  (match_operand 1 "" "")))
+   (use (match_operand 2 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB1 && use_cmse"
   "bl\\t__gnu_cmse_nonsecure_call"
   [(set_attr "length" "4")
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 776d611d2538e790a5f504995050ffdfc51d7193..d56a8bd167575263edc2a4b3f66bda34a4a7a72a 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -555,12 +555,11 @@
 )
 
 (define_insn "*nonsecure_call_reg_thumb2"
-  [(call (unspec:SI [(mem:SI (match_operand:SI 0 "s_register_operand" "r"))]
+  [(call (unspec:SI [(mem:SI (reg:SI R4_REGNUM))]
 		UNSPEC_NONSECURE_MEM)
-	 (match_operand 1 "" ""))
-   (use (match_operand 2 "" ""))
-   (clobber (reg:SI LR_REGNUM))
-   (clobber (match_dup 0))]
+	 (match_operand 0 "" ""))
+   (use (match_operand 1 "" ""))
+   (clobber (reg:SI LR_REGNUM))]
   "TARGET_THUMB2 && use_cmse"
   "bl\\t__gnu_cmse_nonsecure_ca

Re: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-15 Thread Sandra Loosemore

On 11/15/2017 10:00 AM, Tamar Christina wrote:


On 11/15/2017 04:51 AM, Tamar Christina wrote:

Hi All,

This patch updates the documentation for AArch64 and ARM correcting
the use of the architecture namings by adding the -A suffix in appropriate
places.

Just to clarify, was the documentation previously using incorrect terminology,
or are there new non-A ARMv7 and ARMv8 architectures that invalidate
existing uses of those terms without the -A suffix?


Yes, there are the -M and -R suffixes/profiles. A lot of the documentation was
written before these existed. It is mainly a find and replace, but I tried to
determine for each change whether the instructions exist in the other profiles.
Hopefully they're all correct, but I'll leave that for the review.


OK.  I have no objection to the patch from a documentation point of 
view, but I'll defer to the port maintainers for technical review.


-Sandra


Re: [PATCH, rs6000] Repair vec_xl, vec_xst, vec_xl_be, vec_xst_be built-in functions

2017-11-15 Thread Segher Boessenkool
Hi!

On Tue, Nov 14, 2017 at 02:24:13PM -0600, Bill Schmidt wrote:
> +  for (i = 0; i < 16; ++i)
> + perm[i] = GEN_INT (reorder[i]);
> +
> +  pcv = force_reg (V16QImode,
> +   gen_rtx_CONST_VECTOR (V16QImode,
> +  gen_rtvec_v (16, perm)));
> +  emit_insn (gen_altivec_vperm_v8hi_direct (operands[0], subreg2,
> + subreg2, pcv));
> +  DONE;

Many whitespace problems on these lines, please fix.  More times later.

> +   (match_operand:V16QI 1 "vsx_register_operand" "wa")
> +   (parallel [(const_int 15) (const_int 14)
> +  (const_int 13) (const_int 12)
> +  (const_int 11) (const_int 10)
> +  (const_int  9) (const_int  8)
> +  (const_int  7) (const_int  6)
> +  (const_int  5) (const_int  4)
> +  (const_int  3) (const_int  2)
> +  (const_int  1) (const_int  0)])))]

Here, too.

The rest looks fine.  Thanks!


Segher


Re: lambda-switch regression

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 12:06 -0500, David Malcolm wrote:
> On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:
> > g++.dg/lambda/lambda-switch.C Has recently regressed.  
> 
> g++.dg/cpp0x/lambda/lambda-switch.C
> 
> > It appears the 
> > location of a warning message has moved.
> > 
> >   l = []()  // { dg-warning "statement will never
> > be executed" }
> > {
> > case 3: // { dg-error "case" }
> >   break;// { dg-error "break" }
> > };  <--- warning now here
> > 
> > We seem to be diagnosing the last line of the statement, not the
> > first. 
> > That seems not useful.
> > 
> > I've not investigated what patch may have caused this, on the
> > chance 
> > someone might already know?
> > 
> > nathan
> 
> The warning was added in r236597 (aka
> 1398da0f786e120bb0b407e84f412aa9fc6d80ee):
> 
> +2016-05-23  Marek Polacek  
> +
> +   PR c/49859
> +   * common.opt (Wswitch-unreachable): New option.
> +   * doc/invoke.texi: Document -Wswitch-unreachable.
> +   * gimplify.c (gimplify_switch_expr): Implement the -Wswitch-
> unreachable
> +   warning.
> 
> which had it there (23:7).
> 
> r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
> warning from 23:7 up to the "[] ()" at 19:6 in:
> 
> +2017-01-20  Marek Polacek  
> +
> +   PR c/64279
> [...snip...]
> +   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.
> 
> I tried it with some working copies I have to hand:
> - works for me with r254387 (2017-11-03)
> - fails for me with r254700 (2017-11-13)
> 
> so hopefully that helps track it down.
> 
> Dave

Searching in the November archives of the gcc-regression ML for
"lambda-switch.c":

https://gcc.gnu.org/cgi-bin/search.cgi?wm=wrd&form=extended&m=all&s=D&q=lambda-switch.c&ul=%2Fml%2Fgcc-regression%2F2017-11%2F%25

showed e.g.:
  https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00173.html
   "Regressions on trunk at revision 254648 vs revision 254623"

which says this is a new failure somewhere in that range; so it
presumably happened sometime on 2017-11-10 after r254623 and up to
(maybe ==) r254648.

Looking at:
   svn log -r r254623:r254648 |less
nothing jumps out at me as being related.

Hope this is helpful
Dave


Re: lambda-switch regression

2017-11-15 Thread Martin Sebor

On 11/15/2017 06:03 AM, Nathan Sidwell wrote:

g++.dg/lambda/lambda-switch.C Has recently regressed.  It appears the
location of a warning message has moved.

  l = []()// { dg-warning "statement will never be executed" }
{
case 3:// { dg-error "case" }
  break;// { dg-error "break" }
};  <--- warning now here

We seem to be diagnosing the last line of the statement, not the first.
That seems not useful.

I've not investigated what patch may have caused this, on the chance
someone might already know?


Bug 82988 points to my r254630 as the commit that triggered it.
I haven't yet looked into it.  There's some small chance that it
was caused by bug 82977 that Jakub just fixed.

Martin



Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-15 Thread Wilco Dijkstra
Sandra Loosemore wrote:

> I'd prefer that you remove the reference to configure options entirely 
> here.  Nowadays most GCC users install a package provided by their OS 
> distribution, Linaro, etc, rather than trying to build GCC from scratch.

OK, I've removed that reference. Similarly the FRAME_POINTER_REQUIRED
bit as that statement is not only irrelevant but also completely incorrect.

> > +Enabled at levels @option{-O}, @option{-O1}, @option{-O2}, @option{-O3},
> > +@option{-Os} and @option{-Og}.
>
> This last sentence makes no sense.  If the option is now enabled by 
> default, then the optimization level is irrelevant.

It's enabled from -O onwards, so I've changed it to the standard form used
elsewhere and updated the table for -O:

+Enabled by default at @option{-O} and higher.

Here is the cleaned up and simplified version:


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2ef88e081f982f5619132cc33ce23c3fb542ae11..158c9ae3f1297a1265fc974cd3e6825d8f5be096 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7258,6 +7258,7 @@ compilation time.
 -fipa-reference @gol
 -fmerge-constants @gol
 -fmove-loop-invariants @gol
+-fomit-frame-pointer @gol
 -freorder-blocks @gol
 -fshrink-wrap @gol
 -fshrink-wrap-separate @gol
@@ -7282,9 +7283,6 @@ compilation time.
 -ftree-ter @gol
 -funit-at-a-time}
 
-@option{-O} also turns on @option{-fomit-frame-pointer} on machines
-where doing so does not interfere with debugging.
-
 @item -O2
 @opindex O2
 Optimize even more.  GCC performs nearly all supported optimizations
@@ -7436,29 +7434,18 @@ The default is @option{-ffp-contract=fast}.
 
 @item -fomit-frame-pointer
 @opindex fomit-frame-pointer
-Don't keep the frame pointer in a register for functions that
-don't need one.  This avoids the instructions to save, set up and
-restore frame pointers; it also makes an extra register available
-in many functions.  @strong{It also makes debugging impossible on
-some machines.}
-
-On some machines, such as the VAX, this flag has no effect, because
-the standard calling sequence automatically handles the frame pointer
-and nothing is saved by pretending it doesn't exist.  The
-machine-description macro @code{FRAME_POINTER_REQUIRED} controls
-whether a target machine supports this flag.  @xref{Registers,,Register
-Usage, gccint, GNU Compiler Collection (GCC) Internals}.
-
-The default setting (when not optimizing for
-size) for 32-bit GNU/Linux x86 and 32-bit Darwin x86 targets is
-@option{-fomit-frame-pointer}.  You can configure GCC with the
-@option{--enable-frame-pointer} configure option to change the default.
-
-Note that @option{-fno-omit-frame-pointer} doesn't force a new stack
-frame for all functions if it isn't otherwise needed, and hence doesn't
-guarantee a new frame pointer for all functions.
+Omit the frame pointer in functions that don't need one.  This avoids the
+instructions to save, set up and restore the frame pointer; on many targets
+it also makes an extra register available.
 
-Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}.
+On some targets this flag has no effect because the standard calling sequence
+always uses a frame pointer, so it cannot be omitted.
+
+Note that @option{-fno-omit-frame-pointer} doesn't guarantee the frame pointer
+is used in all functions.  Several targets always omit the frame pointer in
+leaf functions.
+
+Enabled by default at @option{-O} and higher.
 
 @item -foptimize-sibling-calls
 @opindex foptimize-sibling-calls
@@ -16753,9 +16740,7 @@ Certain other options, such as @option{-mid-shared-library} and
 @opindex momit-leaf-frame-pointer
 Don't keep the frame pointer in a register for leaf functions.  This
 avoids the instructions to save, set up and restore frame pointers and
-makes an extra register available in leaf functions.  The option
-@option{-fomit-frame-pointer} removes the frame pointer for all functions,
-which might make debugging harder.
+makes an extra register available in leaf functions.
 
 @item -mspecld-anomaly
 @opindex mspecld-anomaly


Re: lambda-switch regression

2017-11-15 Thread David Malcolm
On Wed, 2017-11-15 at 12:25 -0500, David Malcolm wrote:
> On Wed, 2017-11-15 at 12:06 -0500, David Malcolm wrote:
> > On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:
> > > g++.dg/lambda/lambda-switch.C Has recently regressed.  
> > 
> > g++.dg/cpp0x/lambda/lambda-switch.C
> > 
> > > It appears the 
> > > location of a warning message has moved.
> > > 
> > > l = []()  // { dg-warning "statement will never
> > > be executed" }
> > >   {
> > >   case 3: // { dg-error "case" }
> > > break;// { dg-error "break" }
> > >   };  <--- warning now here
> > > 
> > > We seem to be diagnosing the last line of the statement, not the
> > > first. 
> > > That seems not useful.
> > > 
> > > I've not investigated what patch may have caused this, on the
> > > chance 
> > > someone might already know?
> > > 
> > > nathan
> > 
> > The warning was added in r236597 (aka
> > 1398da0f786e120bb0b407e84f412aa9fc6d80ee):
> > 
> > +2016-05-23  Marek Polacek  
> > +
> > +   PR c/49859
> > +   * common.opt (Wswitch-unreachable): New option.
> > +   * doc/invoke.texi: Document -Wswitch-unreachable.
> > +   * gimplify.c (gimplify_switch_expr): Implement the
> > -Wswitch-
> > unreachable
> > +   warning.
> > 
> > which had it there (23:7).
> > 
> > r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
> > warning from 23:7 up to the "[] ()" at 19:6 in:
> > 
> > +2017-01-20  Marek Polacek  
> > +
> > +   PR c/64279
> > [...snip...]
> > +   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.
> > 
> > I tried it with some working copies I have to hand:
> > - works for me with r254387 (2017-11-03)
> > - fails for me with r254700 (2017-11-13)
> > 
> > so hopefully that helps track it down.
> > 
> > Dave
> 
> Searching in the November archives of the gcc-regression ML for
> "lambda-switch.c":
> 
> https://gcc.gnu.org/cgi-bin/search.cgi?wm=wrd&form=extended&m=all&s=D
> &q=lambda-switch.c&ul=%2Fml%2Fgcc-regression%2F2017-11%2F%25
> 
> showed e.g.:
>   https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00173.html
>"Regressions on trunk at revision 254648 vs revision 254623"
> 
> which says this is a new failure somewhere in that range; so it
> presumably happened sometime on 2017-11-10 after r254623 and up to
> (maybe ==) r254648.
> 
> Looking at:
>svn log -r r254623:r254648 |less
> nothing jumps out at me as being related.
> 
> Hope this is helpful
> Dave

Actually, https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00157.html
has a tighter range: r254628 vs r254635.

Looking at:
  svn log -r r254628:r254635 |less
I see msebor's r254630 ("PR c/81117 - Improve buffer overflow checking
in strncpy") has:

* gimple.c (gimple_build_call_from_tree): Set call location.

with:
+  gimple_set_location (call, EXPR_LOCATION (t));

Maybe that's it?  (nothing else in that commit range seems to affect
locations).

Dave


Re: [PATCH] make canonicalize_condition keep its promise

2017-11-15 Thread Peter Bergner
On 11/15/17 9:40 AM, Aaron Sawdey wrote:
> Index: gcc/rtlanal.c
> ===
> --- gcc/rtlanal.c   (revision 254553)
> +++ gcc/rtlanal.c   (working copy)
> @@ -5623,7 +5623,11 @@
>if (CC0_P (op0))
>  return 0;
>  
> -  return gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
> +  /* We promised to return a comparison.  */
> +  rtx ret = gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
> +  if (COMPARISON_P (ret))
> +return ret;
> +  return 0;

I have no input on whether this approach is correct or not, but...
I know the return above this returns 0 as do other locations in
the file, but new code should return NULL_RTX.

Peter



[PATCH] Add noexcept to generic std::size, std::empty and std::data

2017-11-15 Thread Jonathan Wakely

The standard doesn't say these are noexcept, but they can be.

* include/bits/range_access.h (size, empty, data): Add conditional
noexcept to generic overloads.

Tested powerpc64le-linux, committed to trunk.


commit 9348811e74851f9ce6594cbe1b98a855193867dc
Author: Jonathan Wakely 
Date:   Wed Nov 15 17:38:28 2017 +

Add noexcept to generic std::size, std::empty and std::data

* include/bits/range_access.h (size, empty, data): Add conditional
noexcept to generic overloads.

diff --git a/libstdc++-v3/include/bits/range_access.h b/libstdc++-v3/include/bits/range_access.h
index 3987c2addf1..2a037ad8082 100644
--- a/libstdc++-v3/include/bits/range_access.h
+++ b/libstdc++-v3/include/bits/range_access.h
@@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #endif // C++14
 
-#if __cplusplus > 201402L
+#if __cplusplus >= 201703L
 #define __cpp_lib_nonmember_container_access 201411
 
   /**
@@ -239,7 +239,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template 
 constexpr auto
-size(const _Container& __cont) -> decltype(__cont.size())
+size(const _Container& __cont) noexcept(noexcept(__cont.size()))
+-> decltype(__cont.size())
 { return __cont.size(); }
 
   /**
@@ -257,7 +258,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template 
 constexpr auto
-empty(const _Container& __cont) -> decltype(__cont.empty())
+empty(const _Container& __cont) noexcept(noexcept(__cont.empty()))
+-> decltype(__cont.empty())
 { return __cont.empty(); }
 
   /**
@@ -284,7 +286,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template 
 constexpr auto
-data(_Container& __cont) -> decltype(__cont.data())
+data(_Container& __cont) noexcept(noexcept(__cont.data()))
+-> decltype(__cont.data())
 { return __cont.data(); }
 
   /**
@@ -293,7 +296,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template 
 constexpr auto
-data(const _Container& __cont) -> decltype(__cont.data())
+data(const _Container& __cont) noexcept(noexcept(__cont.data()))
+-> decltype(__cont.data())
 { return __cont.data(); }
 
   /**
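
The effect of the conditional noexcept added above can be seen in isolation. The following is a minimal sketch under stated assumptions rather than the libstdc++ header itself: `sketch::size` copies the shape of the patched overload, and the container types `NoThrowBox` and `MayThrowBox` are hypothetical. The free function is noexcept exactly when the member call it forwards to is.

```cpp
#include <cstddef>
#include <utility>

namespace sketch
{
  // Same shape as the patched generic overload: the exception
  // specification is computed from the member function being called.
  template <typename Container>
    constexpr auto
    size (const Container& cont) noexcept (noexcept (cont.size ()))
    -> decltype (cont.size ())
    { return cont.size (); }
}

// Hypothetical container whose size() promises not to throw.
struct NoThrowBox
{
  constexpr std::size_t size () const noexcept { return 3; }
};

// Hypothetical container whose size() makes no such promise.
struct MayThrowBox
{
  std::size_t size () const { return 3; }
};

// The outer noexcept operator observes the propagated specification.
constexpr bool nothrow_case
  = noexcept (sketch::size (std::declval<const NoThrowBox&> ()));
constexpr bool maythrow_case
  = noexcept (sketch::size (std::declval<const MayThrowBox&> ()));
```

Here `nothrow_case` is true and `maythrow_case` is false, so callers that branch on noexcept see through the wrapper to the underlying member.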


Re: [PATCH] PR fortran/78240 -- kludge of the day

2017-11-15 Thread Steve Kargl
On Tue, Nov 14, 2017 at 05:21:41PM -0500, Fritz Reese wrote:
> On Tue, Nov 14, 2017 at 4:58 PM, Janus Weil  wrote:
> > Hi guys,
> >
> > I see this new test case failing on x86_64-linux-gnu:
> >
> > FAIL: gfortran.dg/pr78240.f90   -O  (test for excess errors)
> >
> >
> > $ gfortran-8 pr78240.f90
> > pr78240.f90:11:12:
> >
> >integer x(n)/1/   ! { dg-error "Nonconstant array" }
> > 1
> > Error: Variable ‘n’ cannot appear in the expression at (1)
> > pr78240.f90:11:14:
> >
> >integer x(n)/1/   ! { dg-error "Nonconstant array" }
> >   1
> > Error: The module or main program array ‘x’ at (1) must have constant shape
> > pr78240.f90:11:19:
> >
> >integer x(n)/1/   ! { dg-error "Nonconstant array" }
> >1
> > Error: Nonconstant array section at (1) in DATA statement
> > [...]
> 
> ... does anyone know how to tell dejagnu to expect multiple errors on
> a single line?
> 

I've fixed the problem with this patch.

2017-11-15  Steven G. Kargl  

PR fortran/78240
gfortran.dg/pr78240.f90: Prune run-on errors.


Index: gcc/testsuite/gfortran.dg/pr78240.f90
===
--- gcc/testsuite/gfortran.dg/pr78240.f90   (revision 254779)
+++ gcc/testsuite/gfortran.dg/pr78240.f90   (working copy)
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-options "-w" }
 !
 ! PR fortran/78240
 !
@@ -8,5 +9,7 @@
 !
 
 program p
-  integer x(n)/1/   ! { dg-error "Nonconstant array" }
+  integer x(n)/1/   ! { dg-error "cannot appear in the expression" }
 end
+! { dg-prune-output "module or main program" }
+! { dg-prune-output "Nonconstant array" }

-- 
Steve


Re: [PATCH] PR fortran/78240 -- kludge of the day

2017-11-15 Thread Fritz Reese
On Wed, Nov 15, 2017 at 1:13 PM, Steve Kargl
 wrote:
> On Tue, Nov 14, 2017 at 05:21:41PM -0500, Fritz Reese wrote:
>> On Tue, Nov 14, 2017 at 4:58 PM, Janus Weil  wrote:
>> > Hi guys,
>> >
>> > I see this new test case failing on x86_64-linux-gnu:
>> >
>> > FAIL: gfortran.dg/pr78240.f90   -O  (test for excess errors)
...
>>
>
> I've fixed the problem with this patch.
>
> 2017-11-15  Steven G. Kargl  
>
> PR fortran/78240
> gfortran.dg/pr78240.f90: Prune run-on errors.
>
>
> Index: gcc/testsuite/gfortran.dg/pr78240.f90
> ===
> --- gcc/testsuite/gfortran.dg/pr78240.f90   (revision 254779)
> +++ gcc/testsuite/gfortran.dg/pr78240.f90   (working copy)
> @@ -1,4 +1,5 @@
>  ! { dg-do compile }
> +! { dg-options "-w" }
>  !
>  ! PR fortran/78240
>  !
> @@ -8,5 +9,7 @@
>  !
>
>  program p
> -  integer x(n)/1/   ! { dg-error "Nonconstant array" }
> +  integer x(n)/1/   ! { dg-error "cannot appear in the expression" }
>  end
> +! { dg-prune-output "module or main program" }
> +! { dg-prune-output "Nonconstant array" }
>
> --
> Steve


Thanks! I was planning to commit the very same.

---
Fritz Reese


[PATCH] Minor improvements to Filesystem tests

2017-11-15 Thread Jonathan Wakely

Make these tests a little more robust.

* testsuite/27_io/filesystem/iterators/directory_iterator.cc: Leave
error_code unset.
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Check for past-the-end before dereferencing.
* testsuite/experimental/filesystem/iterators/
recursive_directory_iterator.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.

commit ff95dc810ac57a0277d62bb122f7912d37a7cfd5
Author: Jonathan Wakely 
Date:   Wed Nov 15 18:10:52 2017 +

Minor improvements to Filesystem tests

* testsuite/27_io/filesystem/iterators/directory_iterator.cc: Leave
error_code unset.
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Check for past-the-end before dereferencing.
* testsuite/experimental/filesystem/iterators/
recursive_directory_iterator.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc b/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
index c3e6f01670a..9cdbd7aafa0 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
@@ -61,7 +61,6 @@ test01()
   ec = bad_ec;
   permissions(p, fs::perms::none, ec);
   VERIFY( !ec );
-  ec = bad_ec;
   iter = fs::directory_iterator(p, ec);
   VERIFY( ec );
   VERIFY( iter == end(iter) );
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc b/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc
index 1ef450fc907..d41a1506d3b 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc
@@ -87,6 +87,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;  // should recurse into d1
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   iter.increment(ec);  // should fail to recurse into p/d1/d2
   VERIFY( ec );
@@ -99,6 +100,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;  // should recurse into d1
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   ec = bad_ec;
   iter.increment(ec);  // should fail to recurse into p/d1/d2, so skip it
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc b/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc
index 50cc7d45de8..584cfeed839 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc
@@ -56,6 +56,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   ++iter;
   VERIFY( iter == end(iter) );
@@ -88,6 +89,7 @@ test01()
   VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1" );
   ++iter;  // should recurse into d1
+  VERIFY( iter != end(iter) );
   VERIFY( iter->path() == p/"d1/d2" );
   iter.increment(ec);  // should fail to recurse into p/d1/d2
   VERIFY( ec );


Re: [PATCH, rs6000] Correct some Power9 scheduling info

2017-11-15 Thread Pat Haugen
On 09/27/2017 12:56 PM, Pat Haugen wrote:
> The following patch corrects some Power9 resource requirements and
> instruction latencies. Bootstrap/regtest on powerpc64le-linux with no
> new regressions. Ok for trunk?

Updated patch follows. Bootstrap/regtest on powerpc64le-linux (Power9)
with no regressions. Ok for trunk?

-Pat

2017-11-15  Pat Haugen  

* rs6000/power9.md (power9fpdiv): New automaton and cpu_unit defined
for it.
(DU_C2_3_power9): Correct reservation combinations.
(FP_DIV_power9, VEC_DIV_power9): New.
(power9-alu): Split out rotate/shift...
(power9-rot): ...to here, correct dispatch resource.
(power9-cracked-alu, power9-mul, power9-mul-compare): Correct dispatch
resource.
(power9-fp): Correct latency.
(power9-sdiv): Add div/sqrt resource.
(power9-ddiv): Correct latency, add div/sqrt resource.
(power9-sqrt, power9-dsqrt): Add div/sqrt resource.
(power9-vecfdiv, power9-vecdiv): Correct latency, add div/sqrt
resource.
(power9-qpdiv, power9-qpmul): Adjust resource usage.


Index: gcc/config/rs6000/power9.md
===
--- gcc/config/rs6000/power9.md	(revision 254708)
+++ gcc/config/rs6000/power9.md	(working copy)
@@ -19,7 +19,7 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-(define_automaton "power9dsp,power9lsu,power9vsu,power9misc")
+(define_automaton "power9dsp,power9lsu,power9vsu,power9fpdiv,power9misc")
 
 (define_cpu_unit "lsu0_power9,lsu1_power9,lsu2_power9,lsu3_power9" "power9lsu")
 (define_cpu_unit "vsu0_power9,vsu1_power9,vsu2_power9,vsu3_power9" "power9vsu")
@@ -28,7 +28,11 @@
 ; Two fixed point divide units, not pipelined
 (define_cpu_unit "fx_div0_power9,fx_div1_power9" "power9misc")
 (define_cpu_unit "bru_power9,cryptu_power9,dfu_power9" "power9misc")
+; Create a false unit for use by non-pipelined FP div/sqrt
+(define_cpu_unit "fp_div0_power9,fp_div1_power9,fp_div2_power9,fp_div3_power9"
+		 "power9fpdiv")
 
+
 (define_cpu_unit "x0_power9,x1_power9,xa0_power9,xa1_power9,
 		  x2_power9,x3_power9,xb0_power9,xb1_power9,
 		  br0_power9,br1_power9" "power9dsp")
@@ -79,8 +83,7 @@
 
 ; 2-way cracked plus 3rd slot
 (define_reservation "DU_C2_3_power9" "x0_power9+x1_power9+xa0_power9|
-  x1_power9+x2_power9+xa0_power9|
-  x1_power9+x2_power9+xb0_power9|
+  x1_power9+x2_power9+xa1_power9|
   x2_power9+x3_power9+xb0_power9")
 
 ; 3-way cracked (consumes whole decode/dispatch cycle)
@@ -108,7 +111,19 @@
 
 (define_reservation "VSU_PRM_power9" "prm0_power9|prm1_power9")
 
+; Define the reservation to be used by FP div/sqrt which allows other insns
+; to be issued to the VSU, but blocks other div/sqrt for a number of cycles.
+; Note that the number of cycles blocked varies depending on insn, but we
+; just use the same number for all in order to keep the number of DFA states
+; reasonable.
+(define_reservation "FP_DIV_power9"
+		"fp_div0_power9*8|fp_div1_power9*8|fp_div2_power9*8|
+		 fp_div3_power9*8")
+(define_reservation "VEC_DIV_power9"
+		"fp_div0_power9*8+fp_div1_power9*8|
+		 fp_div2_power9*8+fp_div3_power9*8")
 
+
 ; LS Unit
 (define_insn_reservation "power9-load" 4
   (and (eq_attr "type" "load")
@@ -243,9 +258,7 @@
 
 ; Most ALU insns are simple 2 cycle, including record form
 (define_insn_reservation "power9-alu" 2
-  (and (ior (eq_attr "type" "add,exts,integer,logical,isel")
-	(and (eq_attr "type" "insert,shift")
-		 (eq_attr "dot" "no")))
+  (and (eq_attr "type" "add,exts,integer,logical,isel")
(eq_attr "cpu" "power9"))
   "DU_any_power9,VSU_power9")
 ; 5 cycle CR latency
@@ -252,12 +265,19 @@
 (define_bypass 5 "power9-alu"
 		 "power9-crlogical,power9-mfcr,power9-mfcrf")
 
+; Rotate/shift prevent use of third slot
+(define_insn_reservation "power9-rot" 2
+  (and (eq_attr "type" "insert,shift")
+   (eq_attr "dot" "no")
+   (eq_attr "cpu" "power9"))
+  "DU_slice_3_power9,VSU_power9")
+
 ; Record form rotate/shift are cracked
 (define_insn_reservation "power9-cracked-alu" 2
   (and (eq_attr "type" "insert,shift")
(eq_attr "dot" "yes")
(eq_attr "cpu" "power9"))
-  "DU_C2_power9,VSU_power9")
+  "DU_C2_3_power9,VSU_power9")
 ; 7 cycle CR latency
 (define_bypass 7 "power9-cracked-alu"
 		 "power9-crlogical,power9-mfcr,power9-mfcrf")
@@ -291,13 +311,13 @@
   (and (eq_attr "type" "mul")
(eq_attr "dot" "no")
(eq_attr "cpu" "power9"))
-  "DU_any_power9,VSU_power9")
+  "DU_slice_3_power9,VSU_power9")
 
 (define_insn_reservation "power9-mul-compare" 5
   (and (eq_attr "type" "mul")
(eq_attr "dot" "yes")
(eq_attr "cpu" "power9"))
-  "DU_C2_power9,VSU_power9")
+  "DU_C2_3_power9,VSU_power9")
 ; 10 cycle CR latency
 (define_bypass 10 "power9-mul-compare"
 		 "power9-crlogical,power9-mfcr,power9-mfcrf")
@@ -349,7 +369,7 @@
(eq_attr "cpu" "power9"))
 

Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread Uros Bizjak
On Wed, Nov 15, 2017 at 5:59 PM, H.J. Lu  wrote:
> On Wed, Nov 15, 2017 at 8:09 AM, Uros Bizjak  wrote:
>> On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu  wrote:
>>> -mzeroupper is specified to generate vzeroupper instruction.  If it
>>> isn't used, the default should depend on !TARGET_AVX512ER.  Users can
>>> always use -mzeroupper or -mno-zeroupper to override it.
>>>
>>> Sebastian, can you run the full test with it?
>>>
>>> OK for trunk if there is no regression?
>>
>> If we want to go this way, please add relevant tune flag (e.g.
>> X86_TUNE_EMIT_VZEROUPPER) and use it for ~m_KNL. This tune is the
>> property of the processor model, not ISA.
>
> How about this?  OK for trunk if there are no regressions?

> gcc/
>
> PR target/82990
> * config/i386/i386.c (pass_insert_vzeroupper::gate): Remove
> TARGET_AVX512ER check.
> (ix86_option_override_internal): Set MASK_VZEROUPPER if
> neither -mzeroupper nor -mno-zeroupper is used and
> TARGET_EMIT_VZEROUPPER is set.
> * config/i386/i386.h (TARGET_EMIT_VZEROUPPER): New.
> * config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER.
>
> gcc/testsuite/
>
> PR target/82990
> * gcc.target/i386/pr82942-2.c: Add -mtune=knl.
> * gcc.target/i386/pr82990-1.c: New test.
> * gcc.target/i386/pr82990-2.c: Likewise.
> * gcc.target/i386/pr82990-3.c: Likewise.
> * gcc.target/i386/pr82990-4.c: Likewise.
> * gcc.target/i386/pr82990-5.c: Likewise.
> * gcc.target/i386/pr82990-6.c: Likewise.
> * gcc.target/i386/pr82990-7.c: Likewise.

OK.

Thanks,
Uros.


[PING**2] [PATCH] Add a warning for invalid function casts

2017-11-15 Thread Bernd Edlinger
Ping...

On 11/08/17 17:55, Bernd Edlinger wrote:
> Ping...
> 
> for the C++ part of this patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00559.html
> 
> 
> Thanks
> Bernd.
> 
>> On 10/10/17 00:30, Bernd Edlinger wrote:
>>> On 10/09/17 20:34, Martin Sebor wrote:
>>>> On 10/09/2017 11:50 AM, Bernd Edlinger wrote:
>>>>> On 10/09/17 18:44, Martin Sebor wrote:
>>>>>> On 10/07/2017 10:48 AM, Bernd Edlinger wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I think I have now something useful, it has a few more heuristics
>>>>>>> added, to reduce the number of false-positives so that it
>>>>>>> is able to find real bugs, for instance in openssl it triggers
>>>>>>> at a function cast which has already a TODO on it.
>>>>>>>
>>>>>>> The heuristics are:
>>>>>>> - handle void (*)(void) as a wild-card function type.
>>>>>>> - ignore volatile, const qualifiers on parameters/return.
>>>>>>> - handle any pointers as equivalent.
>>>>>>> - handle integral types, enums, and booleans of same precision
>>>>>>>     and signedness as equivalent.
>>>>>>> - stop parameter validation at the first "...".
>>>>>>
>>>>>> These sound quite reasonable to me.  I have a reservation about
>>>>>> just one of them, and some comments about other aspects of the
>>>>>> warning.  Sorry if this seems like a lot.  I'm hoping you'll
>>>>>> find the feedback constructive.
>>>>>>
>>>>>> I don't think using void(*)(void) to suppress the warning is
>>>>>> a robust solution because it's not safe to call a function that
>>>>>> takes arguments through such a pointer (especially not if one
>>>>>> or more of the arguments is a pointer).  Depending on the ABI,
>>>>>> calling a function that expects arguments with none could also
>>>>>> mess up the stack as the callee may pop arguments that were
>>>>>> never passed to it.
>>>>>>
>>>>>
>>>>> This is of course only a heuristic, and if there is no warning
>>>>> that does not mean any guarantee that there can't be a problem
>>>>> at runtime.  The heuristic is only meant to separate the
>>>>> bad from the very bad type-cast.  In my personal opinion there
>>>>> is not a single good type cast.
>>>>
>>>> I agree.  Since the warning uses one kind of a cast as an escape
>>>> mechanism from the checking it should be one whose result can
>>>> the most likely be used to call the function without undefined
>>>> behavior.
>>>>
>>>> Since it's possible to call any function through a pointer to
>>>> a function with no arguments (simply by providing arguments of
>>>> matching types) it's a reasonable candidate.
>>>>
>>>> On the other hand, since it is not safe to call an arbitrary
>>>> function through void (*)(void), it's not as good a candidate.
>>>>
>>>> Another reason why I think a prototype-less function is a good
>>>> choice is because the alias and ifunc attributes already use it
>>>> as an escape mechanism from their type incompatibility warning.
>>>>
>>>
>>> I know of pre-existing code-bases where a type-cast to type:
>>> void (*) (void);
>>>
>>> .. is already used as a generic function pointer: libffi and
>>> libgo, I would not want to break these.
>>>
>>> Actually when I have a type:
>>> X (*) (...);
>>>
>>> I would like to make sure that the warning checks that
>>> only functions returning X are assigned.
>>>
>>> and for X (*) (Y, );
>>>
>>> I would like to check that anything returning X with
>>> first argument of type Y is assigned.
>>>
>>> There are code bases where such a scheme is used.
>>> For instance one that I myself maintain: the OPC/UA AnsiC Stack,
>>> where I have this type definition:
>>>
>>> typedef OpcUa_StatusCode (OpcUa_PfnInvokeService)(OpcUa_Endpoint
>>> hEndpoint, ...);
>>>
>>> And this plays well together with this warning, because only
>>> functions are assigned that match up to the ...);
>>> Afterwards this pointer is cast back to the original signature,
>>> so everything is perfectly fine.
>>>
>>> Regarding the cast from pointer to member to function, I see also a
>>> warning without -Wpedantic:
>>> warning: converting from »void (S::*)(int*)« to »void (*)(int*)«
>>> [-Wpmf-conversions]
>>>  F *pf = (F*)&S::foo;
>>>  ^~~
>>>
>>> And this one is even default-enabled, so I think that should be
>>> more than sufficient.
>>>
>>> I also changed the heuristic, so that your example with the enum should
>>> now work.  I did not add it to the test case, because it would
>>> break with -fshort-enums :(
>>>
>>> Attached I have an updated patch that extends this warning to the
>>> pointer-to-member function cast, and relaxes the heuristic on the
>>> benign integral type differences a bit further.
>>>
>>>
>>> Is it OK for trunk after bootstrap and reg-testing?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
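
A note on the "generic function pointer" usage discussed above. The following is only a minimal sketch, not code from the patch: it assumes that void (*)(void) is used purely as a storage type and that the pointer is converted back to its real signature before any call, which is the pattern the thread attributes to libffi and libgo. All names (generic_fn, erase, call_add) are hypothetical.

```cpp
#include <cassert>

// Hypothetical names throughout; none of this is taken from the patch.
using generic_fn = void (*)(void);   // the "wild-card" function pointer type

int add(int a, int b) { return a + b; }

// Store a function pointer in the generic slot.  Under the heuristics listed
// in the thread, a cast to void (*)(void) is the one that is not diagnosed.
generic_fn erase(int (*f)(int, int))
{
  return reinterpret_cast<generic_fn>(f);
}

// Convert back to the original signature before calling.  Calling through
// generic_fn directly would be the unsafe case raised in the review.
int call_add(generic_fn g, int a, int b)
{
  auto f = reinterpret_cast<int (*)(int, int)>(g);
  return f(a, b);
}

// Round trip: the value survives the conversion to the generic type and back.
void round_trip_demo()
{
  assert(call_add(erase(add), 2, 3) == 5);
}
```

Converting a function pointer to another function pointer type and back yields the original value, so the call in call_add is well defined; only a call through the intermediate type is not.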


Re: Hurd port for gcc-7 go PATCH 1-3(15)

2017-11-15 Thread Matthias Klose
On 06.11.2017 16:36, Svante Signell wrote:
> Hi,
> 
> Attached are patches to enable gccgo to build properly on Debian
> GNU/Hurd on gcc-7 (7-7.2.0-12).

sysinfo.go:6744:7: error: redefinition of 'SYS_IOCTL'
 const SYS_IOCTL = _SYS_ioctl
   ^
sysinfo.go:6403:7: note: previous definition of 'SYS_IOCTL' was here
 const SYS_IOCTL = 0
   ^
the patches break the build on any Linux architecture.  Please could you test
your patches against a linux target as well?


[PATCH 4/4] libstdc++: immutable _M_sbuf in istreambuf_iterator

2017-11-15 Thread Petr Ovtchenkov
There is no longer any need for _M_sbuf to be mutable in
istreambuf_iterator.
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 203da9d..e2b6707 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -94,7 +94,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // the "end of stream" iterator value.
   // NB: This implementation assumes the "end of stream" value
   // is EOF, or -1.
-  mutable streambuf_type*  _M_sbuf;
+  streambuf_type* _M_sbuf;
 
 public:
   class proxy
-- 
2.10.1



[PATCH 3/4] libstdc++: avoid character accumulation in istreambuf_iterator

2017-11-15 Thread Petr Ovtchenkov
Ask the associated streambuf for the character when it is needed
instead of caching it in the istreambuf_iterator object.

Benefits:
  - one fewer class member in istreambuf_iterator
  - trivial synchronization of state between istreambuf_iterator
    and the associated streambuf
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 34 --
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 08fb13b..203da9d 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -95,19 +95,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // NB: This implementation assumes the "end of stream" value
   // is EOF, or -1.
   mutable streambuf_type*  _M_sbuf;
-  mutable int_type _M_c;
 
 public:
   class proxy
   {
   friend class istreambuf_iterator;
   private:
-  proxy(int_type c, streambuf_type*sbuf_) :
+  proxy(int_type c, streambuf_type* sbuf_) :
   _M_c(c),
   _M_sbuf(sbuf_)
   { }
   int_type _M_c;
-  streambuf_type*  _M_sbuf;
+  streambuf_type* _M_sbuf;
 
   public:
   char_type
@@ -118,7 +117,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   ///  Construct end of input stream iterator.
   _GLIBCXX_CONSTEXPR istreambuf_iterator() _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(0), _M_c(traits_type::eof()) { }
+  : _M_sbuf(0) { }
 
 #if __cplusplus >= 201103L
   istreambuf_iterator(const istreambuf_iterator&) noexcept = default;
@@ -128,15 +127,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   ///  Construct start of input stream iterator.
   istreambuf_iterator(istream_type& __s) _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(__s.rdbuf()), _M_c(traits_type::eof()) { }
+  : _M_sbuf(__s.rdbuf()) { }
 
   ///  Construct start of streambuf iterator.
   istreambuf_iterator(streambuf_type* __s) _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(__s), _M_c(traits_type::eof()) { }
+  : _M_sbuf(__s) { }
 
   ///  Construct start of istreambuf iterator.
   istreambuf_iterator(const proxy& __p) _GLIBCXX_USE_NOEXCEPT
-  : _M_sbuf(__p._M_sbuf), _M_c(traits_type::eof()) { }
+  : _M_sbuf(__p._M_sbuf) { }
 
   ///  Return the current character pointed to by iterator.  This returns
   ///  streambuf.sgetc().  It cannot be assigned.  NB: The result of
@@ -147,11 +146,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifdef _GLIBCXX_DEBUG_PEDANTIC
// Dereferencing a past-the-end istreambuf_iterator is a
// libstdc++ extension
-   __glibcxx_requires_cond(!_M_at_eof(),
+   int_type __tmp = _M_get();
+   __glibcxx_requires_cond(!traits_type::eq_int_type(__tmp,traits_type::eof()),
_M_message(__gnu_debug::__msg_deref_istreambuf)
._M_iterator(*this));
-#endif
+   return traits_type::to_char_type(__tmp);
+#else
return traits_type::to_char_type(_M_get());
+#endif
   }
 
   /// Advance the iterator.  Calls streambuf.sbumpc().
@@ -172,7 +174,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 #endif
-   _M_c = traits_type::eof();
  }
return *this;
   }
@@ -181,17 +182,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   proxy
   operator++(int)
   {
-_M_get();
-   __glibcxx_requires_cond(_M_sbuf
-   && !traits_type::eq_int_type(_M_c,traits_type::eof()),
+int_type c = _M_get();
+   __glibcxx_requires_cond(!traits_type::eq_int_type(c,traits_type::eof()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 
-   proxy __old(_M_c, _M_sbuf);
+   proxy __old(c, _M_sbuf);
if (_M_sbuf)
  {
_M_sbuf->sbumpc();
-   _M_c = traits_type::eof();
  }
return __old;
   }
@@ -209,9 +208,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_get() const
   {
const int_type __eof = traits_type::eof();
-   if (_M_sbuf && traits_type::eq_int_type(_M_c, __eof))
-  _M_c = _M_sbuf->sgetc();
-   return _M_c;
+   return _M_sbuf ? _M_sbuf->sgetc() : __eof;
   }
 
   bool
@@ -418,7 +415,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  else
__c = __sb->snextc();
}
- __first._M_c = __c;
}
   return __first;
 }
-- 
2.10.1
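
For readers following the _M_get() change in this patch, the idea of asking the streambuf each time instead of caching a character can be sketched with the public streambuf interface alone. This is an illustration under assumptions, not the libstdc++ implementation: peek and drain are hypothetical helpers, and only sgetc() and sbumpc() are real library calls.

```cpp
#include <cassert>
#include <sstream>
#include <streambuf>
#include <string>

using traits = std::char_traits<char>;

// Same shape as the patched _M_get(): ask the streambuf, cache nothing.
traits::int_type peek(std::streambuf* sb)
{
  return sb ? sb->sgetc() : traits::eof();
}

// Read everything currently available, one character at a time.
std::string drain(std::streambuf* sb)
{
  std::string out;
  for (traits::int_type c = peek(sb);
       !traits::eq_int_type(c, traits::eof());
       c = peek(sb))
    {
      out += traits::to_char_type(c);
      sb->sbumpc();                 // advance, as operator++ does
    }
  return out;
}

void peek_does_not_consume()
{
  std::stringbuf buf("xy");
  assert(peek(&buf) == peek(&buf)); // two peeks see the same character
  assert(drain(&buf) == "xy");
  assert(traits::eq_int_type(peek(nullptr), traits::eof()));
}
```

Because sgetc() does not move the read position, nothing has to be kept in sync between the iterator and the streambuf; the cost is a streambuf call on every dereference.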



[PATCH 1/4] Revert "2017-10-04 Petr Ovtchenkov "

2017-11-15 Thread Petr Ovtchenkov
This reverts commit 0dfbafdf338cc6899d146add5161e52efb02c067
(svn r253417).
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 59 ++
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 081afe5..69ee013 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -95,7 +95,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // NB: This implementation assumes the "end of stream" value
   // is EOF, or -1.
   mutable streambuf_type*  _M_sbuf;
-  int_type _M_c;
+  mutable int_type _M_c;
 
 public:
   ///  Construct end of input stream iterator.
@@ -122,29 +122,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   char_type
   operator*() const
   {
-   int_type __c = _M_get();
-
 #ifdef _GLIBCXX_DEBUG_PEDANTIC
// Dereferencing a past-the-end istreambuf_iterator is a
// libstdc++ extension
-   __glibcxx_requires_cond(!_S_is_eof(__c),
+   __glibcxx_requires_cond(!_M_at_eof(),
_M_message(__gnu_debug::__msg_deref_istreambuf)
._M_iterator(*this));
 #endif
-   return traits_type::to_char_type(__c);
+   return traits_type::to_char_type(_M_get());
   }
 
   /// Advance the iterator.  Calls streambuf.sbumpc().
   istreambuf_iterator&
   operator++()
   {
-   __glibcxx_requires_cond(_M_sbuf &&
-   (!_S_is_eof(_M_c) || !_S_is_eof(_M_sbuf->sgetc())),
+   __glibcxx_requires_cond(!_M_at_eof(),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
-
-   _M_sbuf->sbumpc();
-   _M_c = traits_type::eof();
+   if (_M_sbuf)
+ {
+   _M_sbuf->sbumpc();
+   _M_c = traits_type::eof();
+ }
return *this;
   }
 
@@ -152,14 +151,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   istreambuf_iterator
   operator++(int)
   {
-   __glibcxx_requires_cond(_M_sbuf &&
-   (!_S_is_eof(_M_c) || !_S_is_eof(_M_sbuf->sgetc())),
+   __glibcxx_requires_cond(!_M_at_eof(),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 
istreambuf_iterator __old = *this;
-   __old._M_c = _M_sbuf->sbumpc();
-   _M_c = traits_type::eof();
+   if (_M_sbuf)
+ {
+   __old._M_c = _M_sbuf->sbumpc();
+   _M_c = traits_type::eof();
+ }
return __old;
   }
 
@@ -175,21 +176,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   int_type
   _M_get() const
   {
-   int_type __ret = _M_c;
-   if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret = _M_sbuf->sgetc()))
- _M_sbuf = 0;
+   const int_type __eof = traits_type::eof();
+   int_type __ret = __eof;
+   if (_M_sbuf)
+ {
+   if (!traits_type::eq_int_type(_M_c, __eof))
+ __ret = _M_c;
+   else if (!traits_type::eq_int_type((__ret = _M_sbuf->sgetc()),
+  __eof))
+ _M_c = __ret;
+   else
+ _M_sbuf = 0;
+ }
return __ret;
   }
 
   bool
   _M_at_eof() const
-  { return _S_is_eof(_M_get()); }
-
-  static bool
-  _S_is_eof(int_type __c)
   {
const int_type __eof = traits_type::eof();
-   return traits_type::eq_int_type(__c, __eof);
+   return traits_type::eq_int_type(_M_get(), __eof);
   }
 };
 
@@ -367,14 +373,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename __is_iterator_type::traits_type traits_type;
   typedef typename __is_iterator_type::streambuf_type  streambuf_type;
   typedef typename traits_type::int_type   int_type;
-  const int_type __eof = traits_type::eof();
 
   if (__first._M_sbuf && !__last._M_sbuf)
{
  const int_type __ival = traits_type::to_int_type(__val);
  streambuf_type* __sb = __first._M_sbuf;
  int_type __c = __sb->sgetc();
- while (!traits_type::eq_int_type(__c, __eof)
+ while (!traits_type::eq_int_type(__c, traits_type::eof())
 && !traits_type::eq_int_type(__c, __ival))
{
  streamsize __n = __sb->egptr() - __sb->gptr();
@@ -391,9 +396,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__c = __sb->snextc();
}
 
- __first._M_c = __eof;
+ if (!traits_type::eq_int_type(__c, traits_type::eof()))
+   __first._M_c = __c;
+ else
+   __first._M_sbuf = 0;
}
-
   return __first;
 }
 
-- 
2.10.1



[PATCH 2/4] libstdc++: istreambuf_iterator keep attached streambuf

2017-11-15 Thread Petr Ovtchenkov
istreambuf_iterator should not forget about the attached
streambuf when it reaches EOF.

Checks in debug mode no longer influence character
extraction in the istreambuf_iterator increment operators.
In this respect, behaviour in debug and non-debug mode
is similar now.

Test for a detached streambuf in istreambuf_iterator:
when istreambuf_iterator reaches EOF of an istream, it should not
forget about the attached streambuf.
The fact that EOF was reached in the stream does not imply that
the stream has reached the end of its life and that further input
operations are impossible.

Postfix increment (r++) returns a proxy object, because

  copies of the previous value of r are no longer
  required either to be dereferenceable or to be in
  the domain of ==.

i.e. a type that is usable only for dereferencing and extracting
the "previous" character.

istreambuf_iterator should have a constructor from the proxy object,
so the proxy should store a pointer to the streambuf object.
---
 libstdc++-v3/include/bits/streambuf_iterator.h | 67 ++
 .../24_iterators/istreambuf_iterator/3.cc  | 66 +
 2 files changed, 109 insertions(+), 24 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/3.cc

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 69ee013..08fb13b 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -98,6 +98,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   mutable int_type _M_c;
 
 public:
+  class proxy
+  {
+  friend class istreambuf_iterator;
+  private:
+  proxy(int_type c, streambuf_type*sbuf_) :
+  _M_c(c),
+  _M_sbuf(sbuf_)
+  { }
+  int_type _M_c;
+  streambuf_type*  _M_sbuf;
+
+  public:
+  char_type
+  operator*() const
+  { return traits_type::to_char_type(_M_c); }
+  };
+
+public:
   ///  Construct end of input stream iterator.
   _GLIBCXX_CONSTEXPR istreambuf_iterator() _GLIBCXX_USE_NOEXCEPT
   : _M_sbuf(0), _M_c(traits_type::eof()) { }
@@ -116,6 +134,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   istreambuf_iterator(streambuf_type* __s) _GLIBCXX_USE_NOEXCEPT
   : _M_sbuf(__s), _M_c(traits_type::eof()) { }
 
+  ///  Construct start of istreambuf iterator.
+  istreambuf_iterator(const proxy& __p) _GLIBCXX_USE_NOEXCEPT
+  : _M_sbuf(__p._M_sbuf), _M_c(traits_type::eof()) { }
+
   ///  Return the current character pointed to by iterator.  This returns
   ///  streambuf.sgetc().  It cannot be assigned.  NB: The result of
   ///  operator*() on an end of stream is undefined.
@@ -136,29 +158,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   istreambuf_iterator&
   operator++()
   {
-   __glibcxx_requires_cond(!_M_at_eof(),
+   __glibcxx_requires_cond(_M_sbuf,
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
if (_M_sbuf)
  {
+#ifdef _GLIBCXX_DEBUG_PEDANTIC
+   int_type __tmp =
+#endif
_M_sbuf->sbumpc();
+#ifdef _GLIBCXX_DEBUG_PEDANTIC
+   __glibcxx_requires_cond(!traits_type::eq_int_type(__tmp,traits_type::eof()),
+   _M_message(__gnu_debug::__msg_inc_istreambuf)
+   ._M_iterator(*this));
+#endif
_M_c = traits_type::eof();
  }
return *this;
   }
 
   /// Advance the iterator.  Calls streambuf.sbumpc().
-  istreambuf_iterator
+  proxy
   operator++(int)
   {
-   __glibcxx_requires_cond(!_M_at_eof(),
+_M_get();
+   __glibcxx_requires_cond(_M_sbuf
+   && !traits_type::eq_int_type(_M_c,traits_type::eof()),
_M_message(__gnu_debug::__msg_inc_istreambuf)
._M_iterator(*this));
 
-   istreambuf_iterator __old = *this;
+   proxy __old(_M_c, _M_sbuf);
if (_M_sbuf)
  {
-   __old._M_c = _M_sbuf->sbumpc();
+   _M_sbuf->sbumpc();
_M_c = traits_type::eof();
  }
return __old;
@@ -177,18 +209,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_get() const
   {
const int_type __eof = traits_type::eof();
-   int_type __ret = __eof;
-   if (_M_sbuf)
- {
-   if (!traits_type::eq_int_type(_M_c, __eof))
- __ret = _M_c;
-   else if (!traits_type::eq_int_type((__ret = _M_sbuf->sgetc()),
-  __eof))
- _M_c = __ret;
-   else
- _M_sbuf = 0;
- }
-   return __ret;
+   if (_M_sbuf && traits_type::eq_int_type(_M_c, __eof))
+  _M_c = _M_sbuf->sgetc();
+   return _M_c;
   }
 
   bool
@@ -339,7 +362,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename __is_it
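
The claim in this patch's description, that reaching EOF does not mean the stream has reached the end of its life, can be illustrated with standard components only. The sketch below is not the 3.cc test from the patch (which is not shown here); read_available and eof_is_not_end_of_life are hypothetical names, and a fresh iterator is constructed for the second read so that the example does not depend on whether an old iterator stays attached.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Read whatever the streambuf can deliver right now.
std::string read_available(std::streambuf* sb)
{
  std::istreambuf_iterator<char> it(sb), end;
  std::string out;
  for (; it != end; ++it)
    out += *it;
  return out;
}

void eof_is_not_end_of_life()
{
  std::stringstream ss;   // default-constructed: reads and writes share one buffer
  ss << "ab";
  assert(read_available(ss.rdbuf()) == "ab");   // runs until EOF

  ss << "cd";                                   // more input arrives later
  assert(read_available(ss.rdbuf()) == "cd");   // the streambuf is usable again
}
```

The patch is about the stronger property that an existing iterator keeps its streambuf pointer after seeing EOF; the sketch only shows why that can matter, namely that the underlying buffer may become readable again.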

Re: Hurd port for gcc-7 go PATCH 1-3(15)

2017-11-15 Thread Svante Signell
On Wed, 2017-11-15 at 21:40 +0100, Matthias Klose wrote:
> On 06.11.2017 16:36, Svante Signell wrote:
> > Hi,
> > 
> > Attached are patches to enable gccgo to build properly on Debian
> > GNU/Hurd on gcc-7 (7-7.2.0-12).
> 
> sysinfo.go:6744:7: error: redefinition of 'SYS_IOCTL'
>  const SYS_IOCTL = _SYS_ioctl
>    ^
> sysinfo.go:6403:7: note: previous definition of 'SYS_IOCTL' was here
>  const SYS_IOCTL = 0
>    ^
> the patches break the build on any Linux architecture.  Please could you test
> your patches against a linux target as well?

I'm really sorry. I regularly do that, but missed this one for gcc-7. Do you
mean the patches against gcc-8 you asked me for? You wrote that gcc-7 is not of
interest and I should concentrate on gcc-8.

Again, I'm really sorry. Will fix this tomorrow, hopefully.

Thanks!



Re: lambda-switch regression

2017-11-15 Thread Martin Sebor

On 11/15/2017 10:38 AM, David Malcolm wrote:

On Wed, 2017-11-15 at 12:25 -0500, David Malcolm wrote:

On Wed, 2017-11-15 at 12:06 -0500, David Malcolm wrote:

On Wed, 2017-11-15 at 08:03 -0500, Nathan Sidwell wrote:

g++.dg/lambda/lambda-switch.C Has recently regressed.


g++.dg/cpp0x/lambda/lambda-switch.C


It appears the
location of a warning message has moved.

  l = []()  // { dg-warning "statement will never be executed" }
{
case 3: // { dg-error "case" }
  break;// { dg-error "break" }
};  <--- warning now here

We seem to be diagnosing the last line of the statement, not the
first.  That does not seem useful.

I've not investigated what patch may have caused this, on the chance
someone might already know?

nathan


The warning was added in r236597 (aka
1398da0f786e120bb0b407e84f412aa9fc6d80ee):

+2016-05-23  Marek Polacek  
+
+   PR c/49859
+   * common.opt (Wswitch-unreachable): New option.
+   * doc/invoke.texi: Document -Wswitch-unreachable.
+   * gimplify.c (gimplify_switch_expr): Implement the -Wswitch-unreachable
+   warning.

which had it there (23:7).

r244705 (aka 3ef7eab185e1463c7dbfa2a8d1af5d0120cf9f76) moved the
warning from 23:7 up to the "[] ()" at 19:6 in:

+2017-01-20  Marek Polacek  
+
+   PR c/64279
[...snip...]
+   * g++.dg/cpp0x/lambda/lambda-switch.C: Move dg-warning.

I tried it with some working copies I have to hand:
- works for me with r254387 (2017-11-03)
- fails for me with r254700 (2017-11-13)

so hopefully that helps track it down.

Dave


Searching in the November archives of the gcc-regression ML for
"lambda-switch.c":

https://gcc.gnu.org/cgi-bin/search.cgi?wm=wrd&form=extended&m=all&s=D&q=lambda-switch.c&ul=%2Fml%2Fgcc-regression%2F2017-11%2F%25

showed e.g.:
  https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00173.html
   "Regressions on trunk at revision 254648 vs revision 254623"

which says this is a new failure somewhere in that range; so it
presumably happened sometime on 2017-11-10 after r254623 and up to
(maybe ==) r254648.

Looking at:
   svn log -r r254623:r254648 |less
nothing jumps out at me as being related.

Hope this is helpful
Dave


Actually, https://gcc.gnu.org/ml/gcc-regression/2017-11/msg00157.html
has a tighter range: r254628 vs r254635.

Looking at:
  svn log -r r254628:r254635 |less
I see msebor's r254630 ("PR c/81117 - Improve buffer overflow checking
in strncpy") has:

* gimple.c (gimple_build_call_from_tree): Set call location.

with:
+  gimple_set_location (call, EXPR_LOCATION (t));

Maybe that's it?  (nothing else in that commit range seems to affect
locations).


Yes, that's it.  Before the change there would be no location
associated with a GIMPLE call seen in gimple-fold.  The location
would only get added later, after folding.

The purpose of the lambda-switch.C test is to verify GCC doesn't
ICE on the ill-formed code.  The warning is incidental to the test
case so I've adjusted it to filter it out.

Martin



[PATCH] fix -mnop-mcount generate 5byte nop in 32bit.

2017-11-15 Thread 박한범
"-mnop-mcount" needs to emit a 5-byte "nop" instruction.
However, gcc currently emits only a 4-byte "nop" in 32-bit mode.
I have tested this with gcc 5.4 and 7.2.


===
bug result
===
080485c5 :
 80485c5:   0f 1f 04 00             nopl   (%eax,%eax,1)
 80485c9:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80485cd:   83 e4 f0                and    $0xfffffff0,%esp

===
fixed result
===
08048598 :
 8048598:   0f 1f 44 00 01          nopl   0x1(%eax,%eax,1)
 804859d:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80485a1:   83 e4 f0                and    $0xfffffff0,%esp


Is it OK?


===
Index : gcc/config/i386/i386.c
===
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c6ca071..e574de3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -40474,7 +40474,7 @@ static void
 x86_print_call_or_nop (FILE *file, const char *target)
 {
   if (flag_nop_mcount)
-fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
+fprintf (file, "1:\tnopl 0x01(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
   else
 fprintf (file, "1:\tcall\t%s\n", target);
 }
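
The size difference between the two listings above comes down to one displacement byte. The sketch below only restates the byte sequences from the listings and compares their lengths with that of a near call; it assumes, as the option name suggests, that the nop is meant to be overwritten later by a 5-byte "call rel32". The names nop4, nop5 and can_be_patched_with_call are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Byte sequences copied from the two disassembly listings above.
// 0x04: ModRM with mod=00, SIB byte follows, no displacement.
constexpr unsigned char nop4[] = { 0x0f, 0x1f, 0x04, 0x00 };
// 0x44: ModRM with mod=01, SIB byte follows, then an 8-bit displacement.
constexpr unsigned char nop5[] = { 0x0f, 0x1f, 0x44, 0x00, 0x01 };

// A near call is opcode 0xe8 plus a 32-bit relative displacement.
constexpr std::size_t call_rel32_size = 1 + 4;

// Assumption: the nop can be patched in place only if it has the call's size.
constexpr bool can_be_patched_with_call(std::size_t nop_size)
{
  return nop_size == call_rel32_size;
}

void size_check_demo()
{
  assert(!can_be_patched_with_call(sizeof nop4));
  assert(can_be_patched_with_call(sizeof nop5));
}
```

Presumably the assembler encodes a zero displacement as "no displacement", which is how the 0x00 form ends up one byte short; a non-zero displacement such as 0x01 cannot be dropped.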


Re: [PATCH 3/4] libstdc++: avoid character accumulation in istreambuf_iterator

2017-11-15 Thread Paolo Carlini

Hi,

On 15/11/2017 11:48, Petr Ovtchenkov wrote:

Ask the associated streambuf for the character when it is needed
instead of caching it in the istreambuf_iterator object.

Benefits:
   - one fewer class member in istreambuf_iterator
   - trivial synchronization of state between istreambuf_iterator
     and the associated streambuf
---
  libstdc++-v3/include/bits/streambuf_iterator.h | 34 --
  1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index 08fb13b..203da9d 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -95,19 +95,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// NB: This implementation assumes the "end of stream" value
// is EOF, or -1.
mutable streambuf_type* _M_sbuf;
-  mutable int_type _M_c;

Obviously this would be an ABI-breaking change, which we certainly don't
want. Unless I missed a detailed discussion of the non-trivial way to
avoid it in one of the recent threads about these topics...


Paolo.


Re: [PATCH #2], make Float128 built-in functions work with -mabi=ieeelongdouble

2017-11-15 Thread Michael Meissner
David tells me that the patch to enable float128 built-in functions to work
with the -mabi=ieeelongdouble option broke AIX because on AIX, the float128
insns are disabled, and they all become CODE_FOR_nothing.  The switch statement
that was added in rs6000.c to map KFmode built-in functions to TFmode breaks
under AIX.

I changed the code to use a separate table, which I build on the first call.
If the insn was not generated, it will just be CODE_FOR_nothing, and
the KF->TF mode conversion will not be done.
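
The table approach described above can be sketched in isolation. This is a simplified illustration, not the rs6000.c code: fake_insn_code and its enumerators are hypothetical stand-ins for GCC's insn_code values, and the final lookup step is an assumption, since that part of the patch is not shown in this message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for insn_code; 0 plays the role of CODE_FOR_nothing.
enum fake_insn_code { NOTHING = 0, SQRT_KF, ADD_KF, SQRT_TF, ADD_TF, NUM_CODES };

fake_insn_code remap(fake_insn_code icode)
{
  struct pair { fake_insn_code from, to; };
  static const pair pairs[] = { { SQRT_KF, SQRT_TF }, { ADD_KF, ADD_TF } };

  // Zero-initialized, so every slot starts out as NOTHING.
  static fake_insn_code table[NUM_CODES];
  static bool first_time = true;

  if (first_time)
    {
      // Build the table once, on the first call.
      first_time = false;
      for (std::size_t i = 0; i < sizeof pairs / sizeof pairs[0]; i++)
        table[pairs[i].from] = pairs[i].to;
    }

  // Assumed lookup: an empty slot means "leave the code unchanged".
  return table[icode] != NOTHING ? table[icode] : icode;
}
```

Unlike case labels in a switch, table indices do not have to be distinct, so several "from" codes collapsing to the same value (all of them being CODE_FOR_nothing, as described for AIX) just write the same slot.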

I have tested this on a little endian power8 system and there were no
regressions.  Once David verifies that it builds on AIX, can I check this into
the trunk?

2017-11-15  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_expand_builtin): Do not use a
switch to map KFmode built-in functions to TFmode.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 254782)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -16786,27 +16786,45 @@ rs6000_expand_builtin (tree exp, rtx tar
  double (KFmode) or long double is IEEE 128-bit (TFmode).  It is simpler if
  we only define one variant of the built-in function, and switch the code
  when defining it, rather than defining two built-ins and using the
- overload table in rs6000-c.c to switch between the two.  */
+ overload table in rs6000-c.c to switch between the two.  On some systems
+ like AIX, the KF/TF mode insns are not generated, and they return
+ CODE_FOR_nothing.  */
   if (FLOAT128_IEEE_P (TFmode))
-switch (icode)
-  {
-  default:
-   break;
+{
+  struct map_f128 {
+   enum insn_code from;  /* KFmode insn code that is in the tables.  */
+   enum insn_code to;  /* TFmode insn code to use instead.  */
+  };
+
+  static enum insn_code map_insn_code[NUM_INSN_CODES];
+  static bool first_time = true;
+  const static struct map_f128 map[] = {
+   { CODE_FOR_sqrtkf2_odd, CODE_FOR_sqrttf2_odd },
+   { CODE_FOR_trunckfdf2_odd,  CODE_FOR_trunctfdf2_odd },
+   { CODE_FOR_addkf3_odd,  CODE_FOR_addtf3_odd },
+   { CODE_FOR_subkf3_odd,  CODE_FOR_subtf3_odd },
+   { CODE_FOR_mulkf3_odd,  CODE_FOR_multf3_odd },
+   { CODE_FOR_divkf3_odd,  CODE_FOR_divtf3_odd },
+   { CODE_FOR_fmakf4_odd,  CODE_FOR_fmatf4_odd },
+   { CODE_FOR_xsxexpqp_kf, CODE_FOR_xsxexpqp_tf },
+   { CODE_FOR_xsxsigqp_kf, CODE_FOR_xsxsigqp_tf },
+   { CODE_FOR_xststdcnegqp_kf, CODE_FOR_xststdcnegqp_tf },
+   { CODE_FOR_xsiexpqp_kf, CODE_FOR_xsiexpqp_tf },
+   { CODE_FOR_xsiexpqpf_kf,CODE_FOR_xsiexpqpf_tf },
+   { CODE_FOR_xststdcqp_kf,CODE_FOR_xststdcqp_tf },
+  };
+
+  if (first_time)
+   {
+ first_time = false;
+ gcc_assert ((int)CODE_FOR_nothing == 0);
+ for (i = 0; i < ARRAY_SIZE (map); i++)
+   map_insn_code[(int)map[i].from] = map[i].to;
+   }
 
-  case CODE_FOR_sqrtkf2_odd:   icode = CODE_FOR_sqrttf2_odd;   break;
-  case CODE_FOR_trunckfdf2_odd:icode = CODE_FOR_trunctfdf2_odd; break;
-  case CODE_FOR_addkf3_odd:icode = CODE_FOR_addtf3_odd; break;
-  case CODE_FOR_subkf3_odd:icode = CODE_FOR_subtf3_odd; break;
-  case CODE_FOR_mulkf3_odd:icode = CODE_FOR_multf3_odd; break;
-  case CODE_FOR_divkf3_odd:icode = CODE_FOR_divtf3_odd; break;
-  case CODE_FOR_fmakf4_odd:icode = CODE_FOR_fmatf4_odd; break;
-  case CODE_FOR_xsxexpqp_kf:   icode = CODE_FOR_xsxexpqp_tf;   break;
-  case CODE_FOR_xsxsigqp_kf:   icode = CODE_FOR_xsxsigqp_tf;   break;
-  case CODE_FOR_xststdcnegqp_kf:   icode = CODE_FOR_xststdcnegqp_tf; break;
-  case CODE_FOR_xsiexpqp_kf:   icode = CODE_FOR_xsiexpqp_tf;   break;
-  case CODE_FOR_xsiexpqpf_kf:  icode = CODE_FOR_xsiexpqpf_tf;  break;
-  case CODE_FOR_xststdcqp_kf:  icode = CODE_FOR_xststdcqp_tf;  break;
-  }
+  if (map_insn_code[(int)icode] != CODE_FOR_nothing)
+   icode = map_insn_code[(int)icode];
+}
 
   if (TARGET_DEBUG_BUILTIN)
 {


Re: [PATCH] fix -mnop-mcount generate 5byte nop in 32bit.

2017-11-15 Thread Uros Bizjak
Hello!

> "-mnop-mcount" needs to make 5byte size "nop" instruction.
> however recently gcc make only 4byte "nop" in 32bit.
> I have test in gcc 5.4, 7.2.

-fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
+fprintf (file, "1:\tnopl 0x01(%%eax,%%eax,1)\n"); /* 5 byte nop.  */

Even the above change is not correct, since it will be assembled in a
different way on 32 bit and 64 bit targets (size prefix will be added
on 64 bit targets). Attached patch fixes this issue by emitting a
stream of bytes.
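
As a rough illustration of the byte-stream idea, the sketch below formats the
same five bytes into a buffer instead of writing them to the assembler output
file. It assumes the target's ASM_BYTE macro expands to "\t.byte\t";
`SKETCH_ASM_BYTE` and `format_nop5` are hypothetical names used only here.
Because the assembler sees literal bytes, it cannot choose a shorter encoding
or add a prefix.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: stands in for the target's ASM_BYTE macro.  */
#define SKETCH_ASM_BYTE "\t.byte\t"

/* Format a label followed by the 5 byte nop
   (nopl 0(%[re]ax,%[re]ax,1)) as raw bytes into BUF.
   Returns the length of the formatted line.  */
static int
format_nop5 (char *buf, size_t size)
{
  int n = snprintf (buf, size,
                    "1:" SKETCH_ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
  assert (n >= 0);
  return n;
}
```

The resulting line reaches the assembler as a plain .byte directive, so the
emitted size is five bytes regardless of whether the target is 32 bit or
64 bit.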

2017-11-15  Uros Bizjak  

* config/i386/i386.c (x86_print_call_or_nop): Emit 5 byte nop
explicitly as a stream of bytes.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline, will be committed to release branches.

Uros.
Index: i386.c
===
--- i386.c  (revision 254773)
+++ i386.c  (working copy)
@@ -40473,7 +40473,8 @@ static void
 x86_print_call_or_nop (FILE *file, const char *target)
 {
   if (flag_nop_mcount)
-fprintf (file, "1:\tnopl 0x00(%%eax,%%eax,1)\n"); /* 5 byte nop.  */
+/* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1) */
+fprintf (file, "1:" ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
   else
 fprintf (file, "1:\tcall\t%s\n", target);
 }

