Re: [PATCH] make ggc pick up comp_dir_string() cache value. (dwarf2out.c)

2019-03-21 Thread Richard Biener
On Wed, Mar 20, 2019 at 7:36 PM Otto, Thomas  wrote:
>
> > > > > "ggc_collect() discarding/reusing remap_debug_filename() output,
> > > > > thus producing invalid objects"
> > > >
> > > > Hmm, but AFAICS it can end up on the heap if plain get_src_pwd ()
> > > > result survives.  I vaguely remember GC being happy with heap
> > > > strings (due to identifiers?), but not sure.  Otherwise the patch looks
> > > > obvious enough.
> > >
> > > Good point, remap_filename can return the original filename pointer
> > > (which can point to env["PWD"] or to an allocated string) or a ggc string.
> > >
> > > Since not freeing the string pointed to by a static variable is ok (as
> > > getpwd does it), what about checking if remap_filename just returns
> > > the input filename ptr again, and if not copy the ggc string into an 
> > > allocated
> > > buffer.
> >
> > I think we always want GC allocated storage here since the whole dwarf2out
> > data structures live in GC memory.  That means unconditionally copying there
> > as is done inside the DWARF2_DIR_SHOULD_END_WITH_SEPARATOR path.
>
> Explicitly requesting gc storage (A or B below) doesn't seem to work. The 
> mapped
> wd string is not modified this time but is placed in a very different part of 
> the
> assembly, in .section .debug_str.dwo, and after assembling it does not even 
> make it
> into object, not even in a corrupted form as previously.

With the cache variable moved to file scope and marked GTY(())?  That
sounds odd.

The point is that the garbage collector shouldn't end up at heap
allocated storage
when walking GC allocated data and I see the DWARF attribute eventually using
the result of this function has the char * pointer GTY(()) marked.

> Placing the static cached_wd on function or file level makes no difference, 
> only the
> GTY(()) marker fixes it again. Or of course allocating a buffer myself (C).
>
> And while DWARF2_DIR_SHOULD_END_WITH_SEPARATOR is only true on VMS, if I
> enter this if block and do not specify -fdebug-prefix-map then the final 
> object does
> not even contain the working directory, probably because it is stored in
> the ggc_vec_alloc data. Tested with 8.3 built with 4.9 and now also 8.3 
> itself. Maybe
> gc and static storage do not play well together.
>
> That leaves classic allocation of memory and pointing the cached_wd there. It
> really shouldn't matter since it is just a cache of an otherwise side effect 
> free
> function return value. It would outlive all of its callers anyhow, gc or not.
>
>
>   static const char *cached_wd = NULL;
>   // [...]
>   cached_wd = remap_debug_filename (wd);
>   if (cached_wd != wd)
> {
>   size_t cached_wd_len = strlen (cached_wd);
> A:char *buf = ggc_vec_alloc (cached_wd_len);
> B:char *buf = (char*)ggc_alloc_atomic (cached_wd_len);
> C:char *buf = (char*)xmalloc (cached_wd_len);
>   memcpy (buf, cached_wd, cached_wd_len);
>   cached_wd = buf;
> }
>


Re: [PATCH] Allow lazy construction of hash_{table,set,map}

2019-03-21 Thread Richard Biener
On Wed, 20 Mar 2019, Jakub Jelinek wrote:

> On Wed, Mar 20, 2019 at 05:32:16PM -0400, Jason Merrill wrote:
> > > Does this look reasonable, or do you have other proposals?
> > 
> > IMO if you need to guard usage with
> > 
> > +  if (some_type_hash_table.size () == 0)
> > +   some_type_hash_table.create ();
> > 
> > this isn't any better than
> > 
> >   /* Create the constexpr function table if necessary.  */
> >   if (constexpr_fundef_table == NULL)
> > constexpr_fundef_table
> >   = hash_table::create_ggc (101);

I agree - the proposed API is a bit too fragile to be maintainable
IMHO.

> Well, this is surely more costly in that it allocates both the hash_table
> structure and the memory pointed by it from GC.

You could use

  char constexpr_fundef_table[sizeof(hash-table)];
  bool constexpr_fundef_table_initialized_p = false;

  if (!constexpr_fundef_table_initialized_p)
new (constexpr_fundef_table) hash-table (...);

and hide this pattern behind a new RAII class?

> > Better I think would be to make the member functions handle null m_entries
> > sensibly.
> 
> The goal was not to slow down all current hash_{table,set,map} uses by
> adding runtime checks if m_size is 0 or m_entries is NULL or similar
> (and also grow code size that way).

Yeah, I also think this isn't the very best thing to do.

> I guess another option would be to make the decision whether
> the hash_{table,set,map} is constructed with allocation right away (and no
> runtime checks for it) or if it is lazy another template argument (bool
> Lazy = false before the Allocator template argument), if !Lazy, it would
> work exactly as it is now, if Lazy it would not allocate it in the ctor
> and where needed would add m_entries checks.

That's a possibility, but what about the above RAII trick?  Not sure
if there's a way to avoid the char[] use and instead use "unconstructed"
properly typed storage.

Richard.


Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Richard Biener
On Thu, 21 Mar 2019, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, if e.g. build_aligned_type creates some specially
> aligned variant of some TYPE_MAIN_VARIANT, that type doesn't appear in the
> IL during free_lang_data and later on we call build_aligned_type with the
> same arguments as before, we'll get the previously created aligned variant
> which has not been free_lang_data processed and during LTO streaming we ICE
> because of that.
> 
> The following patch prunes unmarked types from the type variant chain, so
> that we don't find them later on and instead create new type variants from
> the free_lang_data processed types.
> 
> Another change is just to be sure, marking TYPE_CANONICAL if it is not
> already marked, so that we don't get weird behavior if we'd remove it.
> In the usual case TYPE_CANONICAL will be the TYPE_MAIN_VARIANT type we mark
> already anyway.
> 
> The rest of the changes is to make sure the new types we create during
> free_lang_data get marked in the pset, so that we will not try to purge
> them.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

LGTM - any reason you only sometimes conditionalize add_tree_to_fld_list
on the fld->pset.add () result?  IMHO consistency should win over
micro-optimizing the case you know the type will not be in the set.

Thanks,
Richard.

> 2019-03-20  Jan Hubicka  
>   Jakub Jelinek  
> 
>   PR lto/89692
>   * tree.c (fld_type_variant): Call fld->pset.add.
>   (fld_incomplete_type_of): Likewise and don't call add_tree_to_fld_list
>   if it returns true.
>   (fld_process_array_type): Likewise.
>   (free_lang_data_in_type): Purge non-marked types from TYPE_NEXT_VARIANT
>   list.
>   (find_decls_types_r): Call fld_worklist_push for TYPE_CANONICAL (t).
> 
>   * g++.dg/other/pr89692.C: New test.
> 
> --- gcc/tree.c.jj 2019-03-20 12:24:56.44338 +0100
> +++ gcc/tree.c2019-03-20 20:16:12.277091827 +0100
> @@ -5215,6 +5215,7 @@ fld_type_variant (tree first, tree t, st
>if (inner_type)
>  TREE_TYPE (v) = inner_type;
>gcc_checking_assert (fld_type_variant_equal_p (t,v, inner_type));
> +  fld->pset.add (v);
>add_tree_to_fld_list (v, fld);
>return v;
>  }
> @@ -5253,7 +5254,8 @@ fld_process_array_type (tree t, tree t2,
>array = build_array_type_1 (t2, TYPE_DOMAIN (t),
> TYPE_TYPELESS_STORAGE (t), false);
>TYPE_CANONICAL (array) = TYPE_CANONICAL (t);
> -  add_tree_to_fld_list (array, fld);
> +  if (!fld->pset.add (array))
> + add_tree_to_fld_list (array, fld);
>  }
>return array;
>  }
> @@ -5298,7 +5300,8 @@ fld_incomplete_type_of (tree t, struct f
>   TYPE_REF_CAN_ALIAS_ALL (t));
> gcc_assert (TYPE_CANONICAL (t2) != t2
> && TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
> -   add_tree_to_fld_list (first, fld);
> +   if (!fld->pset.add (first))
> + add_tree_to_fld_list (first, fld);
> return fld_type_variant (first, t, fld);
>   }
>return t;
> @@ -5321,6 +5324,7 @@ fld_incomplete_type_of (tree t, struct f
> copy = build_distinct_type_copy (t);
>  
> /* It is possible that type was not seen by free_lang_data yet.  */
> +   fld->pset.add (copy);
> add_tree_to_fld_list (copy, fld);
> TYPE_SIZE (copy) = NULL;
> TYPE_USER_ALIGN (copy) = 0;
> @@ -5445,6 +5449,18 @@ free_lang_data_in_type (tree type, struc
>  
>TYPE_NEEDS_CONSTRUCTING (type) = 0;
>  
> +  /* Purge non-marked variants from the variants chain, so that they
> + don't reappear in the IL after free_lang_data.  */
> +  while (TYPE_NEXT_VARIANT (type)
> +  && !fld->pset.contains (TYPE_NEXT_VARIANT (type)))
> +{
> +  tree t = TYPE_NEXT_VARIANT (type);
> +  TYPE_NEXT_VARIANT (type) = TYPE_NEXT_VARIANT (t);
> +  /* Turn the removed types into distinct types.  */
> +  TYPE_MAIN_VARIANT (t) = t;
> +  TYPE_NEXT_VARIANT (t) = NULL_TREE;
> +}
> +
>if (TREE_CODE (type) == FUNCTION_TYPE)
>  {
>TREE_TYPE (type) = fld_simplified_type (TREE_TYPE (type), fld);
> @@ -5464,6 +5480,7 @@ free_lang_data_in_type (tree type, struc
> & ~TYPE_QUAL_CONST
> & ~TYPE_QUAL_VOLATILE;
> TREE_VALUE (p) = build_qualified_type (arg_type, quals);
> +   fld->pset.add (TREE_VALUE (p));
> free_lang_data_in_type (TREE_VALUE (p), fld);
>   }
> /* C++ FE uses TREE_PURPOSE to store initial values.  */
> @@ -5886,8 +5903,7 @@ find_decls_types_r (tree *tp, int *ws, v
>   ctx = BLOCK_SUPERCONTEXT (ctx);
> fld_worklist_push (ctx, fld);
>   }
> -  /* Do not walk TYPE_CANONICAL.  We do not stream it and thus do not
> -  and want not to reach unused types this way.  */
> +  fld_worklist_push (TYPE_CANONICAL (t), fld);
>  
>if (RE

Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 09:35:07AM +0100, Richard Biener wrote:
> LGTM - any reason you only sometimes conditionalize add_tree_to_fld_list
> on the fld->pset.add () result?  IMHO consistency should win over
> micro-optimizing the case you know the type will not be in the set.

Yeah, it was a optimization for the most common case, when
find_decls_types_r calls add_tree_to_fld_list (t, fld); for DECL_P as well
as TYPE_P, because in that case the fld->pset.add check has been done
already by walk_tree.  If we have hundreds thousands of elts in the
hash_set, it could be more than a microoptimization, but if you think
strongly that it should be all thrown away and add_tree_to_fld_list should
start with if (fld->pset.add (t)) return; then I can do that too.

> > 2019-03-20  Jan Hubicka  
> > Jakub Jelinek  
> > 
> > PR lto/89692
> > * tree.c (fld_type_variant): Call fld->pset.add.
> > (fld_incomplete_type_of): Likewise and don't call add_tree_to_fld_list
> > if it returns true.
> > (fld_process_array_type): Likewise.
> > (free_lang_data_in_type): Purge non-marked types from TYPE_NEXT_VARIANT
> > list.
> > (find_decls_types_r): Call fld_worklist_push for TYPE_CANONICAL (t).
> > 
> > * g++.dg/other/pr89692.C: New test.

Jakub


Re: [PATCH] Allow lazy construction of hash_{table,set,map}

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 09:28:06AM +0100, Richard Biener wrote:
> > Well, this is surely more costly in that it allocates both the hash_table
> > structure and the memory pointed by it from GC.
> 
> You could use
> 
>   char constexpr_fundef_table[sizeof(hash-table)];
>   bool constexpr_fundef_table_initialized_p = false;
> 
>   if (!constexpr_fundef_table_initialized_p)
> new (constexpr_fundef_table) hash-table (...);
> 
> and hide this pattern behind a new RAII class?

Do we have a portable C++98 way to make sure the buffer has proper alignment
though?  Plus we'd need to pass around this auto_hash_set instead of
hash_set, so from the usage POV it would be like the following patches
where the functionality is embedded in hash_set instantiations itself with a
special new template parameter.

> > I guess another option would be to make the decision whether
> > the hash_{table,set,map} is constructed with allocation right away (and no
> > runtime checks for it) or if it is lazy another template argument (bool
> > Lazy = false before the Allocator template argument), if !Lazy, it would
> > work exactly as it is now, if Lazy it would not allocate it in the ctor
> > and where needed would add m_entries checks.
> 
> That's a possibility, but what about the above RAII trick?  Not sure
> if there's a way to avoid the char[] use and instead use "unconstructed"
> properly typed storage.

Attached (so far without changelog, selftest additions and only tested with
make -j32 -k check-c++-all RUNTESTFLAGS="dg.exp='pr89767.C *lambda* *desig*'"
) is
1) hash_{table,set}  implementation for lazy allocation
2) the PR c++/89767 patch with optimization for zero and one entries
3) incremental PR c++/71446 fix to simplify that code using this new
   infrastructure

For 2), because of the 0 and 1 entry optimization, we still need to test
ids->elements () == 0 and special case that case, but a version that would
optimize solely zero captures case could go without such checks, just
if (ids->add (name)) and no other ids->*.  And in 3), while the addition
doesn't use a condition, I use pset.elements () check as an optimization to
avoid calling the recursive function in the most common case where there are
no designators, pset.elements () != 0 check is just a comparison of a
subtraction of two fields against 0.

Jakub
--- gcc/hash-table.h.jj 2019-03-14 23:46:28.593616750 +0100
+++ gcc/hash-table.h2019-03-21 09:27:19.722074003 +0100
@@ -167,6 +167,15 @@ along with GCC; see the file COPYING3.
See hash_table for details.  The interface is very similar to libiberty's
htab_t.
 
+   If a hash table is used only in some rare cases, it is possible
+   to construct the hash_table lazily before first use.  This is done
+   through:
+
+  hash_table  some_type_hash_table;
+
+   which will cause whatever methods actually need the allocated entries
+   array to allocate it later.
+
 
EASY DESCRIPTORS FOR POINTERS
 
@@ -241,7 +250,7 @@ along with GCC; see the file COPYING3.
 #include "hash-map-traits.h"
 
 template class hash_map;
-template class hash_set;
+template class hash_set;
 
 /* The ordinary memory allocator.  */
 /* FIXME (crowl): This allocator may be extracted for wider sharing later.  */
@@ -353,8 +362,8 @@ class mem_usage;
  hash table code.
 
 */
-template  class Allocator = xcallocator>
+template  class Allocator = xcallocator>
 class hash_table
 {
   typedef typename Descriptor::value_type value_type;
@@ -422,7 +431,7 @@ public:
  write the value you want into the returned slot.  When inserting an
  entry, NULL may be returned if memory allocation fails. */
   value_type *find_slot_with_hash (const compare_type &comparable,
-   hashval_t hash, enum insert_option insert);
+  hashval_t hash, enum insert_option insert);
 
   /* This function deletes an element with the given COMPARABLE value
  from hash table starting with the given HASH.  If there is no
@@ -472,6 +481,8 @@ public:
 
   iterator begin () const
 {
+  if (Lazy && m_entries == NULL)
+   return iterator ();
   iterator iter (m_entries, m_entries + m_size);
   iter.slide ();
   return iter;
@@ -491,9 +502,8 @@ private:
 hashtab_entry_note_pointers (void *, void *, gt_pointer_operator, void *);
   template friend void
   gt_pch_nx (hash_map *, gt_pointer_operator, void *);
-  template friend void gt_pch_nx (hash_set *,
- gt_pointer_operator,
- void *);
+  template
+  friend void gt_pch_nx (hash_set *, gt_pointer_operator, void *);
   template friend void gt_pch_nx (hash_table *,
  gt_pointer_operator, void *);
 
@@ -566,11 +576,12 @@ extern mem_alloc_description&
 /* Support function for statistics.  */
 extern void dump_hash_table_loc_statistics (void);
 
-template class Allocator>
-

Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Richard Biener
On Thu, 21 Mar 2019, Jakub Jelinek wrote:

> On Thu, Mar 21, 2019 at 09:35:07AM +0100, Richard Biener wrote:
> > LGTM - any reason you only sometimes conditionalize add_tree_to_fld_list
> > on the fld->pset.add () result?  IMHO consistency should win over
> > micro-optimizing the case you know the type will not be in the set.
> 
> Yeah, it was a optimization for the most common case, when
> find_decls_types_r calls add_tree_to_fld_list (t, fld); for DECL_P as well
> as TYPE_P, because in that case the fld->pset.add check has been done
> already by walk_tree.  If we have hundreds thousands of elts in the
> hash_set, it could be more than a microoptimization, but if you think
> strongly that it should be all thrown away and add_tree_to_fld_list should
> start with if (fld->pset.add (t)) return; then I can do that too.

Yes, I think that would be better.

Richard.


Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 10:11:00AM +0100, Richard Biener wrote:
> On Thu, 21 Mar 2019, Jakub Jelinek wrote:
> 
> > On Thu, Mar 21, 2019 at 09:35:07AM +0100, Richard Biener wrote:
> > > LGTM - any reason you only sometimes conditionalize add_tree_to_fld_list
> > > on the fld->pset.add () result?  IMHO consistency should win over
> > > micro-optimizing the case you know the type will not be in the set.
> > 
> > Yeah, it was a optimization for the most common case, when
> > find_decls_types_r calls add_tree_to_fld_list (t, fld); for DECL_P as well
> > as TYPE_P, because in that case the fld->pset.add check has been done
> > already by walk_tree.  If we have hundreds thousands of elts in the
> > hash_set, it could be more than a microoptimization, but if you think
> > strongly that it should be all thrown away and add_tree_to_fld_list should
> > start with if (fld->pset.add (t)) return; then I can do that too.
> 
> Yes, I think that would be better.

Actually sorry, that is not going to work, because in the two
find_decls_types_r cases it is already set in the pset by walk_tree and thus
fld->pset.add (t) will return true because it is the second addition.
And just doing fld->pset.add (t) and not testing the result is undesirable
too for the other cases; in the patch there was some set of cases where I
was sure that the type isn't already in the pset (e.g. immediately after
build_variant_type_copy or build_distinct_type_copy), but in several other
cases it is unclear if it is already in or not and I think we don't want to
add it again in those cases.

Would you prefer to have say add_tree_to_fld_list that does what it does now
and has just 2 callers in find_decls_types_r and a new
maybe_add_tree_to_fld_list that would do
  if (!fld->pset.add (t))
add_tree_to_fld_list (t, fld);
and change all the other add_tree_to_fld_list callers?

Jakub


[PATCH][GCC][AArch64] Vectorise __builtin_signbit on aarch64

2019-03-21 Thread Przemyslaw Wirkus
Hi all,

Vectorise __builtin_signbit (v4sf) with unsigned shift right vector
instruction.

Bootstrapped and tested on aarch64-none-linux-gnu.

Assembly output for:
$ aarch64-elf-gcc -S -O3 signbitv4sf.c -dp

Before patch:

foo:
adrpx3, in  // 37   [c=4 l=4]  *movdi_aarch64/12
adrpx2, out // 40   [c=4 l=4]  *movdi_aarch64/12
add x3, x3, :lo12:in// 39   [c=4 l=4]  add_losym_di
add x2, x2, :lo12:out   // 42   [c=4 l=4]  add_losym_di
mov x0, 0   // 3[c=4 l=4]  *movdi_aarch64/3
.p2align 3,,7
.L2:
ldr w1, [x3, x0]// 10   [c=16 l=4]  *zero_extendsidi2_aarch64/1
and w1, w1, -2147483648 // 11   [c=4 l=4]  andsi3/1
str w1, [x2, x0]// 16   [c=4 l=4]  *movsi_aarch64/8
add x0, x0, 4   // 17   [c=4 l=4]  *adddi3_aarch64/0
cmp x0, 4096// 19   [c=4 l=4]  cmpdi/1
bne .L2 // 20   [c=4 l=4]  condjump
ret // 50   [c=0 l=4]  *do_return

After patch:

foo:
adrpx2, in  // 36   [c=4 l=4]  *movdi_aarch64/12
adrpx1, out // 39   [c=4 l=4]  *movdi_aarch64/12
add x2, x2, :lo12:in// 38   [c=4 l=4]  add_losym_di
add x1, x1, :lo12:out   // 41   [c=4 l=4]  add_losym_di
mov x0, 0   // 3[c=4 l=4]  *movdi_aarch64/3
.p2align 3,,7
.L2:
ldr q0, [x2, x0]// 10   [c=8 l=4]  *aarch64_simd_movv4sf/0
ushrv0.4s, v0.4s, 31// 11   [c=12 l=4]  
aarch64_simd_lshrv4si
str q0, [x1, x0]// 15   [c=4 l=4]  *aarch64_simd_movv4si/2
add x0, x0, 16  // 16   [c=4 l=4]  *adddi3_aarch64/0
cmp x0, 4096// 18   [c=4 l=4]  cmpdi/1
bne .L2 // 19   [c=4 l=4]  condjump
ret // 49   [c=0 l=4]  *do_return

Thanks,
Przemyslaw

gcc/ChangeLog:

2019-03-20  Przemyslaw Wirkus  

* config/aarch64/aarch64-builtins.c
(aarch64_builtin_vectorized_function): Added CASE_CFN_SIGNBIT.
* config/aarch64/aarch64-simd-builtins.def: (signbit)
Extend to V4SF mode.
* config/aarch64/aarch64-simd.md (signbitv4sf2): New expand
defined.

gcc/testsuite/ChangeLog:

2019-02-28  Przemyslaw Wirkus  

* gcc.target/aarch64/signbitv4sf.c: New test.
diff --git a/gcc/config/aarch64/aarch64-builtins.c 
b/gcc/config/aarch64/aarch64-builtins.c
index 
04063e5ed134d2e64487db23b8fa7794817b2739..86f8345848abd1515cef61824db525dc26ec9bdb
 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1709,6 +1709,13 @@ aarch64_builtin_vectorized_function (unsigned int fn, 
tree type_out,
 
return aarch64_builtin_decls[builtin];
   }
+CASE_CFN_SIGNBIT:
+  {
+   if (AARCH64_CHECK_BUILTIN_MODE (4, S))
+ return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_UNOP_signbitv4sf];
+   else
+ return NULL_TREE;
+  }
 case CFN_BUILT_IN_BSWAP16:
 #undef AARCH64_CHECK_BUILTIN_MODE
 #define AARCH64_CHECK_BUILTIN_MODE(C, N) \
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
b/gcc/config/aarch64/aarch64-simd-builtins.def
index 
17bb0c4869b12ede2fc51a8f89d841ded8fac230..d568f0ba4e61febf0590b22789b006f3bfe11ccd
 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -324,6 +324,9 @@
   VAR1 (UNOP, rint, 2, hf)
   VAR1 (UNOP, round, 2, hf)
 
+  /* Implemented by signbit2 pattern */
+  VAR1 (UNOP, signbit, 2, v4sf)
+
   /* Implemented by l2.  */
   VAR1 (UNOP, lbtruncv4hf, 2, v4hi)
   VAR1 (UNOP, lbtruncv8hf, 2, v8hi)
diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 
be6c27d319a1ca6fee581d8f8856a4dff8f4a060..87e2a58649c3e5d490c499115cf6b7495d448c29
 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -915,6 +915,21 @@
   [(set_attr "type" "neon_ins")]
 )
 
+(define_expand "signbitv4sf2"
+[(use (match_operand:V4SI 0 "register_operand"))
+ (use (match_operand:V4SF 1 "register_operand"))]
+ "TARGET_SIMD"
+{
+  int shift_amount = GET_MODE_UNIT_BITSIZE (V4SImode) - 1;
+  rtx shift_vector = aarch64_simd_gen_const_vector_dup (V4SImode,
+  shift_amount);
+  operands[1] = lowpart_subreg (V4SImode, operands[1], V4SFmode);
+
+  emit_insn (gen_aarch64_simd_lshrv4si (operands[0], operands[1],
+  shift_vector));
+  DONE;
+})
+
 (define_insn "aarch64_simd_lshr"
  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
(lshiftrt:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/aarch64/signbitv4sf.c 
b/gcc/testsuite/gcc.target/aarch64/signbitv4sf.c
new file mode 100644
index 
..aa06a5df1dbb3e295355d485b39963127a828b68
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/signbitv4sf.c
@@ -0,0 +1,35 @@
+/* { dg-do run

[PATCH] Fix PR89779

2019-03-21 Thread Richard Biener


The testcase in this PR shows that IVOPTs ends up using cached
control IV information of IVs that have been elimiated by it.
IVOPTs fails to properly release that information as scev_reset
just resets ->nb_iterations but not control-IVs.

A proper fix is to avoid doing the above for each loop and instead
delay IV def removal until IVOPTs has finished and there ensure
also possibly invalidated control-IVs are released.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

IMHO this is the proper fix, I'll test and install also a patch
restoring defensive checks that avoided the ICE before.

Slightly safe might be to prune control-IVs where we do
scev_reset () now instead of delaying DEF removal.  I'm only 99%
sure IVOPTs, when processing outer loops, is not affected by
the dead IVs being still in the IL (all uses are already updated
but feeding defs are not elimiated either).

Bin?

Richard.

2019-03-21  Richard Biener  

PR tree-optimization/89779
* tree-ssa-loop-ivopts.c (remove_unused_ivs): Return
to remove IV defs, delay actual removal.
(tree_ssa_iv_optimize_loop): Likewise.  Avoid SCEV reset.
(tree_ssa_iv_optimize): Remove eliminated IV defs at the
very end, properly also reset loop control IV information.

* gcc.dg/torture/pr89779.c: New testcase.

Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 269832)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -7258,11 +7258,10 @@ rewrite_groups (struct ivopts_data *data
 /* Removes the ivs that are not used after rewriting.  */
 
 static void
-remove_unused_ivs (struct ivopts_data *data)
+remove_unused_ivs (struct ivopts_data *data, bitmap toremove)
 {
   unsigned j;
   bitmap_iterator bi;
-  bitmap toremove = BITMAP_ALLOC (NULL);
 
   /* Figure out an order in which to release SSA DEFs so that we don't
  release something that we'd have to propagate into a debug stmt
@@ -7384,10 +7383,6 @@ remove_unused_ivs (struct ivopts_data *d
}
}
 }
-
-  release_defs_bitset (toremove);
-
-  BITMAP_FREE (toremove);
 }
 
 /* Frees memory occupied by struct tree_niter_desc in *VALUE. Callback
@@ -7530,7 +7525,8 @@ loop_body_includes_call (basic_block *bo
 /* Optimizes the LOOP.  Returns true if anything changed.  */
 
 static bool
-tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop)
+tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop,
+  bitmap toremove)
 {
   bool changed = false;
   struct iv_ca *iv_ca;
@@ -7600,12 +7596,7 @@ tree_ssa_iv_optimize_loop (struct ivopts
   rewrite_groups (data);
 
   /* Remove the ivs that are unused after rewriting.  */
-  remove_unused_ivs (data);
-
-  /* We have changed the structure of induction variables; it might happen
- that definitions in the scev database refer to some of them that were
- eliminated.  */
-  scev_reset ();
+  remove_unused_ivs (data, toremove);
 
 finish:
   free_loop_data (data);
@@ -7620,6 +7611,7 @@ tree_ssa_iv_optimize (void)
 {
   struct loop *loop;
   struct ivopts_data data;
+  auto_bitmap toremove;
 
   tree_ssa_iv_optimize_init (&data);
 
@@ -7629,9 +7621,19 @@ tree_ssa_iv_optimize (void)
   if (dump_file && (dump_flags & TDF_DETAILS))
flow_loop_dump (loop, dump_file, NULL, 1);
 
-  tree_ssa_iv_optimize_loop (&data, loop);
+  tree_ssa_iv_optimize_loop (&data, loop, toremove);
 }
 
+  /* Remove eliminated IV defs.  */
+  release_defs_bitset (toremove);
+
+  /* We have changed the structure of induction variables; it might happen
+ that definitions in the scev database refer to some of them that were
+ eliminated.  */
+  scev_reset_htab ();
+  /* Likewise niter and control-IV information.  */
+  free_numbers_of_iterations_estimates (cfun);
+
   tree_ssa_iv_optimize_finalize (&data);
 }
 
Index: gcc/testsuite/gcc.dg/torture/pr89779.c
===
--- gcc/testsuite/gcc.dg/torture/pr89779.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr89779.c  (working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+typedef int a;
+void h(a);
+void c(a *d, int b)
+{
+  int e, f, g;
+  for (; e; e++)
+for (f = 0; f < 4; f++)
+  if (d)
+   for (g = e + 1; g; g++)
+ h(d[g]);
+}
+void i()
+{
+  a *j;
+  int k, l;
+  for (; k; k++)
+c(j, l);
+}


[PATCH] Restore defensive check in tree_nop_conversion (also PR89779 "fix")

2019-03-21 Thread Richard Biener


This also avoids the ICE in PR89779 but IMHO is not a real fix.

Still it restores a previously active check against released SSA names
which now have error_mark_node type rather than NULL.  The new way
opens up consolidation so I've adjusted tree_nop_conversion plus
operand_equal_p which got a defensive check at the same time
(and checks for error_mark_node right before that check).

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

The testcase comes with the other patch (but still included below).

Richard.

2019-03-21  Richard Biener  

PR tree-optimization/89779
* tree.c (tree_nop_conversion): Consolidate and fix defensive
checks with respect to released SSA names now having error_mark_node
type.
* fold-const.c (operand_equal_p): Likewise.

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 269832)
+++ gcc/tree.c  (working copy)
@@ -12812,13 +12812,10 @@ tree_nop_conversion (const_tree exp)
   if (!CONVERT_EXPR_P (exp)
   && TREE_CODE (exp) != NON_LVALUE_EXPR)
 return false;
-  if (TREE_OPERAND (exp, 0) == error_mark_node)
-return false;
 
   outer_type = TREE_TYPE (exp);
   inner_type = TREE_TYPE (TREE_OPERAND (exp, 0));
-
-  if (!inner_type)
+  if (inner_type == error_mark_node)
 return false;
 
   return tree_nop_conversion_p (outer_type, inner_type);
Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 269832)
+++ gcc/fold-const.c(working copy)
@@ -2973,11 +2973,6 @@ operand_equal_p (const_tree arg0, const_
   || TREE_TYPE (arg1) == error_mark_node)
 return 0;
 
-  /* Similar, if either does not have a type (like a released SSA name), 
- they aren't equal.  */
-  if (!TREE_TYPE (arg0) || !TREE_TYPE (arg1))
-return 0;
-
   /* We cannot consider pointers to different address space equal.  */
   if (POINTER_TYPE_P (TREE_TYPE (arg0))
   && POINTER_TYPE_P (TREE_TYPE (arg1))
Index: gcc/testsuite/gcc.dg/torture/pr89779.c
===
--- gcc/testsuite/gcc.dg/torture/pr89779.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr89779.c  (working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+typedef int a;
+void h(a);
+void c(a *d, int b)
+{
+  int e, f, g;
+  for (; e; e++)
+for (f = 0; f < 4; f++)
+  if (d)
+   for (g = e + 1; g; g++)
+ h(d[g]);
+}
+void i()
+{
+  a *j;
+  int k, l;
+  for (; k; k++)
+c(j, l);
+}


Re: [C PATCH] Fix endless loop in the C FE initializer handling (PR c/85704)

2019-03-21 Thread Richard Sandiford
[Sorry for responding to such an old patch]

Jakub Jelinek  writes:
> +/* For two FIELD_DECLs in the same chain, return -1 if field1
> +   comes before field2, 1 if field1 comes after field2 and
> +   0 if field1 == field2.  */
> +
> +static int
> +field_decl_cmp (tree field1, tree field2)
> +{
> +  if (field1 == field2)
> +return 0;
> +
> +  tree bitpos1 = bit_position (field1);
> +  tree bitpos2 = bit_position (field2);
> +  if (tree_int_cst_equal (bitpos1, bitpos2))
> +{
> +  /* If one of the fields has non-zero bitsize, then that
> +  field must be the last one in a sequence of zero
> +  sized fields, fields after it will have bigger
> +  bit_position.  */
> +  if (TREE_TYPE (field1) != error_mark_node
> +   && COMPLETE_TYPE_P (TREE_TYPE (field1))
> +   && integer_nonzerop (TREE_TYPE (field1)))
> + return 1;
> +  if (TREE_TYPE (field2) != error_mark_node
> +   && COMPLETE_TYPE_P (TREE_TYPE (field2))
> +   && integer_nonzerop (TREE_TYPE (field2)))
> + return -1;

Looks like these integer_nonzerop should be testing TYPE_SIZE or
TYPE_SIZE_UNIT -- not sure which is preferred here.

Thanks,
Richard


Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Richard Biener
On Thu, 21 Mar 2019, Jakub Jelinek wrote:

> On Thu, Mar 21, 2019 at 10:11:00AM +0100, Richard Biener wrote:
> > On Thu, 21 Mar 2019, Jakub Jelinek wrote:
> > 
> > > On Thu, Mar 21, 2019 at 09:35:07AM +0100, Richard Biener wrote:
> > > > LGTM - any reason you only sometimes conditionalize add_tree_to_fld_list
> > > > on the fld->pset.add () result?  IMHO consistency should win over
> > > > micro-optimizing the case you know the type will not be in the set.
> > > 
> > > Yeah, it was a optimization for the most common case, when
> > > find_decls_types_r calls add_tree_to_fld_list (t, fld); for DECL_P as well
> > > as TYPE_P, because in that case the fld->pset.add check has been done
> > > already by walk_tree.  If we have hundreds thousands of elts in the
> > > hash_set, it could be more than a microoptimization, but if you think
> > > strongly that it should be all thrown away and add_tree_to_fld_list should
> > > start with if (fld->pset.add (t)) return; then I can do that too.
> > 
> > Yes, I think that would be better.
> 
> Actually sorry, that is not going to work, because in the two
> find_decls_types_r cases it is already set in the pset by walk_tree and thus
> fld->pset.add (t) will return true because it is the second addition.

Oh.  I was mostly looking at the cases you add, find_decls_types_r
is indeed special.

> And just doing fld->pset.add (t) and not testing the result is undesirable
> too for the other cases; in the patch there was some set of cases where I
> was sure that the type isn't already in the pset (e.g. immediately after
> build_variant_type_copy or build_distinct_type_copy), but in several other
> cases it is unclear if it is already in or not and I think we don't want to
> add it again in those cases.

Yeah, I think there's no point to add it again.  Or is there a case
I am missing - Honza?

> Would you prefer to have say add_tree_to_fld_list that does what it does now
> and has just 2 callers in find_decls_types_r and a new
> maybe_add_tree_to_fld_list that would do
>   if (!fld->pset.add (t))
> add_tree_to_fld_list (t, fld);
> and change all the other add_tree_to_fld_list callers?

Not sure if that's needed, if you leave find_decls_types_r alone then
always doing the check should work, no?

Richard.


Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 11:13:41AM +0100, Richard Biener wrote:
> > Would you prefer to have say add_tree_to_fld_list that does what it does now
> > and has just 2 callers in find_decls_types_r and a new
> > maybe_add_tree_to_fld_list that would do
> >   if (!fld->pset.add (t))
> > add_tree_to_fld_list (t, fld);
> > and change all the other add_tree_to_fld_list callers?
> 
> Not sure if that's needed, if you leave find_decls_types_r alone then
> always doing the check should work, no?

Well, find_decls_types_r uses add_tree_to_fld_list, so the if (fld->pset.add
(t)) return; can't be in that function.  If we want to simplify all the
other callers, so that they don't need to do it manually, then the only
options I see is call some other function in those spots that does this
additional check, or call another function that doesn't do that in the
two spots of find_decls_types_r, or add another argument to the function
say with default argument and test that argument before the fld->pset.add.
E.g. , bool do_add = true and if (do_add && fld->pset.add (t)) return;
and adjust the two calls in find_decls_types_r to call
add_tree_to_fld_list (t, fld, false);
Or keep the patch as is, do the tests before each but the two
add_tree_to_fld_list calls.

Or do you have yet another variant, or what from the above do you prefer?

Jakub


Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Richard Biener
On Thu, 21 Mar 2019, Jakub Jelinek wrote:

> On Thu, Mar 21, 2019 at 11:13:41AM +0100, Richard Biener wrote:
> > > Would you prefer to have say add_tree_to_fld_list that does what it does 
> > > now
> > > and has just 2 callers in find_decls_types_r and a new
> > > maybe_add_tree_to_fld_list that would do
> > >   if (!fld->pset.add (t))
> > > add_tree_to_fld_list (t, fld);
> > > and change all the other add_tree_to_fld_list callers?
> > 
> > Not sure if that's needed, if you leave find_decls_types_r alone then
> > always doing the check should work, no?
> 
> Well, find_decls_types_r uses add_tree_to_fld_list, so the if (fld->pset.add
> (t)) return; can't be in that function.  If we want to simplify all the
> other callers, so that they don't need to do it manually, then the only
> options I see is call some other function in those spots that does this
> additional check, or call another function that doesn't do that in the
> two spots of find_decls_types_r, or add another argument to the function
> say with default argument and test that argument before the fld->pset.add.
> E.g. , bool do_add = true and if (do_add && fld->pset.add (t)) return;
> and adjust the two calls in find_decls_types_r to call
> add_tree_to_fld_list (t, fld, false);
> Or keep the patch as is, do the tests before each but the two
> add_tree_to_fld_list calls.

I'd have kept the patch but made the add_tree_to_fld_list following
the new fld->pset.add (t) calls conditional on the result.

Richard.


Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 11:25:38AM +0100, Richard Biener wrote:
> On Thu, 21 Mar 2019, Jakub Jelinek wrote:
> 
> > On Thu, Mar 21, 2019 at 11:13:41AM +0100, Richard Biener wrote:
> > > > Would you prefer to have say add_tree_to_fld_list that does what it 
> > > > does now
> > > > and has just 2 callers in find_decls_types_r and a new
> > > > maybe_add_tree_to_fld_list that would do
> > > >   if (!fld->pset.add (t))
> > > > add_tree_to_fld_list (t, fld);
> > > > and change all the other add_tree_to_fld_list callers?
> > > 
> > > Not sure if that's needed, if you leave find_decls_types_r alone then
> > > always doing the check should work, no?
> > 
> > Well, find_decls_types_r uses add_tree_to_fld_list, so the if (fld->pset.add
> > (t)) return; can't be in that function.  If we want to simplify all the
> > other callers, so that they don't need to do it manually, then the only
> > options I see is call some other function in those spots that does this
> > additional check, or call another function that doesn't do that in the
> > two spots of find_decls_types_r, or add another argument to the function
> > say with default argument and test that argument before the fld->pset.add.
> > E.g. , bool do_add = true and if (do_add && fld->pset.add (t)) return;
> > and adjust the two calls in find_decls_types_r to call
> > add_tree_to_fld_list (t, fld, false);
> > Or keep the patch as is, do the tests before each but the two
> > add_tree_to_fld_list calls.
> 
> I'd have kept the patch but made the add_tree_to_fld_list following
> the new fld->pset.add (t) calls conditional on the result.

Ok, so incremental diff:
--- gcc/tree.c  2019-03-20 20:16:12.277091827 +0100
+++ gcc/tree.c  2019-03-21 11:33:17.008936452 +0100
@@ -5215,8 +5215,8 @@
   if (inner_type)
 TREE_TYPE (v) = inner_type;
   gcc_checking_assert (fld_type_variant_equal_p (t,v, inner_type));
-  fld->pset.add (v);
-  add_tree_to_fld_list (v, fld);
+  if (!fld->pset.add (v))
+add_tree_to_fld_list (v, fld);
   return v;
 }
 
@@ -5324,8 +5324,8 @@
  copy = build_distinct_type_copy (t);
 
  /* It is possible that type was not seen by free_lang_data yet.  */
- fld->pset.add (copy);
- add_tree_to_fld_list (copy, fld);
+ if (!fld->pset.add (copy))
+   add_tree_to_fld_list (copy, fld);
  TYPE_SIZE (copy) = NULL;
  TYPE_USER_ALIGN (copy) = 0;
  TYPE_SIZE_UNIT (copy) = NULL;
@@ -5480,8 +5480,8 @@
  & ~TYPE_QUAL_CONST
  & ~TYPE_QUAL_VOLATILE;
  TREE_VALUE (p) = build_qualified_type (arg_type, quals);
- fld->pset.add (TREE_VALUE (p));
- free_lang_data_in_type (TREE_VALUE (p), fld);
+ if (!fld->pset.add (TREE_VALUE (p)))
+   free_lang_data_in_type (TREE_VALUE (p), fld);
}
  /* C++ FE uses TREE_PURPOSE to store initial values.  */
  TREE_PURPOSE (p) = NULL;

and full patch below?  Ok for trunk if it passes bootstrap/regtest?

2019-03-21  Jan Hubicka  
Jakub Jelinek  

PR lto/89692
* tree.c (fld_type_variant, fld_incomplete_type_of,
fld_process_array_type): Call fld->pset.add and don't call
add_tree_to_fld_list if it returns true.
(free_lang_data_in_type): Likewise.  Purge non-marked types from
TYPE_NEXT_VARIANT list.
(find_decls_types_r): Call fld_worklist_push for TYPE_CANONICAL (t).

* g++.dg/other/pr89692.C: New test.

--- gcc/tree.c.jj   2019-03-20 12:24:56.44338 +0100
+++ gcc/tree.c  2019-03-21 11:33:17.008936452 +0100
@@ -5215,7 +5215,8 @@ fld_type_variant (tree first, tree t, st
   if (inner_type)
 TREE_TYPE (v) = inner_type;
   gcc_checking_assert (fld_type_variant_equal_p (t,v, inner_type));
-  add_tree_to_fld_list (v, fld);
+  if (!fld->pset.add (v))
+add_tree_to_fld_list (v, fld);
   return v;
 }
 
@@ -5253,7 +5254,8 @@ fld_process_array_type (tree t, tree t2,
   array = build_array_type_1 (t2, TYPE_DOMAIN (t),
  TYPE_TYPELESS_STORAGE (t), false);
   TYPE_CANONICAL (array) = TYPE_CANONICAL (t);
-  add_tree_to_fld_list (array, fld);
+  if (!fld->pset.add (array))
+   add_tree_to_fld_list (array, fld);
 }
   return array;
 }
@@ -5298,7 +5300,8 @@ fld_incomplete_type_of (tree t, struct f
TYPE_REF_CAN_ALIAS_ALL (t));
  gcc_assert (TYPE_CANONICAL (t2) != t2
  && TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
- add_tree_to_fld_list (first, fld);
+ if (!fld->pset.add (first))
+   add_tree_to_fld_list (first, fld);
  return fld_type_variant (first, t, fld);
}
   return t;
@@ -5321,7 +5324,8 @@ fld_incomplete_type_of (tree t, struct f
  copy = build_distinct_type_copy (t);
 
  /* It is possible that type was not seen by free_lang_data yet.  */
- add_tree_to_fl

Re: [PATCH] free_lang_data fixes (PR lto/89692)

2019-03-21 Thread Richard Biener
On Thu, 21 Mar 2019, Jakub Jelinek wrote:

> On Thu, Mar 21, 2019 at 11:25:38AM +0100, Richard Biener wrote:
> > On Thu, 21 Mar 2019, Jakub Jelinek wrote:
> > 
> > > On Thu, Mar 21, 2019 at 11:13:41AM +0100, Richard Biener wrote:
> > > > > Would you prefer to have say add_tree_to_fld_list that does what it 
> > > > > does now
> > > > > and has just 2 callers in find_decls_types_r and a new
> > > > > maybe_add_tree_to_fld_list that would do
> > > > >   if (!fld->pset.add (t))
> > > > > add_tree_to_fld_list (t, fld);
> > > > > and change all the other add_tree_to_fld_list callers?
> > > > 
> > > > Not sure if that's needed, if you leave find_decls_types_r alone then
> > > > always doing the check should work, no?
> > > 
> > > Well, find_decls_types_r uses add_tree_to_fld_list, so the if 
> > > (fld->pset.add
> > > (t)) return; can't be in that function.  If we want to simplify all the
> > > other callers, so that they don't need to do it manually, then the only
> > > options I see is call some other function in those spots that does this
> > > additional check, or call another function that doesn't do that in the
> > > two spots of find_decls_types_r, or add another argument to the function
> > > say with default argument and test that argument before the fld->pset.add.
> > > E.g. , bool do_add = true and if (do_add && fld->pset.add (t)) return;
> > > and adjust the two calls in find_decls_types_r to call
> > > add_tree_to_fld_list (t, fld, false);
> > > Or keep the patch as is, do the tests before each but the two
> > > add_tree_to_fld_list calls.
> > 
> > I'd have kept the patch but made the add_tree_to_fld_list following
> > the new fld->pset.add (t) calls conditional on the result.
> 
> Ok, so incremental diff:
> --- gcc/tree.c2019-03-20 20:16:12.277091827 +0100
> +++ gcc/tree.c2019-03-21 11:33:17.008936452 +0100
> @@ -5215,8 +5215,8 @@
>if (inner_type)
>  TREE_TYPE (v) = inner_type;
>gcc_checking_assert (fld_type_variant_equal_p (t,v, inner_type));
> -  fld->pset.add (v);
> -  add_tree_to_fld_list (v, fld);
> +  if (!fld->pset.add (v))
> +add_tree_to_fld_list (v, fld);
>return v;
>  }
>  
> @@ -5324,8 +5324,8 @@
> copy = build_distinct_type_copy (t);
>  
> /* It is possible that type was not seen by free_lang_data yet.  */
> -   fld->pset.add (copy);
> -   add_tree_to_fld_list (copy, fld);
> +   if (!fld->pset.add (copy))
> + add_tree_to_fld_list (copy, fld);
> TYPE_SIZE (copy) = NULL;
> TYPE_USER_ALIGN (copy) = 0;
> TYPE_SIZE_UNIT (copy) = NULL;
> @@ -5480,8 +5480,8 @@
> & ~TYPE_QUAL_CONST
> & ~TYPE_QUAL_VOLATILE;
> TREE_VALUE (p) = build_qualified_type (arg_type, quals);
> -   fld->pset.add (TREE_VALUE (p));
> -   free_lang_data_in_type (TREE_VALUE (p), fld);
> +   if (!fld->pset.add (TREE_VALUE (p)))
> + free_lang_data_in_type (TREE_VALUE (p), fld);
>   }
> /* C++ FE uses TREE_PURPOSE to store initial values.  */
> TREE_PURPOSE (p) = NULL;
> 
> and full patch below?  Ok for trunk if it passes bootstrap/regtest?

OK.

Thanks,
Richard.

> 2019-03-21  Jan Hubicka  
>   Jakub Jelinek  
> 
>   PR lto/89692
>   * tree.c (fld_type_variant, fld_incomplete_type_of,
>   fld_process_array_type): Call fld->pset.add and don't call
>   add_tree_to_fld_list if it returns true.
>   (free_lang_data_in_type): Likewise.  Purge non-marked types from
>   TYPE_NEXT_VARIANT list.
>   (find_decls_types_r): Call fld_worklist_push for TYPE_CANONICAL (t).
> 
>   * g++.dg/other/pr89692.C: New test.
> 
> --- gcc/tree.c.jj 2019-03-20 12:24:56.44338 +0100
> +++ gcc/tree.c2019-03-21 11:33:17.008936452 +0100
> @@ -5215,7 +5215,8 @@ fld_type_variant (tree first, tree t, st
>if (inner_type)
>  TREE_TYPE (v) = inner_type;
>gcc_checking_assert (fld_type_variant_equal_p (t,v, inner_type));
> -  add_tree_to_fld_list (v, fld);
> +  if (!fld->pset.add (v))
> +add_tree_to_fld_list (v, fld);
>return v;
>  }
>  
> @@ -5253,7 +5254,8 @@ fld_process_array_type (tree t, tree t2,
>array = build_array_type_1 (t2, TYPE_DOMAIN (t),
> TYPE_TYPELESS_STORAGE (t), false);
>TYPE_CANONICAL (array) = TYPE_CANONICAL (t);
> -  add_tree_to_fld_list (array, fld);
> +  if (!fld->pset.add (array))
> + add_tree_to_fld_list (array, fld);
>  }
>return array;
>  }
> @@ -5298,7 +5300,8 @@ fld_incomplete_type_of (tree t, struct f
>   TYPE_REF_CAN_ALIAS_ALL (t));
> gcc_assert (TYPE_CANONICAL (t2) != t2
> && TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
> -   add_tree_to_fld_list (first, fld);
> +   if (!fld->pset.add (first))
> + add_tree_to_fld_list (first, fld);
> return fld_type_variant (first, t, fld);
> 

Re: [PATCHv2] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)

2019-03-21 Thread Richard Biener
On Sun, 10 Mar 2019, Bernd Edlinger wrote:

> Hi,
> 
> This patch is an update to the previous patch, which takes into account that
> the middle-end is not supposed to use the unaligned DI value directly which
> was passed in an unaligned stack slot due to the AAPCS parameter passing 
> rules.
> 
> The patch works by changing use_register_for_decl to return false if the
> incoming RTL is not sufficiently aligned on a STRICT_ALIGNMENT target,
> as if the address of the parameter was taken (which is TREE_ADDRESSABLE).
> So not taking the address of the parameter is a necessary condition
> for the wrong-code in PR 89544.
> 
> It works together with this check in assign_parm_adjust_stack_rtl:
>   /* If we can't trust the parm stack slot to be aligned enough for its
>  ultimate type, don't use that slot after entry.  We'll make another
>  stack slot, if we need one.  */
>   if (stack_parm
>   && ((STRICT_ALIGNMENT
>&& GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN 
> (stack_parm))
> ...
> stack_param = NULL
> 
> This makes assign_parms use assign_parm_setup_stack instead of
> assign_parm_setup_reg, and the latter does assign a suitably aligned
> stack slot, because stack_param == NULL by now, and uses emit_block_move
> which avoids the issue with the back-end.
> 
> Additionally, to prevent unnecessary performance regressions,
> assign_parm_find_stack_rtl is enhanced to make use of a possible larger
> alignment if the parameter was passed in an aligned stack slot without
> the ABI requiring it.
> 
> 
> Bootstrapped and reg-tested with all languages on arm-linux-gnueabihf.
> Is it OK for trunk?

I think the assign_parm_find_stack_rtl is not appropriate at this stage,
I am also missing an update to the comment of the block you change.
It also changes code I'm not familar enough with to review...

Finally...

Index: gcc/function.c
===
--- gcc/function.c  (revision 269264)
+++ gcc/function.c  (working copy)
@@ -2210,6 +2210,12 @@ use_register_for_decl (const_tree decl)
   if (DECL_MODE (decl) == BLKmode)
 return false;

+  if (STRICT_ALIGNMENT && TREE_CODE (decl) == PARM_DECL
+  && DECL_INCOMING_RTL (decl) && MEM_P (DECL_INCOMING_RTL (decl))
+  && GET_MODE_ALIGNMENT (DECL_MODE (decl))
+> MEM_ALIGN (DECL_INCOMING_RTL (decl)))
+return false;
+
   /* If -ffloat-store specified, don't put explicit float variables
  into registers.  */
   /* ??? This should be checked after DECL_ARTIFICIAL, but tree-ssa

I wonder if it is necessary to look at DECL_INCOMING_RTL here
and why such RTL may not exist?  That is, iff DECL_INCOMING_RTL
doesn't exist then shouldn't we return false for safety reasons?

Similarly the very same issue should exist on x86_64 which is
!STRICT_ALIGNMENT, it's just the ABI seems to provide the appropriate
alignment on the caller side.  So the STRICT_ALIGNMENT check is
a wrong one.

Which makes me think that a proper fix is not here, but in
target(hook) code.

Changing use_register_for_decl sounds odd anyways since if we return true
we for the testcase still end up in memory, no?

The hunk obviously misses a comment since the effect that this
will cause a copy to be emitted isn't obvious (and relying on
this probably fragile).

Richard.


[patch] PR jit/87808: Allow libgccjit to work without an external gcc driver

2019-03-21 Thread Matthias Klose
Fix PR jit/87808, the embedded driver still needing the external gcc driver to
find the gcc_lib_dir. This can happen in a packaging context when libgccjit
doesn't depend on the gcc package, but just on binutils and libgcc-dev packages.
libgccjit probably could use /proc/self/maps to find the gcc_lib_dir, but that
doesn't seem to be very portable.

Ok for the trunk and the branches?

Matthias
# DP: Fix PR jit/87808.

	* Make-lang.in (CFLAGS-jit/jit-playback.o): Pass fallback
	  GCC_EXEC_PREFIX.
	* jit-playback.c (invoke_embedded_driver): Use fallback GCC_EXEC_PREFIX
	  when external driver is not present.

--- a/src/gcc/jit/Make-lang.in
+++ b/src/gcc/jit/Make-lang.in
@@ -84,6 +84,9 @@
 	jit/jit-spec.o \
 	gcc.o
 
+CFLAGS-jit/jit-playback.o += \
+	-DFALLBACK_GCC_EXEC_PREFIX=\"$(libdir)/gcc/$(target_subdir)/$(version)\"
+
 # Use strict warnings for this front end.
 jit-warn = $(STRICT_WARN)
 
--- a/src/gcc/jit/jit-playback.c
+++ b/src/gcc/jit/jit-playback.c
@@ -39,6 +39,7 @@
 #include "opt-suggestions.h"
 #include "gcc.h"
 #include "diagnostic.h"
+#include "file-find.h"
 
 #include 
 
@@ -2482,7 +2483,31 @@
 playback::context::
 invoke_embedded_driver (const vec  *argvec)
 {
+  static char* gcc_driver_file = NULL;
+
   JIT_LOG_SCOPE (get_logger ());
+
+  /* process_command() uses make_relative_prefix(), searches PATH
+ for the external driver, which might not be found.  In this case
+ fall back to the configured default.  */
+#ifdef FALLBACK_GCC_EXEC_PREFIX
+  if (gcc_driver_file == NULL && ::getenv ("GCC_EXEC_PREFIX") == NULL)
+{
+  struct path_prefix path;
+
+  prefix_from_env ("PATH", &path);
+  gcc_driver_file = find_a_file (&path, gcc_driver_name, X_OK);
+  if (gcc_driver_file == NULL)
+{
+	  char *str = concat ("GCC_EXEC_PREFIX=",
+			  FALLBACK_GCC_EXEC_PREFIX, NULL);
+	  ::putenv (str);
+	  log ("gcc driver %s not found, using fallback GCC_EXEC_PREFIX=%s",
+	   gcc_driver_name, FALLBACK_GCC_EXEC_PREFIX);
+}
+}
+#endif
+
   driver d (true, /* can_finalize */
 	false); /* debug */
   int result = d.main (argvec->length (),


Re: [PATCH] Integration of parallel standard algorithms for c++17

2019-03-21 Thread Jonathan Wakely

On 20/03/19 14:05 -0700, Thomas Rodgers wrote:



Fixed a failing test.


Thanks. Apart from the changelog issue I mentioned on IRC, the only
other required change I see is in include/Makefile.am:

@@ -1480,7 +1512,9 @@ install-headers:
$(mkinstalldirs) $(DESTDIR)${host_installdir}/../ext
for file in ${ext_host_headers}; do \
  $(INSTALL_DATA) $${file} $(DESTDIR)${host_installdir}/../ext; done
-
+   $(mkinstalldirs) ($DESTDIR)${gxx_include_dir}/${pstl_builddir}

This needs to be $(DESTDIR) not ($DESTDIR).

***

I also see a couple of things that we probably can't fix for now, but
I'll mention them anyway for later consideration.

In  there's a local include using ""
instead of <> (see PR libstdc++/88066 for a similar case I changed
recently):

#include "execution_defs.h"

We could fix that one, but then *all* the relative includes in
include/pstl/*  use "" too. Do you think upstream would change to use
#include  instead of #include "blah.h" ? It looks like
upstream uses  instead so maybe that's something to
change in your script to import the upstream code.

Even if we fix it, TBB headers use "" instead of <>.  And it might not
be a problem for any compiler except GCC, because only we support the
silly -I- option that makes this matter.


I also noticed that simply including  (even if you don't
use anything from it) will cause a link-time dependency on TBB:

/usr/bin/ld: /tmp/ccd7w5Oo.o: in function 
`tbb::interface7::task_arena::current_thread_index()':
/usr/include/tbb/task_arena.h:358: undefined reference to 
`tbb::interface7::internal::task_arena_base::internal_current_slot()'
collect2: error: ld returned 1 exit status

This dependency comes from  which is included by
 which is included by .

I think that's OK for now, as no existing code using GCC includes the
 header (because it doesn't exist yet). It would be good if
we can find a way to solve that eventually though. Maybe the eventual
solution will just be a non-TBB backend using OpenMP.

Exporting something like this from libstdc++.so would work, but is
disgusting (and presumably "tbb::interface7" changes from release to
release?"):

extern "C" [[__gnu__::__weak__]]
int
_ZN3tbb10interface78internal15task_arena_base21internal_current_slotEv()
{ return -1; }



Re: [PATCH] Integration of parallel standard algorithms for c++17

2019-03-21 Thread Jonathan Wakely

On 21/03/19 11:37 +, Jonathan Wakely wrote:

Exporting something like this from libstdc++.so would work, but is
disgusting (and presumably "tbb::interface7" changes from release to
release?"):

extern "C" [[__gnu__::__weak__]]
int
_ZN3tbb10interface78internal15task_arena_base21internal_current_slotEv()
{ return -1; }


Actually looking at TBB, -2 would make sense (as that's the value of
task_arena::not_initialized), but I don't think we want to do this
anyway.





RFA: Patch to fix severe recursion in d_count_templates_scopes (PR 89394)

2019-03-21 Thread Nick Clifton
Hi Ian,

  Attached is a proposed patch to fix PR 89394, which contains an
  artificial mangled name that triggers excessive recursion in
  d_count_templates_scopes.  The patch uses the same recursion limit
  that is already in place for d_print_comp, which I hope will be
  acceptable.

  There is one frag in the patch which is not directly related to this
  recursion problem however.  It extends the check in
  cplus_demangle_fill_name so that names with a negative length are
  rejected.  I had originally thought that the excessive recursion was
  due to a negative length string, although further investigation proved
  this guess to be wrong.  I felt that leaving the check in however
  would still be a good idea.

  Tested with no regressions with an x86_64-linux-gnu toolchain, as well
  as against the testcase in PR 89394.

  OK to apply ?

Cheers
  Nick

libiberty/ChangeLog
2019-03-21  Nick Clifton  

PR 89394
* cp-demangle.c (cplus_demangle_fill_name): Reject negative
lengths.
(d_count_templates_scopes): Replace num_templates and num_scopes
parameters with a struct d_print_info pointer parameter.  Adjust
body of the function accordingly.  Add recursion counter and check
that the recursion limit is not reached.
(d_print_init): Pass dpi parameter to d_count_templates_scopes.
Reset recursion counter afterwards, unless the recursion limit was
reached.

Index: libiberty/cp-demangle.c
===
--- libiberty/cp-demangle.c	(revision 269832)
+++ libiberty/cp-demangle.c	(working copy)
@@ -861,7 +861,7 @@
 int
 cplus_demangle_fill_name (struct demangle_component *p, const char *s, int len)
 {
-  if (p == NULL || s == NULL || len == 0)
+  if (p == NULL || s == NULL || len <= 0)
 return 0;
   p->d_printing = 0;
   p->type = DEMANGLE_COMPONENT_NAME;
@@ -4061,7 +4061,7 @@
are larger than the actual numbers encountered.  */
 
 static void
-d_count_templates_scopes (int *num_templates, int *num_scopes,
+d_count_templates_scopes (struct d_print_info *dpi,
 			  const struct demangle_component *dc)
 {
   if (dc == NULL)
@@ -4081,13 +4081,13 @@
   break;
 
 case DEMANGLE_COMPONENT_TEMPLATE:
-  (*num_templates)++;
+  dpi->num_copy_templates++;
   goto recurse_left_right;
 
 case DEMANGLE_COMPONENT_REFERENCE:
 case DEMANGLE_COMPONENT_RVALUE_REFERENCE:
   if (d_left (dc)->type == DEMANGLE_COMPONENT_TEMPLATE_PARAM)
-	(*num_scopes)++;
+	dpi->num_saved_scopes++;
   goto recurse_left_right;
 
 case DEMANGLE_COMPONENT_QUAL_NAME:
@@ -4152,42 +4152,42 @@
 case DEMANGLE_COMPONENT_TAGGED_NAME:
 case DEMANGLE_COMPONENT_CLONE:
 recurse_left_right:
-  d_count_templates_scopes (num_templates, num_scopes,
-d_left (dc));
-  d_count_templates_scopes (num_templates, num_scopes,
-d_right (dc));
+  /* PR 89394 - Check for too much recursion.  */
+  if (dpi->recursion > DEMANGLE_RECURSION_LIMIT)
+	/* FIXME: There ought to be a way to report to the
+	   user that the recursion limit has been reached.  */
+	return;
+
+  ++ dpi->recursion;
+  d_count_templates_scopes (dpi, d_left (dc));
+  d_count_templates_scopes (dpi, d_right (dc));
+  -- dpi->recursion;
   break;
 
 case DEMANGLE_COMPONENT_CTOR:
-  d_count_templates_scopes (num_templates, num_scopes,
-dc->u.s_ctor.name);
+  d_count_templates_scopes (dpi, dc->u.s_ctor.name);
   break;
 
 case DEMANGLE_COMPONENT_DTOR:
-  d_count_templates_scopes (num_templates, num_scopes,
-dc->u.s_dtor.name);
+  d_count_templates_scopes (dpi, dc->u.s_dtor.name);
   break;
 
 case DEMANGLE_COMPONENT_EXTENDED_OPERATOR:
-  d_count_templates_scopes (num_templates, num_scopes,
-dc->u.s_extended_operator.name);
+  d_count_templates_scopes (dpi, dc->u.s_extended_operator.name);
   break;
 
 case DEMANGLE_COMPONENT_FIXED_TYPE:
-  d_count_templates_scopes (num_templates, num_scopes,
-dc->u.s_fixed.length);
+  d_count_templates_scopes (dpi, dc->u.s_fixed.length);
   break;
 
 case DEMANGLE_COMPONENT_GLOBAL_CONSTRUCTORS:
 case DEMANGLE_COMPONENT_GLOBAL_DESTRUCTORS:
-  d_count_templates_scopes (num_templates, num_scopes,
-d_left (dc));
+  d_count_templates_scopes (dpi, d_left (dc));
   break;
 
 case DEMANGLE_COMPONENT_LAMBDA:
 case DEMANGLE_COMPONENT_DEFAULT_ARG:
-  d_count_templates_scopes (num_templates, num_scopes,
-dc->u.s_unary_num.sub);
+  d_count_templates_scopes (dpi, dc->u.s_unary_num.sub);
   break;
 }
 }
@@ -4222,8 +4222,12 @@
   dpi->next_copy_template = 0;
   dpi->num_copy_templates = 0;
 
-  d_count_templates_scopes (&dpi->num_copy_templates,
-			&dpi->num_saved_scopes, dc);
+  d_count_templates_scopes (dpi, dc);
+  /* If we did not reach the recursion limit, then reset the
+ current recursion value back to

[PATCH] PR libstdc++/88066 Use <> for includes not ""

2019-03-21 Thread Jonathan Wakely

These headers were missed in the previous commit for this bug.

There are also several "" includes in the profile mode headers, but
because they're deprecated I'm not fixing them.

* include/backward/hash_map: Use <> for includes not "".
* include/backward/hash_set: Likewise.
* include/backward/strstream: Likewise.
* include/tr1/bessel_function.tcc: Likewise.
* include/tr1/exp_integral.tcc: Likewise.
* include/tr1/legendre_function.tcc: Likewise.
* include/tr1/modified_bessel_func.tcc: Likewise.
* include/tr1/riemann_zeta.tcc: Likewise.

Tested powerpc64le-linux, committed to trunk.

commit 45ef76f8578f5e812ed096edcf8095404c0d0292
Author: Jonathan Wakely 
Date:   Thu Mar 21 11:06:50 2019 +

PR libstdc++/88066 Use <> for includes not ""

These headers were missed in the previous commit for this bug.

There are also several "" includes in the profile mode headers, but
because they're deprecated I'm not fixing them.

* include/backward/hash_map: Use <> for includes not "".
* include/backward/hash_set: Likewise.
* include/backward/strstream: Likewise.
* include/tr1/bessel_function.tcc: Likewise.
* include/tr1/exp_integral.tcc: Likewise.
* include/tr1/legendre_function.tcc: Likewise.
* include/tr1/modified_bessel_func.tcc: Likewise.
* include/tr1/riemann_zeta.tcc: Likewise.

diff --git a/libstdc++-v3/include/backward/hash_map 
b/libstdc++-v3/include/backward/hash_map
index 66a5218fc6f..e4fb32bd7ad 100644
--- a/libstdc++-v3/include/backward/hash_map
+++ b/libstdc++-v3/include/backward/hash_map
@@ -57,7 +57,7 @@
 #define _BACKWARD_HASH_MAP 1
 
 #ifndef _GLIBCXX_PERMIT_BACKWARD_HASH
-#include "backward_warning.h"
+#include 
 #endif
 
 #include 
diff --git a/libstdc++-v3/include/backward/hash_set 
b/libstdc++-v3/include/backward/hash_set
index 2dc0ed774fa..1445aa61e11 100644
--- a/libstdc++-v3/include/backward/hash_set
+++ b/libstdc++-v3/include/backward/hash_set
@@ -57,7 +57,7 @@
 #define _BACKWARD_HASH_SET 1
 
 #ifndef _GLIBCXX_PERMIT_BACKWARD_HASH
-#include "backward_warning.h"
+#include 
 #endif
 
 #include 
diff --git a/libstdc++-v3/include/backward/strstream 
b/libstdc++-v3/include/backward/strstream
index c3bc23a9cd1..f2e74362e2e 100644
--- a/libstdc++-v3/include/backward/strstream
+++ b/libstdc++-v3/include/backward/strstream
@@ -47,7 +47,7 @@
 #ifndef _BACKWARD_STRSTREAM
 #define _BACKWARD_STRSTREAM
 
-#include "backward_warning.h"
+#include 
 #include 
 #include 
 #include 
diff --git a/libstdc++-v3/include/tr1/bessel_function.tcc 
b/libstdc++-v3/include/tr1/bessel_function.tcc
index b51ff76b4af..2e738f92163 100644
--- a/libstdc++-v3/include/tr1/bessel_function.tcc
+++ b/libstdc++-v3/include/tr1/bessel_function.tcc
@@ -50,7 +50,7 @@
 #ifndef _GLIBCXX_TR1_BESSEL_FUNCTION_TCC
 #define _GLIBCXX_TR1_BESSEL_FUNCTION_TCC 1
 
-#include "special_function_util.h"
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/tr1/exp_integral.tcc 
b/libstdc++-v3/include/tr1/exp_integral.tcc
index b73e3d4ae76..246341862a6 100644
--- a/libstdc++-v3/include/tr1/exp_integral.tcc
+++ b/libstdc++-v3/include/tr1/exp_integral.tcc
@@ -45,7 +45,7 @@
 #ifndef _GLIBCXX_TR1_EXP_INTEGRAL_TCC
 #define _GLIBCXX_TR1_EXP_INTEGRAL_TCC 1
 
-#include "special_function_util.h"
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/tr1/legendre_function.tcc 
b/libstdc++-v3/include/tr1/legendre_function.tcc
index f782907a67d..c5b49d8152b 100644
--- a/libstdc++-v3/include/tr1/legendre_function.tcc
+++ b/libstdc++-v3/include/tr1/legendre_function.tcc
@@ -44,7 +44,7 @@
 #ifndef _GLIBCXX_TR1_LEGENDRE_FUNCTION_TCC
 #define _GLIBCXX_TR1_LEGENDRE_FUNCTION_TCC 1
 
-#include "special_function_util.h"
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/tr1/modified_bessel_func.tcc 
b/libstdc++-v3/include/tr1/modified_bessel_func.tcc
index ee213bfe28d..12b1f5ced89 100644
--- a/libstdc++-v3/include/tr1/modified_bessel_func.tcc
+++ b/libstdc++-v3/include/tr1/modified_bessel_func.tcc
@@ -46,7 +46,7 @@
 #ifndef _GLIBCXX_TR1_MODIFIED_BESSEL_FUNC_TCC
 #define _GLIBCXX_TR1_MODIFIED_BESSEL_FUNC_TCC 1
 
-#include "special_function_util.h"
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/tr1/riemann_zeta.tcc 
b/libstdc++-v3/include/tr1/riemann_zeta.tcc
index f573e99bb6b..4e01b8b45c3 100644
--- a/libstdc++-v3/include/tr1/riemann_zeta.tcc
+++ b/libstdc++-v3/include/tr1/riemann_zeta.tcc
@@ -42,7 +42,7 @@
 #ifndef _GLIBCXX_TR1_RIEMANN_ZETA_TCC
 #define _GLIBCXX_TR1_RIEMANN_ZETA_TCC 1
 
-#include "special_function_util.h"
+#include 
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {


Re: [PATCH PR81740]Enforce dependence check for outer loop vectorization

2019-03-21 Thread Richard Biener
On Mon, Dec 18, 2017 at 1:37 PM Richard Biener
 wrote:
>
> On Fri, Dec 15, 2017 at 7:39 PM, Bin.Cheng  wrote:
> > On Fri, Dec 15, 2017 at 1:19 PM, Richard Biener
> >  wrote:
> >> On Fri, Dec 15, 2017 at 1:35 PM, Bin.Cheng  wrote:
> >>> On Fri, Dec 15, 2017 at 12:09 PM, Bin.Cheng  wrote:
>  On Fri, Dec 15, 2017 at 11:55 AM, Richard Biener
>   wrote:
> > On Fri, Dec 15, 2017 at 12:30 PM, Bin Cheng  wrote:
> >> Hi,
> >> As explained in the PR, given below test case:
> >> int a[8][10] = { [2][5] = 4 }, c;
> >>
> >> int
> >> main ()
> >> {
> >>   short b;
> >>   int i, d;
> >>   for (b = 4; b >= 0; b--)
> >> for (c = 0; c <= 6; c++)
> >>   a[c + 1][b + 2] = a[c][b + 1];
> >>   for (i = 0; i < 8; i++)
> >> for (d = 0; d < 10; d++)
> >>   if (a[i][d] != (i == 3 && d == 6) * 4)
> >> __builtin_abort ();
> >>   return 0;
> >>
> >> the loop nest is illegal for vectorization without reversing inner 
> >> loop.  The issue
> >> is in data dependence checking of vectorizer, I believe the mentioned 
> >> revision just
> >> exposed this.  Previously the vectorization is skipped because of 
> >> unsupported memory
> >> operation.  The outer loop vectorization unrolls the outer loop into:
> >>
> >>   for (b = 4; b > 0; b -= 4)
> >>   {
> >> for (c = 0; c <= 6; c++)
> >>   a[c + 1][6] = a[c][5];
> >> for (c = 0; c <= 6; c++)
> >>   a[c + 1][5] = a[c][4];
> >> for (c = 0; c <= 6; c++)
> >>   a[c + 1][4] = a[c][3];
> >> for (c = 0; c <= 6; c++)
> >>   a[c + 1][3] = a[c][2];
> >>   }
> >> Then four inner loops are fused into:
> >>   for (b = 4; b > 0; b -= 4)
> >>   {
> >> for (c = 0; c <= 6; c++)
> >> {
> >>   a[c + 1][6] = a[c][5];  // S1
> >>   a[c + 1][5] = a[c][4];  // S2
> >>   a[c + 1][4] = a[c][3];
> >>   a[c + 1][3] = a[c][2];
> >> }
> >>   }
> >
> > Note that they are not really "fused" but they are interleaved.  With
> > GIMPLE in mind
> > that makes a difference, you should get the equivalent of
> >
> >for (c = 0; c <= 6; c++)
> >  {
> >tem1 = a[c][5];
> >tem2 = a[c][4];
> >tem3 = a[c][3];
> >tem4 = a[c][2];
> >a[c+1][6] = tem1;
> >a[c +1][5] = tem2;
> > a[c+1][4] = tem3;
> > a[c+1][3] = tem4;
> >  }
>  Yeah, I will double check if this abstract breaks the patch and how.
> >>> Hmm, I think this doesn't break it, well at least for part of the
> >>> analysis, because it is loop carried (backward) dependence goes wrong,
> >>> interleaving or not with the same iteration doesn't matter here.
> >>
> >> I think the idea is that forward dependences are always fine (negative 
> >> distance)
> >> to vectorize.  But with backward dependences we have to adhere to max_vf.
> >>
> >> It looks like for outer loop vectorization we only look at the distances 
> >> in the
> >> outer loop but never at inner ones.  But here the same applies but isn't 
> >> that
> >> independend on the distances with respect to the outer loop?
> >>
> >> But maybe I'm misunderstanding how "distances" work here.
> > Hmm, I am not sure I understand "distance" correctly.  With
> > description as in book like "Optimizing compilers for Modern
> > Architectures", distance is "# of iteration of sink ref - # of
> > iteration of source ref".  Given below example:
> >   for (i = 0; i < N; ++i)
> > {
> >   x = arr[idx_1];  // S1
> >   arr[idx_2] = x;  // S2
> > }
> > if S1 is source ref, distance = idx_2 - idx_1, and distance > 0.  Also
> > this is forward dependence.  For example, idx_1 is i + 1 and idx_2 is
> > i;
> > If S2 is source ref, distance = idx_1 - idx_2, and distance < 0.  Also
> > this is backward dependence.  For example idx_1 is i and idx_2 is i +
> > 1;
> >
> > In GCC, we always try to subtract idx_2 from idx_1 first in computing
> > classic distance, we could result in negative distance in case of
> > backward dependence.  When this happens at dependence carried level,
> > we manually reverse it.  When this happens at inner level loop, we
> > simply keep the negative distance.  And it's meaningless to talk about
> > forward/backward given dependence is carried at outer level loop.
> >
> > Outer loop vectorization is interesting.  The problematic case has
> > backward dependence carried by outer loop.  Because we don't check
> > dist vs. max_vf for such dep, the unrolled references could have outer
> > loop index equal to each other, as in a[c][5] and a[c+1][5].  So it's
> > like we have outer loop index resolved as equal.  Now it has
> > dependence only if c == c' + 1.  I think previous comment on fusion
> > still hold for this and we now need to check backward dependence
> > between the two reference at inner 

Re: [PATCH] Fix PR89505

2019-03-21 Thread Matthias Klose
On 26.02.19 15:06, Richard Biener wrote:
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk 
> sofar.

That was backported to the gcc-8 branch, and now Richard approved the backport
the gcc-7 branch.

Matthias


Re: [PATCH] make ggc pick up comp_dir_string() cache value. (dwarf2out.c)

2019-03-21 Thread Otto, Thomas
> > > > > > "ggc_collect() discarding/reusing remap_debug_filename() output,
> > > > > > thus producing invalid objects"
> > > > >
> > > > > Hmm, but AFAICS it can end up on the heap if plain get_src_pwd ()
> > > > > result survives.  I vaguely remember GC being happy with heap
> > > > > strings (due to identifiers?), but not sure.  Otherwise the patch 
> > > > > looks
> > > > > obvious enough.
> > > >
> > > > Good point, remap_filename can return the original filename pointer
> > > > (which can point to env["PWD"] or to an allocated string) or a ggc 
> > > > string.
> > > >
> > > > Since not freeing the string pointed to by a static variable is ok (as
> > > > getpwd does it), what about checking if remap_filename just returns
> > > > the input filename ptr again, and if not copy the ggc string into an
> > > > allocated buffer.
> > >
> > > I think we always want GC allocated storage here since the whole dwarf2out
> > > data structures live in GC memory.  That means unconditionally copying 
> > > there
> > > as is done inside the DWARF2_DIR_SHOULD_END_WITH_SEPARATOR path.
> >
> > Explicitly requesting gc storage (A or B below) doesn't seem to work. The 
> > mapped
> > wd string is not modified this time but is placed in a very different part 
> > of the
> > assembly, in .section .debug_str.dwo, and after assembling it does not even 
> > make it
> > into object, not even in a corrupted form as previously.
> 
> With the cache variable moved to file scope and marked GTY(())?  That
> sounds odd.

Looking at the generated gt-dwarf2out.h shows that only global scope variables 
can work 
since these are referenced directly, thus a file scope GTY(()) marker results 
in a compilation 
error.
It was my impression that the ggc_alloc...() functions also notify the gc 
mechanism 
explicitly, but that does not seem to be the case as the attached data is lost 
or at least 
misplaced as well.

The globally scoped static cache variable marked GTY(()) of course works - but 
what 
happens when a marked pointer gets assigned a non-gc pointer?

> The point is that the garbage collector shouldn't end up at heap
> allocated storage when walking GC allocated data and I see the DWARF 
> attribute 
> eventually using the result of this function has the char * pointer GTY(()) 
> marked.

Does this answer my question above? "Shouldn't" as in the GC will ignore a non 
gc heap 
pointer and not touch it, or it shouldn't as in bad things happen when it does?


And I still think this function and the static variable which never changes 
once set does not 
require any GC. Just setting the cached_wd variable to the unchanged pointer 
from 
get_src_pwd() or allocating one in the function itself is enough. This solves 
the problem 
and relieves the GC from the task of watching over a variable which lives 
forever anyhow.


Fix use of COMPLETE_TYPE_P for -Wstrict-aliasing=1

2019-03-21 Thread Richard Sandiford
The handling of -Wstrict-aliasing=1 applied COMPLETE_TYPE_P to the
pointer type rather than the pointer target, so missed the warnings
for "struct incomplete" in the testcase.

I couldn't find any existing C tests for -Wstrict-aliasing=1,
so I added a few extra tests besides the ones fixed by the patch.
I'm sure there's lots more we could test -- this is just supposed
to be better than the status quo (i.e. nothing).

Tested on aarch64-linux-gnu.  Not sure this is a regression in any
meaningful sense, so I guess it should wait for GCC 10.  OK for stage 1?

Richard


2019-03-21  Richard Sandiford  

gcc/c-family/
* c-warn.c (strict_aliasing_warning): Apply COMPLETE_TYPE_P to
the pointer target rather than the pointer itself.

gcc/testsuite/
* gcc.dg/alias-16.c: New test.

Index: gcc/c-family/c-warn.c
===
--- gcc/c-family/c-warn.c   2019-03-21 13:01:18.0 +
+++ gcc/c-family/c-warn.c   2019-03-21 13:01:18.453582721 +
@@ -746,7 +746,7 @@ strict_aliasing_warning (location_t loc,
 are not revealed at higher levels.  */
   alias_set_type set1 = get_alias_set (TREE_TYPE (otype));
   alias_set_type set2 = get_alias_set (TREE_TYPE (type));
-  if (!COMPLETE_TYPE_P (type)
+  if (!COMPLETE_TYPE_P (TREE_TYPE (type))
  || !alias_sets_must_conflict_p (set1, set2))
{
  warning_at (loc, OPT_Wstrict_aliasing,
Index: gcc/testsuite/gcc.dg/alias-16.c
===
--- /dev/null   2019-03-08 11:40:14.606883727 +
+++ gcc/testsuite/gcc.dg/alias-16.c 2019-03-21 13:01:18.453582721 +
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-Wstrict-aliasing=1 -fstrict-aliasing" } */
+
+struct incomplete;
+struct s1 { int i; };
+struct s2 { double d; };
+
+void
+f (int *i, double *d, struct s1 *s1, struct s2 *s2, char *c)
+{
+  (char *) i;
+  (char *) d;
+  (char *) s1;
+  (char *) s2;
+  (char *) c;
+
+  (int *) i;
+  (int *) d; /* { dg-warning "dereferencing type-punned pointer might break 
strict-aliasing rules" } */
+  (int *) s1; /* { dg-warning "dereferencing type-punned pointer might break 
strict-aliasing rules" } */
+  (int *) s2; /* { dg-warning "dereferencing type-punned pointer might break 
strict-aliasing rules" } */
+  (int *) c;
+
+  (double *) i; /* { dg-warning "dereferencing type-punned pointer might break 
strict-aliasing rules" } */
+  (double *) d;
+  (double *) s1; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (double *) s2; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (double *) c;
+
+  (struct incomplete *) i; /* { dg-warning "dereferencing type-punned pointer 
might break strict-aliasing rules" } */
+  (struct incomplete *) d; /* { dg-warning "dereferencing type-punned pointer 
might break strict-aliasing rules" } */
+  (struct incomplete *) s1; /* { dg-warning "dereferencing type-punned pointer 
might break strict-aliasing rules" } */
+  (struct incomplete *) s2; /* { dg-warning "dereferencing type-punned pointer 
might break strict-aliasing rules" } */
+  (struct incomplete *) c; /* { dg-warning "dereferencing type-punned pointer 
might break strict-aliasing rules" } */
+
+  (struct s1 *) i; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (struct s1 *) d; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (struct s1 *) s1;
+  (struct s1 *) s2; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (struct s1 *) c;
+
+  (struct s2 *) i; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (struct s2 *) d; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (struct s2 *) s1; /* { dg-warning "dereferencing type-punned pointer might 
break strict-aliasing rules" } */
+  (struct s2 *) s2;
+  (struct s2 *) c;
+}


Re: Fix use of COMPLETE_TYPE_P for -Wstrict-aliasing=1

2019-03-21 Thread Richard Biener
On Thu, Mar 21, 2019 at 2:04 PM Richard Sandiford
 wrote:
>
> The handling of -Wstrict-aliasing=1 applied COMPLETE_TYPE_P to the
> pointer type rather than the pointer target, so missed the warnings
> for "struct incomplete" in the testcase.
>
> I couldn't find any existing C tests for -Wstrict-aliasing=1,
> so I added a few extra tests besides the ones fixed by the patch.
> I'm sure there's lots more we could test -- this is just supposed
> to be better than the status quo (i.e. nothing).
>
> Tested on aarch64-linux-gnu.  Not sure this is a regression in any
> meaningful sense, so I guess it should wait for GCC 10.  OK for stage 1?

OK for stage 1.

Richard.

> Richard
>
>
> 2019-03-21  Richard Sandiford  
>
> gcc/c-family/
> * c-warn.c (strict_aliasing_warning): Apply COMPLETE_TYPE_P to
> the pointer target rather than the pointer itself.
>
> gcc/testsuite/
> * gcc.dg/alias-16.c: New test.
>
> Index: gcc/c-family/c-warn.c
> ===
> --- gcc/c-family/c-warn.c   2019-03-21 13:01:18.0 +
> +++ gcc/c-family/c-warn.c   2019-03-21 13:01:18.453582721 +
> @@ -746,7 +746,7 @@ strict_aliasing_warning (location_t loc,
>  are not revealed at higher levels.  */
>alias_set_type set1 = get_alias_set (TREE_TYPE (otype));
>alias_set_type set2 = get_alias_set (TREE_TYPE (type));
> -  if (!COMPLETE_TYPE_P (type)
> +  if (!COMPLETE_TYPE_P (TREE_TYPE (type))
>   || !alias_sets_must_conflict_p (set1, set2))
> {
>   warning_at (loc, OPT_Wstrict_aliasing,
> Index: gcc/testsuite/gcc.dg/alias-16.c
> ===
> --- /dev/null   2019-03-08 11:40:14.606883727 +
> +++ gcc/testsuite/gcc.dg/alias-16.c 2019-03-21 13:01:18.453582721 +
> @@ -0,0 +1,46 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Wstrict-aliasing=1 -fstrict-aliasing" } */
> +
> +struct incomplete;
> +struct s1 { int i; };
> +struct s2 { double d; };
> +
> +void
> +f (int *i, double *d, struct s1 *s1, struct s2 *s2, char *c)
> +{
> +  (char *) i;
> +  (char *) d;
> +  (char *) s1;
> +  (char *) s2;
> +  (char *) c;
> +
> +  (int *) i;
> +  (int *) d; /* { dg-warning "dereferencing type-punned pointer might break 
> strict-aliasing rules" } */
> +  (int *) s1; /* { dg-warning "dereferencing type-punned pointer might break 
> strict-aliasing rules" } */
> +  (int *) s2; /* { dg-warning "dereferencing type-punned pointer might break 
> strict-aliasing rules" } */
> +  (int *) c;
> +
> +  (double *) i; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (double *) d;
> +  (double *) s1; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (double *) s2; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (double *) c;
> +
> +  (struct incomplete *) i; /* { dg-warning "dereferencing type-punned 
> pointer might break strict-aliasing rules" } */
> +  (struct incomplete *) d; /* { dg-warning "dereferencing type-punned 
> pointer might break strict-aliasing rules" } */
> +  (struct incomplete *) s1; /* { dg-warning "dereferencing type-punned 
> pointer might break strict-aliasing rules" } */
> +  (struct incomplete *) s2; /* { dg-warning "dereferencing type-punned 
> pointer might break strict-aliasing rules" } */
> +  (struct incomplete *) c; /* { dg-warning "dereferencing type-punned 
> pointer might break strict-aliasing rules" } */
> +
> +  (struct s1 *) i; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (struct s1 *) d; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (struct s1 *) s1;
> +  (struct s1 *) s2; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (struct s1 *) c;
> +
> +  (struct s2 *) i; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (struct s2 *) d; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (struct s2 *) s1; /* { dg-warning "dereferencing type-punned pointer might 
> break strict-aliasing rules" } */
> +  (struct s2 *) s2;
> +  (struct s2 *) c;
> +}


Re: [PATCH] make ggc pick up comp_dir_string() cache value. (dwarf2out.c)

2019-03-21 Thread Richard Biener
On Thu, Mar 21, 2019 at 1:38 PM Otto, Thomas  wrote:
>
> > > > > > > "ggc_collect() discarding/reusing remap_debug_filename() output,
> > > > > > > thus producing invalid objects"
> > > > > >
> > > > > > Hmm, but AFAICS it can end up on the heap if plain get_src_pwd ()
> > > > > > result survives.  I vaguely remember GC being happy with heap
> > > > > > strings (due to identifiers?), but not sure.  Otherwise the patch 
> > > > > > looks
> > > > > > obvious enough.
> > > > >
> > > > > Good point, remap_filename can return the original filename pointer
> > > > > (which can point to env["PWD"] or to an allocated string) or a ggc 
> > > > > string.
> > > > >
> > > > > Since not freeing the string pointed to by a static variable is ok (as
> > > > > getpwd does it), what about checking if remap_filename just returns
> > > > > the input filename ptr again, and if not copy the ggc string into an
> > > > > allocated buffer.
> > > >
> > > > I think we always want GC allocated storage here since the whole 
> > > > dwarf2out
> > > > data structures live in GC memory.  That means unconditionally copying 
> > > > there
> > > > as is done inside the DWARF2_DIR_SHOULD_END_WITH_SEPARATOR path.
> > >
> > > Explicitly requesting gc storage (A or B below) doesn't seem to work. The 
> > > mapped
> > > wd string is not modified this time but is placed in a very different 
> > > part of the
> > > assembly, in .section .debug_str.dwo, and after assembling it does not 
> > > even make it
> > > into object, not even in a corrupted form as previously.
> >
> > With the cache variable moved to file scope and marked GTY(())?  That
> > sounds odd.
>
> Looking at the generated gt-dwarf2out.h shows that only global scope 
> variables can work
> since these are referenced directly, thus a file scope GTY(()) marker results 
> in a compilation
> error.
> It was my impression that the ggc_alloc...() functions also notify the gc 
> mechanism
> explicitly, but that does not seem to be the case as the attached data is 
> lost or at least
> misplaced as well.
>
> The globally scoped static cache variable marked GTY(()) of course works - 
> but what
> happens when a marked pointer gets assigned a non-gc pointer?
>
> > The point is that the garbage collector shouldn't end up at heap
> > allocated storage when walking GC allocated data and I see the DWARF 
> > attribute
> > eventually using the result of this function has the char * pointer GTY(()) 
> > marked.
>
> Does this answer my question above? "Shouldn't" as in the GC will ignore a 
> non gc heap
> pointer and not touch it, or it shouldn't as in bad things happen when it 
> does?

I believe bad things may happen.  But it seems that at least for
strings we do know a mix
of heap/GC allocation gracefully.

> And I still think this function and the static variable which never changes 
> once set does not
> require any GC. Just setting the cached_wd variable to the unchanged pointer 
> from
> get_src_pwd() or allocating one in the function itself is enough. This solves 
> the problem
> and relieves the GC from the task of watching over a variable which lives 
> forever anyhow.

Yes, that works as well then.

Richard.


Re: [PATCH] Restore defensive check in tree_nop_conversion (also PR89779 "fix")

2019-03-21 Thread Richard Biener
On Thu, 21 Mar 2019, Richard Biener wrote:

> 
> This also avoids the ICE in PR89779 but IMHO is not a real fix.
> 
> Still it restores a previously active check against released SSA names
> which now have error_mark_node type rather than NULL.  The new way
> opens up consolidation so I've adjusted tree_nop_conversion plus
> operand_equal_p which got a defensive check at the same time
> (and checks for error_mark_node right before that check).
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> 
> The testcase comes with the other patch (but still included below).

Ugh, looks like my defense installed for released SSA names is now
relied upon by the C++ FE via

#2  0x01864a96 in tree_strip_nop_conversions (
exp=)
at /space/rguenther/src/svn/trunk2/gcc/tree.c:12848
#3  0x00ab2d2f in iterative_hash_template_arg (
arg=, val=3363282904)
at /space/rguenther/src/svn/trunk2/gcc/cp/pt.c:1751

for

 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x768985e8 precision:32 min  
max 
pointer_to_this >
   
arg:0 >
arg:1 >>>

where both the SCOPE_REF and TEMPLATE_TYPE_PARM have NULL
TREE_TYPE.

So I'll restore the NULL check in tree_nop_conversion.

But since the C++ FE overloads almost every tree field possible
I wonder whether that STRIP_NOPS in iterative_hash_template_arg
is a good idea?

Also I hope the FE doesn't call into operand_equal_p with such
typed trees.

Updated patch below, retesting now.

Thanks,
Richard.

2019-03-21  Richard Biener  

PR tree-optimization/89779
* tree.c (tree_nop_conversion): Consolidate and fix defensive
checks with respect to released SSA names now having error_mark_node
type.
* fold-const.c (operand_equal_p): Likewise.

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 269832)
+++ gcc/tree.c  (working copy)
@@ -12812,13 +12812,10 @@ tree_nop_conversion (const_tree exp)
   if (!CONVERT_EXPR_P (exp)
   && TREE_CODE (exp) != NON_LVALUE_EXPR)
 return false;
-  if (TREE_OPERAND (exp, 0) == error_mark_node)
-return false;
 
   outer_type = TREE_TYPE (exp);
   inner_type = TREE_TYPE (TREE_OPERAND (exp, 0));
-
-  if (!inner_type)
+  if (!inner_type || inner_type == error_mark_node)
 return false;
 
   return tree_nop_conversion_p (outer_type, inner_type);
Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 269832)
+++ gcc/fold-const.c(working copy)
@@ -2973,11 +2973,6 @@ operand_equal_p (const_tree arg0, const_
   || TREE_TYPE (arg1) == error_mark_node)
 return 0;
 
-  /* Similar, if either does not have a type (like a released SSA name), 
- they aren't equal.  */
-  if (!TREE_TYPE (arg0) || !TREE_TYPE (arg1))
-return 0;
-
   /* We cannot consider pointers to different address space equal.  */
   if (POINTER_TYPE_P (TREE_TYPE (arg0))
   && POINTER_TYPE_P (TREE_TYPE (arg1))
Index: gcc/testsuite/gcc.dg/torture/pr89779.c
===
--- gcc/testsuite/gcc.dg/torture/pr89779.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr89779.c  (working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+typedef int a;
+void h(a);
+void c(a *d, int b)
+{
+  int e, f, g;
+  for (; e; e++)
+for (f = 0; f < 4; f++)
+  if (d)
+   for (g = e + 1; g; g++)
+ h(d[g]);
+}
+void i()
+{
+  a *j;
+  int k, l;
+  for (; k; k++)
+c(j, l);
+}



[PATCH] In C++17 should not put special functions in global namespace

2019-03-21 Thread Jonathan Wakely

IS 29124 8.2 [sf.mathh] says that  should add the names of the
special functions to the global namespace.  However, C++17 Annex D
[depr.c.headers] excludes those functions explicitly, so they should not
be placed in the global namespace unconditionally for C++17.

Only add them to the global namespace when IS 29124 is explicitly
requested via the __STDCPP_WANT_MATH_SPEC_FUNCS__ macro.

* include/c_compatibility/math.h [!__STDCPP_WANT_MATH_SPEC_FUNCS__]
(assoc_laguerre, assoc_laguerref, assoc_laguerrel, assoc_legendre)
(assoc_legendref, assoc_legendrel, beta, betaf, betal, comp_ellint_1)
(comp_ellint_1f, comp_ellint_1l, comp_ellint_2, comp_ellint_2f)
(comp_ellint_2l, comp_ellint_3, comp_ellint_3f, comp_ellint_3l)
(cyl_bessel_i, cyl_bessel_if, cyl_bessel_il, cyl_bessel_j)
(cyl_bessel_jf, cyl_bessel_jl, cyl_bessel_k, cyl_bessel_kf)
(cyl_bessel_kl, cyl_neumann, cyl_neumannf, cyl_neumannl, ellint_1)
(ellint_1f, ellint_1l, ellint_2, ellint_2f, ellint_2l, ellint_3)
(ellint_3f, ellint_3l, expint, expintf, expintl, hermite, hermitef)
(hermitel, laguerre, laguerref, laguerrel, legendre, legendref)
(legendrel, riemann_zeta, riemann_zetaf, riemann_zetal, sph_bessel)
(sph_besself, sph_bessell, sph_legendre, sph_legendref, sph_legendrel)
(sph_neumann, sph_neumannf, sph_neumannl): Only add using-declarations
when the special functions IS is enabled, not for C++17.
* testsuite/26_numerics/headers/cmath/functions_global_c++17.cc:
Replace with ...
* testsuite/26_numerics/headers/cmath/functions_global.cc: New test,
without checks for special functions in C++17.
* testsuite/26_numerics/headers/cmath/special_functions_global.cc:
New test.

Tested powerpc64le-linux, committed to trunk.

The special functions should never have been added to the global
namespace for C++17, so this is a regression on trunk and
gcc-8-branch. That means the fix needs to be backported too.

commit dfaf42512aed1ff2d360abfc4802e266c6724ed6
Author: Jonathan Wakely 
Date:   Thu Mar 21 12:41:57 2019 +

In C++17  should not put special functions in global namespace

IS 29124 8.2 [sf.mathh] says that  should add the names of the
special functions to the global namespace.  However, C++17 Annex D
[depr.c.headers] excludes those functions explicitly, so they should not
be placed in the global namespace unconditionally for C++17.

Only add them to the global namespace when IS 29124 is explicitly
requested via the __STDCPP_WANT_MATH_SPEC_FUNCS__ macro.

* include/c_compatibility/math.h [!__STDCPP_WANT_MATH_SPEC_FUNCS__]
(assoc_laguerre, assoc_laguerref, assoc_laguerrel, assoc_legendre)
(assoc_legendref, assoc_legendrel, beta, betaf, betal, 
comp_ellint_1)
(comp_ellint_1f, comp_ellint_1l, comp_ellint_2, comp_ellint_2f)
(comp_ellint_2l, comp_ellint_3, comp_ellint_3f, comp_ellint_3l)
(cyl_bessel_i, cyl_bessel_if, cyl_bessel_il, cyl_bessel_j)
(cyl_bessel_jf, cyl_bessel_jl, cyl_bessel_k, cyl_bessel_kf)
(cyl_bessel_kl, cyl_neumann, cyl_neumannf, cyl_neumannl, ellint_1)
(ellint_1f, ellint_1l, ellint_2, ellint_2f, ellint_2l, ellint_3)
(ellint_3f, ellint_3l, expint, expintf, expintl, hermite, hermitef)
(hermitel, laguerre, laguerref, laguerrel, legendre, legendref)
(legendrel, riemann_zeta, riemann_zetaf, riemann_zetal, sph_bessel)
(sph_besself, sph_bessell, sph_legendre, sph_legendref, 
sph_legendrel)
(sph_neumann, sph_neumannf, sph_neumannl): Only add 
using-declarations
when the special functions IS is enabled, not for C++17.
* testsuite/26_numerics/headers/cmath/functions_global_c++17.cc:
Replace with ...
* testsuite/26_numerics/headers/cmath/functions_global.cc: New test,
without checks for special functions in C++17.
* testsuite/26_numerics/headers/cmath/special_functions_global.cc:
New test.

diff --git a/libstdc++-v3/include/c_compatibility/math.h 
b/libstdc++-v3/include/c_compatibility/math.h
index d9fe94ca26e..7e8dab03e41 100644
--- a/libstdc++-v3/include/c_compatibility/math.h
+++ b/libstdc++-v3/include/c_compatibility/math.h
@@ -111,7 +111,9 @@ using std::tgamma;
 using std::trunc;
 #endif // C++11 && _GLIBCXX_USE_C99_MATH_TR1
 
-#if _GLIBCXX_USE_STD_SPEC_FUNCS
+// The mathematical special functions are only added to the global namespace
+// by IS 29124, but not by C++17.
+#if __cplusplus >= 201103L && __STDCPP_WANT_MATH_SPEC_FUNCS__ != 0
 using std::assoc_laguerref;
 using std::assoc_laguerrel;
 using std::assoc_laguerre;
diff --git 
a/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_global.cc 
b/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_global.cc
new file mode 100644
index 

Re: [PATCH] Allow lazy construction of hash_{table,set,map}

2019-03-21 Thread Jason Merrill

On 3/21/19 5:03 AM, Jakub Jelinek wrote:

On Thu, Mar 21, 2019 at 09:28:06AM +0100, Richard Biener wrote:

Well, this is surely more costly in that it allocates both the hash_table
structure and the memory pointed by it from GC.


You could use

   char constexpr_fundef_table[sizeof(hash-table)];
   bool constexpr_fundef_table_initialized_p = false;

   if (!constexpr_fundef_table_initialized_p)
 new (constexpr_fundef_table) hash-table (...);

and hide this pattern behind a new RAII class?


Do we have a portable C++98 way to make sure the buffer has proper alignment
though?  Plus we'd need to pass around this auto_hash_set instead of
hash_set, so from the usage POV it would be like the following patches
where the functionality is embedded in hash_set instantiations itself with a
special new template parameter.


I guess another option would be to make the decision whether
the hash_{table,set,map} is constructed with allocation right away (and no
runtime checks for it) or if it is lazy another template argument (bool
Lazy = false before the Allocator template argument), if !Lazy, it would
work exactly as it is now, if Lazy it would not allocate it in the ctor
and where needed would add m_entries checks.


That's a possibility, but what about the above RAII trick?  Not sure
if there's a way to avoid the char[] use and instead use "unconstructed"
properly typed storage.



Attached (so far without changelog, selftest additions and only tested with
make -j32 -k check-c++-all RUNTESTFLAGS="dg.exp='pr89767.C *lambda* *desig*'"
) is
1) hash_{table,set}  implementation for lazy allocation


I would expect that a single test on an object that's already in cache 
in the context of hash table operations would be insignificant, but if 
you really want to offer that choice this seems like a clean enough way 
to implement it.  Are you sure that we want the default to be immediate 
allocation?



2) the PR c++/89767 patch with optimization for zero and one entries
3) incremental PR c++/71446 fix to simplify that code using this new
infrastructure


2 and 3 are explicitly specifying "false" for the Lazy parameter, which 
is the default.


We might do the duplicate checking in a wrapper around add_capture since 
it's only used by cp_parser_lambda_introducer.  It's fine either way.


Jason


[PATCH, RFC] Wrong warning message fix for gcc 9

2019-03-21 Thread Roman Zhuykov
Hello, I have found a minor diagnostic issue.

Since r262910 we got a little odd error messages in cases like this:
$ gcc -flto-compression-level=2-O2 -c empty.c
gcc: error: argument to '-flto-compression-level=' is not between 0 and 9
$ g++ -fabi-version=2-O2 -c empty.cpp
cc1plus: error: '-fabi-version=1' is no longer supported

Old messages were more correct:
$ gcc-8 -flto-compression-level=2-O2 -c empty.c
gcc: error: argument to '-flto-compression-level=' should be a
non-negative integer
$ g++-8 -fabi-version=2-O2 -c empty.cpp
g++: error: argument to '-fabi-version=' should be a non-negative integer

When UInteger option value string is not a number, but it's first char
is a digit, integral_argument function returns -1 without setting
*err.

Proposed untested patch attached.

--
Best Regards,
Roman Zhuykov

gcc/ChangeLog:

2019-03-21  Roman Zhuykov  

* opts-common.c (integral_argument): Set errno properly in one case.

diff --git a/gcc/opts-common.c b/gcc/opts-common.c
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -205,8 +205,10 @@ integral_argument (const char *arg, int *err,
bool byte_size_suffix)
value = strtoull (arg, &end, 0);
if (*end)
  {
-   /* errno is most likely EINVAL here.  */
-   *err = errno;
+   if (errno)
+ *err = errno;
+   else
+ *err = EINVAL;
return -1;
  }


Go patch committed: Add a newline to function receiver in type debug dump

2019-03-21 Thread Ian Lance Taylor
This Go frontend patch by Cherry Zhang fixes the type debug dump code
to add a newline after the function receiver type.  Bootstrapped and
ran Go tests on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 269811)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-6e5ff227d4e77d340e86bd2c5e045d5532c2d7d7
+392e9b3da473070f24dbe6c12c282a0e06e73b54
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/ast-dump.cc
===
--- gcc/go/gofrontend/ast-dump.cc   (revision 269805)
+++ gcc/go/gofrontend/ast-dump.cc   (working copy)
@@ -766,7 +766,7 @@ void Type_dumper::visit_function_type(co
   if (rec != NULL)
 {
   this->emitpre(notag, NULL);
-  this->typeref("receiver ", rec->type(), NULL);
+  this->typeref("receiver ", rec->type(), "\n");
 }
   const Typed_identifier_list* parameters = ft->parameters();
   if (parameters != NULL)


Re: [PATCH] Allow lazy construction of hash_{table,set,map}

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 10:25:25AM -0400, Jason Merrill wrote:
> > Attached (so far without changelog, selftest additions and only tested with
> > make -j32 -k check-c++-all RUNTESTFLAGS="dg.exp='pr89767.C *lambda* 
> > *desig*'"
> > ) is
> > 1) hash_{table,set}  implementation for lazy allocation
> 
> I would expect that a single test on an object that's already in cache in
> the context of hash table operations would be insignificant, but if you

Some hash table operations are in performance sensitive code and even if
those checks wouldn't take too long, it would grow I-cache.

> really want to offer that choice this seems like a clean enough way to
> implement it.  Are you sure that we want the default to be immediate
> allocation?

There are many that do want immediate allocation, so changing the default
looks way too risky to me.

> > 2) the PR c++/89767 patch with optimization for zero and one entries
> > 3) incremental PR c++/71446 fix to simplify that code using this new
> > infrastructure
> 
> 2 and 3 are explicitly specifying "false" for the Lazy parameter, which is
> the default.

Ugh, thanks for spotting that.  Tests passed with that, and I only went
through in a debugger the earlier version with EMPTY_HASH.

> We might do the duplicate checking in a wrapper around add_capture since
> it's only used by cp_parser_lambda_introducer.  It's fine either way.

Looks like a good idea to me.

Attached are new versions of all the 3 patches.
1) now has ChangeLog and hash-set-selftest.c coverage
2) has the false -> true fixed
3) ditto, but furthermore is moved out of add_capture into the lambda
introducer parsing routine only, tests for [this, this] and [this, *this]
etc. are done using LAMBDA_EXPR_THIS_CAPTURE and the duplicate check for
capture_id is simplified too.

Jakub
2019-03-21  Jakub Jelinek  

* hash-table.h (hash_table): Add Lazy template parameter defaulted
to false, if true, don't alloc_entries during construction, but defer
it to the first method that needs m_entries allocated.
(hash_table::hash_table, hash_table::~hash_table,
hash_table::alloc_entries, hash_table::find_empty_slot_for_expand,
hash_table::too_empty_p, hash_table::expand, hash_table::empty_slow,
hash_table::clear_slot, hash_table::traverse_noresize,
hash_table::traverse, hash_table::iterator::slide): Adjust all methods.
* hash-set.h (hash_set): Add Lazy template parameter defaulted to
false.
(hash_set::contains): If Lazy is true, use find_slot_with_hash with
NO_INSERT instead of find_with_hash.
(hash_set::traverse, hash_set::iterator, hash_set::iterator::m_iter,
hash_set::m_table): Add Lazy to template params of hash_table.
(gt_ggc_mx, gt_pch_nx): Use false as Lazy in hash_set template param.
* attribs.c (test_attribute_exclusions): Likewise.
* hash-set-tests.c (test_set_of_strings): Add iterator tests for
hash_set.  Add tests for hash_set with Lazy = true.
c-family/
* c-common.c (per_file_includes_t): Use false as Lazy in hash_set
template param.
jit/
* jit-recording.c (reproducer::m_set_identifiers): Use false as Lazy
in hash_set template param.

--- gcc/hash-table.h.jj 2019-03-14 23:46:28.593616750 +0100
+++ gcc/hash-table.h2019-03-21 09:27:19.722074003 +0100
@@ -167,6 +167,15 @@ along with GCC; see the file COPYING3.
See hash_table for details.  The interface is very similar to libiberty's
htab_t.
 
+   If a hash table is used only in some rare cases, it is possible
+   to construct the hash_table lazily before first use.  This is done
+   through:
+
+  hash_table  some_type_hash_table;
+
+   which will cause whatever methods actually need the allocated entries
+   array to allocate it later.
+
 
EASY DESCRIPTORS FOR POINTERS
 
@@ -241,7 +250,7 @@ along with GCC; see the file COPYING3.
 #include "hash-map-traits.h"
 
 template class hash_map;
-template class hash_set;
+template class hash_set;
 
 /* The ordinary memory allocator.  */
 /* FIXME (crowl): This allocator may be extracted for wider sharing later.  */
@@ -353,8 +362,8 @@ class mem_usage;
  hash table code.
 
 */
-template  class Allocator = xcallocator>
+template  class Allocator = xcallocator>
 class hash_table
 {
   typedef typename Descriptor::value_type value_type;
@@ -422,7 +431,7 @@ public:
  write the value you want into the returned slot.  When inserting an
  entry, NULL may be returned if memory allocation fails. */
   value_type *find_slot_with_hash (const compare_type &comparable,
-   hashval_t hash, enum insert_option insert);
+  hashval_t hash, enum insert_option insert);
 
   /* This function deletes an element with the given COMPARABLE value
  from hash table starting with the given HASH.  If there is no
@@ -472,6 +481,8 @@ public:
 
   iterator begin () const

Re: [PATCH] Allow lazy construction of hash_{table,set,map}

2019-03-21 Thread Jason Merrill

On 3/21/19 1:48 PM, Jakub Jelinek wrote:

On Thu, Mar 21, 2019 at 10:25:25AM -0400, Jason Merrill wrote:

Attached (so far without changelog, selftest additions and only tested with
make -j32 -k check-c++-all RUNTESTFLAGS="dg.exp='pr89767.C *lambda* *desig*'"
) is
1) hash_{table,set}  implementation for lazy allocation


I would expect that a single test on an object that's already in cache in
the context of hash table operations would be insignificant, but if you


Some hash table operations are in performance sensitive code and even if
those checks wouldn't take too long, it would grow I-cache.


really want to offer that choice this seems like a clean enough way to
implement it.  Are you sure that we want the default to be immediate
allocation?


There are many that do want immediate allocation, so changing the default
looks way too risky to me.


2) the PR c++/89767 patch with optimization for zero and one entries
3) incremental PR c++/71446 fix to simplify that code using this new
 infrastructure


2 and 3 are explicitly specifying "false" for the Lazy parameter, which is
the default.


Ugh, thanks for spotting that.  Tests passed with that, and I only went
through in a debugger the earlier version with EMPTY_HASH.


We might do the duplicate checking in a wrapper around add_capture since
it's only used by cp_parser_lambda_introducer.  It's fine either way.


Looks like a good idea to me.

Attached are new versions of all the 3 patches.
1) now has ChangeLog and hash-set-selftest.c coverage
2) has the false -> true fixed
3) ditto, but furthermore is moved out of add_capture into the lambda
introducer parsing routine only, tests for [this, this] and [this, *this]
etc. are done using LAMBDA_EXPR_THIS_CAPTURE and the duplicate check for
capture_id is simplified too.


OK.

Jason



Re: [PATCH] Restore defensive check in tree_nop_conversion (also PR89779 "fix")

2019-03-21 Thread Jason Merrill

On 3/21/19 9:43 AM, Richard Biener wrote:

On Thu, 21 Mar 2019, Richard Biener wrote:



This also avoids the ICE in PR89779 but IMHO is not a real fix.

Still it restores a previously active check against released SSA names
which now have error_mark_node type rather than NULL.  The new way
opens up consolidation so I've adjusted tree_nop_conversion plus
operand_equal_p which got a defensive check at the same time
(and checks for error_mark_node right before that check).

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

The testcase comes with the other patch (but still included below).


Ugh, looks like my defense installed for released SSA names is now
relied upon by the C++ FE via

#2  0x01864a96 in tree_strip_nop_conversions (
 exp=)
 at /space/rguenther/src/svn/trunk2/gcc/tree.c:12848
#3  0x00ab2d2f in iterative_hash_template_arg (
 arg=, val=3363282904)
 at /space/rguenther/src/svn/trunk2/gcc/cp/pt.c:1751

for

  
 unit-size 
 align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x768985e8 precision:32 min 
max 
 pointer_to_this >

 arg:0 
 arg:0 >
 arg:1 >>>

where both the SCOPE_REF and TEMPLATE_TYPE_PARM have NULL
TREE_TYPE.

So I'll restore the NULL check in tree_nop_conversion.

But since the C++ FE overloads almost every tree field possible
I wonder whether that STRIP_NOPS in iterative_hash_template_arg
is a good idea?


I think it still makes sense; TREE_TYPE of an expression is either a 
type, error_mark_node, or NULL_TREE (for type-dependent expressions in a 
template).



Also I hope the FE doesn't call into operand_equal_p with such
typed trees.


It shouldn't, we have cp_tree_equal to use instead.

Jason


Re: [C++ debug PATCH] [PR88534] accept VAR_DECL in class literal template parms

2019-03-21 Thread Jason Merrill

On 3/20/19 6:06 PM, Marek Polacek wrote:

On Wed, Mar 20, 2019 at 10:58:32PM +0100, Jakub Jelinek wrote:

On Wed, Mar 20, 2019 at 05:55:04PM -0400, Marek Polacek wrote:

On Wed, Mar 20, 2019 at 04:56:33PM -0300, Alexandre Oliva wrote:

On Mar 20, 2019, Marek Polacek  wrote:


This test fails with
pr88534.C:58:1: sorry, unimplemented: string literal in function template 
signature


Interesting...  gcc-8 rejected it with an error message rejecting the
template parameter, but my latest trunk build (dated Mar 13, r269641)
compiles it all right.  Was there a subsequent fix, maybe?  I didn't
realize it was supposed to be rejected.


Ah, that problem only started with r269814, namely this hunk:


Maybe this is done too early and should be postponed to genericization
(perhaps except for TREE_STATIC vars)?


Or skip when DECL is template_parm_object_p.


Or handle it in mangle.c.  I don't see any reason we shouldn't accept

struct A
{
  char c[4];
};

template  struct B { };
B b;

Probably we should use the same mangling whether the initializer for c 
was spelled as a string literal or list of integers.


The thing we still don't want to allow is mangling the *address* of a 
string literal.


Jason


Re: [fortran, patch] IEEE intrinsic modules (ping)

2019-03-21 Thread Thomas Schwinge
Hi!

On Thu, 28 Feb 2019 20:22:08 +0100, I wrote:
> While looking for something else -- isn't that always how it happens ;-)
> -- I noticed one thing here:
> 
> On Wed, 25 Jun 2014 01:41:02 +0200, FX  wrote:
> > I’ll wait a few more days to commit, so others can comment/review and I am 
> > sure to be around if there is fallout.
> 
> (This got committed to trunk in r212102.)
> 
> > --- gcc/testsuite/gfortran.dg/ieee/ieee.exp (revision 0)
> > +++ gcc/testsuite/gfortran.dg/ieee/ieee.exp (revision 0)
> > @@ -0,0 +1,59 @@
> > +[...]
> > +global DEFAULT_FFLAGS
> > +if ![info exists DEFAULT_FFLAGS] then {
> > +set DEFAULT_FFLAGS ""
> > +}
> > +[...]
> 
> Per my understanding of DejaGnu (and please correct me if that's wrong),
> in the same 'runtest' instance, 'global' variables persist from one
> '*.exp' file to another.  (Which is something debatable, in my
> opinion...)
> 
> All other '*.exp' files that back then did define 'DEFAULT_FFLAGS' (using
> this same construct as shown above), and it's still the same now, are
> using " -pedantic-errors" instead of the empty string.  Thus this setting
> of 'DEFAULT_FFLAGS' is not idempotent, depends on whether
> 'gfortran.dg/ieee/ieee.exp', or an other defining '*.exp' file is
> executed first.
> 
> By default, first comes 'gfortran.dg/coarray/caf.exp' (nowadays, did not
> yet exist back then), then 'gfortran.dg/dg.exp', then
> 'gfortran.dg/ieee/ieee.exp'.  (And, sometimes also
> 'gcc.target/powerpc/ppc-fortran/ppc-fortran.exp'.)
> 
> And, as I just noticed, 'runtest' seems to always sort the specified
> '*.exp' files (?!), so even when you invoke something like
> "check-gcc-fortran RUNTESTFLAGS='ieee.exp dg.exp'" to try to provoke some
> regressions to appear, you'd still get 'dg.exp' executed first.  The
> empty string setting in 'ieee.exp' was never really active -- only if
> executed on its own, etc.
> 
> Fortunately, 'ieee.exp' seems to behave the same way whether running with
> or without '-pedantic-errors', so I propose to simply unify that setting,
> see attached.

I convinced myself that this is the right thing to do, and committed
"[testsuite, Fortran] Consistently set 'DEFAULT_FFLAGS'" to trunk in
r269845, to gcc-8-branch in r269846, and to gcc-7-branch in r269847, see
attached.


Grüße
 Thomas


>From c1769f9f2a8314e610c7a3534ee8fc74fe2c8c60 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 18:54:50 +
Subject: [PATCH] [testsuite, Fortran] Consistently set 'DEFAULT_FFLAGS'

In the same 'runtest' instance, 'global' variables persist from one '*.exp'
file to another.

All other '*.exp' files are using " -pedantic-errors" instead of the empty
string as the default for 'DEFAULT_FFLAGS'.  Thus this setting of
'DEFAULT_FFLAGS' is not idempotent, depends on whether
'gfortran.dg/ieee/ieee.exp', or an other defining '*.exp' file is executed
first.

	gcc/testsuite/
	PR fortran/29383
	* gfortran.dg/ieee/ieee.exp (DEFAULT_FFLAGS): Set the same as in
	other '*.exp' files.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269845 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog |  6 ++
 gcc/testsuite/gfortran.dg/ieee/ieee.exp | 10 +-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index c8f9492130e1..914ba7237033 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2019-03-21  Thomas Schwinge  
+
+	PR fortran/29383
+	* gfortran.dg/ieee/ieee.exp (DEFAULT_FFLAGS): Set the same as in
+	other '*.exp' files.
+
 2019-03-21  Richard Biener  
 
 	PR tree-optimization/89779
diff --git a/gcc/testsuite/gfortran.dg/ieee/ieee.exp b/gcc/testsuite/gfortran.dg/ieee/ieee.exp
index 05383ce94331..68d4b7816144 100644
--- a/gcc/testsuite/gfortran.dg/ieee/ieee.exp
+++ b/gcc/testsuite/gfortran.dg/ieee/ieee.exp
@@ -22,15 +22,15 @@
 load_lib gfortran-dg.exp
 load_lib target-supports.exp
 
-# Initialize `dg'.
-dg-init
-
-# Flags specified in each test
+# If a testcase doesn't have special options, use these.
 global DEFAULT_FFLAGS
 if ![info exists DEFAULT_FFLAGS] then {
-set DEFAULT_FFLAGS ""
+set DEFAULT_FFLAGS " -pedantic-errors"
 }
 
+# Initialize `dg'.
+dg-init
+
 # Flags for finding the IEEE modules
 if [info exists TOOL_OPTIONS] {
set specpath [get_multilibs ${TOOL_OPTIONS}]
-- 
2.17.1

>From e18146b75cb782483c996bf58b96a40f622715a1 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 18:57:39 +
Subject: [PATCH] [testsuite, Fortran] Consistently set 'DEFAULT_FFLAGS'

In the same 'runtest' instance, 'global' variables persist from one '*.exp'
file to another.

All other '*.exp' files are using " -pedantic-errors" instead of the empty
string as the default for 'DEFAULT_FFLAGS'.  Thus this setting of
'DEFAULT_FFLAGS' is not idempotent, depends on whether
'gfortran.dg/ieee/ieee.exp', or an other defining '*.exp' file is executed
first.

	gcc/testsuite/
	PR fortran/29383
	* gfortran.dg/ieee/ieee.exp (DEFAUL

Re: [Patch, Fortran + Testsuite] Fix coarray handling in modules

2019-03-21 Thread Thomas Schwinge
Hi!

On Thu, 28 Feb 2019 18:02:32 +0100, Thomas Schwinge  
wrote:
> On Fri, 02 Jan 2015 12:28:10 +0100, Tobias Burnus  wrote:
> > [...]
> > 
> > I additionally propagated the dg-compile-aux-modules support to caf.dg 
> 
> That got committed in r219143:
> 
> > --- a/gcc/testsuite/gfortran.dg/coarray/caf.exp
> > +++ b/gcc/testsuite/gfortran.dg/coarray/caf.exp
> > @@ -43,6 +43,21 @@ global DG_TORTURE_OPTIONS torture_with_loops
> >  torture-init
> >  set-torture-options $DG_TORTURE_OPTIONS
> >  
> > +global gfortran_test_path
> > +global gfortran_aux_module_flags
> > +set gfortran_test_path $srcdir/$subdir
> > +set gfortran_aux_module_flags $DEFAULT_FFLAGS
> > +proc dg-compile-aux-modules { args } {
> > +global gfortran_test_path
> > +global gfortran_aux_module_flags
> > +if { [llength $args] != 2 } {
> > +   error "dg-set-target-env-var: needs one argument"
> > +   return
> > +}
> > +dg-test $gfortran_test_path/[lindex $args 1] "" 
> > $gfortran_aux_module_flags
> > +# cleanup-modules isn't intentionally invoked here.
> > +}
> 
> I just noticed that this copy is missing Jakub's r215293 changes that he
> had applied a few months *earlier* to the master copy of
> 'dg-compile-aux-modules', in 'gcc/testsuite/gfortran.dg/dg.exp', see
> attached, and/or
> .
> I can't easily test it with the affected DejaGnu version 1.4.4 (which is
> still the minimum version required now), but it tests fine with DejaGnu
> 1.5, and seems straight-forward, so I propose I commit "as obvious" these
> changes to the 'gcc/testsuite/gfortran.dg/coarray/caf.exp' copy, too?
> (It's exactly these changes missing to make the two copies identical.)

For testing GCC with a custom DejaGnu installation, you apparently just
have to put its installation directory into 'PATH'.

Testing with DejaGnu 1.4.4 (specifically, the 'dejagnu-1.4.4-release' Git
tag), one runs into:

cannot trap SIGSEGV
while executing
"trap "send_error \"got a \[trap -name\] signal, $str \\n\"; log_and_exit;" 
$signal"
("foreach" body line 4)
invoked from within
"foreach sig "{SIGTERM {terminated}}  {SIGINT  {interrupted by user}}  
{SIGQUIT {interrupted by user}}  {SIGSEGV {segmentation violation}}" {
set sign..."
invoked from within
"if ![exp_debug] {
foreach sig "{SIGTERM {terminated}} \
 {SIGINT  {interrupted by user}} \
 {SIGQUIT {interrupted by user}..."
(file "[...]/share/dejagnu/runtest.exp" line 1503)

This is an incompatibility with "recent" versions of 'expect'; see
, for example, and this is resolved by
DejaGnu commit 504776814fa56295c4cff40d78a1be446f851a7c.


With that resolved, no problems found -- that is, confirming that 15
years old DejaGnu version 1.4.4 still works fine for GCC testing (in my
configuration).


I convinced myself that my proposed changes are the right thing to do,
and committed "[testsuite, Fortran] Apply DejaGnu 1.4.4 work-around also
to 'gfortran.dg/coarray/caf.exp:dg-compile-aux-modules'" to trunk in
r269848, to gcc-8-branch in r269849, and to gcc-7-branch in r269850, see
attached.


> The other copy, created later, in
> 'gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp' already
> does exactly match the master copy.
> 
> 
> And then, of course, we really should unify all 'dg-compile-aux-modules'
> copies into one shared file.  (But it's not completely straight-forward,
> because of the handling of 'gfortran_test_path', and
> 'gfortran_aux_module_flags'.)


Grüße
 Thomas


>From e78648a61e4e5fc60d400c2d2c89254aee4c0715 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 19:16:29 +
Subject: [PATCH] [testsuite, Fortran] Apply DejaGnu 1.4.4 work-around also to
 'gfortran.dg/coarray/caf.exp:dg-compile-aux-modules'

See trunk r215293.  This unifies all 'dg-compile-aux-modules' instances.

	gcc/testsuite/
	PR fortran/56408
	* gfortran.dg/coarray/caf.exp (dg-compile-aux-modules): Workaround
	missing nexted dg-test call support in dejaGNU 1.4.4.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269848 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog   |  4 
 gcc/testsuite/gfortran.dg/coarray/caf.exp | 12 +++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 914ba7237033..40446965212e 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2019-03-21  Thomas Schwinge  
 
+	PR fortran/56408
+	* gfortran.dg/coarray/caf.exp (dg-compile-aux-modules): Workaround
+	missing nexted dg-test call support in dejaGNU 1.4.4.
+
 	PR fortran/29383
 	* gfortran.dg/ieee/ieee.exp (DEFAULT_FFLAGS): Set the same as in
 	other '*.exp' files.
diff --git a/gcc/testsuite/gfortran.dg/coarray/caf.exp b/gcc/testsuite/gfortran.dg/coarray/caf.exp
index 4c6dba1dc22b..e3204c34779b 

Re: [PATCH] Avoid inter-test dependencies in gfortran.dg (PR fortran/56408)

2019-03-21 Thread Thomas Schwinge
Hi!

On Mon, 15 Sep 2014 18:13:47 +0200, Jakub Jelinek  wrote:
> --- gcc/testsuite/gfortran.dg/dg.exp.jj   2014-07-04 10:20:35.0 
> +0200
> +++ gcc/testsuite/gfortran.dg/dg.exp  2014-09-15 17:05:04.038126245 +0200

> +proc dg-compile-aux-modules { args } {
> +global gfortran_test_path
> +global gfortran_aux_module_flags
> +if { [llength $args] != 2 } {
> + error "dg-set-target-env-var: needs one argument"

As obvious committed "[testsuite] Fix 'dg-compile-aux-modules'
diagnostic" to trunk in r269851, to gcc-8-branch in r269852, and to
gcc-7-branch in r269853, see attached.


Grüße
 Thomas


>From f2137b85e1a6b044a2398de20ef8d458734dc6e6 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 19:29:57 +
Subject: [PATCH] [testsuite] Fix 'dg-compile-aux-modules' diagnostic

	gcc/testsuite/
	PR fortran/56408
	* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
	(dg-compile-aux-modules): Fix diagnostic.
	* gfortran.dg/coarray/caf.exp (dg-compile-aux-modules): Likewise.
	* gfortran.dg/dg.exp (dg-compile-aux-modules): Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269851 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog | 6 ++
 .../gcc.target/powerpc/ppc-fortran/ppc-fortran.exp  | 2 +-
 gcc/testsuite/gfortran.dg/coarray/caf.exp   | 2 +-
 gcc/testsuite/gfortran.dg/dg.exp| 2 +-
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 40446965212e..a5211cca6c6a 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,11 @@
 2019-03-21  Thomas Schwinge  
 
+	PR fortran/56408
+	* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
+	(dg-compile-aux-modules): Fix diagnostic.
+	* gfortran.dg/coarray/caf.exp (dg-compile-aux-modules): Likewise.
+	* gfortran.dg/dg.exp (dg-compile-aux-modules): Likewise.
+
 	PR fortran/56408
 	* gfortran.dg/coarray/caf.exp (dg-compile-aux-modules): Workaround
 	missing nexted dg-test call support in dejaGNU 1.4.4.
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
index 53c3c7464423..586d7dd3019e 100644
--- a/gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
@@ -36,7 +36,7 @@ proc dg-compile-aux-modules { args } {
 global gfortran_test_path
 global gfortran_aux_module_flags
 if { [llength $args] != 2 } {
-	error "dg-set-target-env-var: needs one argument"
+	error "dg-compile-aux-modules: needs one argument"
 	return
 }
 
diff --git a/gcc/testsuite/gfortran.dg/coarray/caf.exp b/gcc/testsuite/gfortran.dg/coarray/caf.exp
index e3204c34779b..68b60b86a949 100644
--- a/gcc/testsuite/gfortran.dg/coarray/caf.exp
+++ b/gcc/testsuite/gfortran.dg/coarray/caf.exp
@@ -51,7 +51,7 @@ proc dg-compile-aux-modules { args } {
 global gfortran_test_path
 global gfortran_aux_module_flags
 if { [llength $args] != 2 } {
-	error "dg-set-target-env-var: needs one argument"
+	error "dg-compile-aux-modules: needs one argument"
 	return
 }
 
diff --git a/gcc/testsuite/gfortran.dg/dg.exp b/gcc/testsuite/gfortran.dg/dg.exp
index 53c3c7464423..586d7dd3019e 100644
--- a/gcc/testsuite/gfortran.dg/dg.exp
+++ b/gcc/testsuite/gfortran.dg/dg.exp
@@ -36,7 +36,7 @@ proc dg-compile-aux-modules { args } {
 global gfortran_test_path
 global gfortran_aux_module_flags
 if { [llength $args] != 2 } {
-	error "dg-set-target-env-var: needs one argument"
+	error "dg-compile-aux-modules: needs one argument"
 	return
 }
 
-- 
2.17.1

>From 800851c2ac0eac988a26e83fa879d5b6bad9f10b Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 19:31:09 +
Subject: [PATCH] [testsuite] Fix 'dg-compile-aux-modules' diagnostic

	gcc/testsuite/
	PR fortran/56408
	* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
	(dg-compile-aux-modules): Fix diagnostic.
	* gfortran.dg/coarray/caf.exp (dg-compile-aux-modules): Likewise.
	* gfortran.dg/dg.exp (dg-compile-aux-modules): Likewise.

trunk r269851

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@269852 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog | 6 ++
 .../gcc.target/powerpc/ppc-fortran/ppc-fortran.exp  | 2 +-
 gcc/testsuite/gfortran.dg/coarray/caf.exp   | 2 +-
 gcc/testsuite/gfortran.dg/dg.exp| 2 +-
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 38982cf82f4b..fb11208d2c8e 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,11 @@
 2019-03-21  Thomas Schwinge  
 
+	PR fortran/56408
+	* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
+	(dg-compile-aux-modules): Fix diagnostic.
+	* gfortran.dg/coarray/caf.exp (dg-compile-aux-modules): Li

Re: [PR72741] Encode OpenACC 'routine' directive's level of parallelism inside Fortran module files

2019-03-21 Thread Thomas Schwinge
Hi!

On Wed, 13 Mar 2019 18:50:38 +0100, I wrote:
> On Thu, 28 Feb 2019 22:17:01 +0100, Jakub Jelinek  wrote:
> > On Thu, Feb 28, 2019 at 10:12:00PM +0100, Thomas Schwinge wrote:
> > > On Wed, 15 Jun 2016 20:12:15 -0700, Cesar Philippidis 
> > >  wrote:
> > > The code changes now are actually very simple.  The "problem" is that
> > > we're incrementing the Fortran module version, 'MOD_VERSION', which
> > > breaks binary compatibility with Fortran module files created with
> > > earlier versions of GCC, which is something that is to be avoided, as
> > > I've heard.  Or, is it not that bad actually?
> > 
> > It is bad and we certainly shouldn't change it on release branches.
> 
> ACK.
> 
> > There are many ways to deal with it without bumping MOD_VERSION in a
> > backwards but not forwards compatible way, so that a newer compiler will be
> > able to parse old *.mod files, and newer compiler new ones as long as this
> > problematic stuff doesn't appear in.
> 
> Like the attached, actually pretty simple now?
> 
> It may seem wasteful to use individual bits for 'gang', 'worker',
> 'vector', 'seq', but that makes it easy to implement the behavior
> described above by Jakub, and I've heard rumors that OpenACC might at
> some point allow several of these level of parallelism clauses to be
> specified (plus a new 'auto' clause?), and then this will be necessary
> anyway (several of these bits can then in fact appear).

Committed to trunk in r269854 "[testsuite, Fortran] Provide
'dg-compile-aux-modules' in 'gfortran.dg/goacc/goacc.exp'", r269855
"[PR72741] Encode OpenACC 'routine' directive's level of parallelism
inside Fortran module files", see attached.


Grüße
 Thomas


From 44ff9fb6a5f3db5c2db521bdd2aed0d2c24e1f83 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 19:44:34 +
Subject: [PATCH 1/2] [testsuite, Fortran] Provide 'dg-compile-aux-modules' in
 'gfortran.dg/goacc/goacc.exp'

..., as yet another copy from 'gfortran.dg/dg.exp', which there are a few
already.

	gcc/testsuite/
	* gfortran.dg/goacc/goacc.exp (dg-compile-aux-modules): New proc.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269854 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog   |  2 ++
 gcc/testsuite/gfortran.dg/goacc/goacc.exp | 25 +++
 2 files changed, 27 insertions(+)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index a5211cca6c6a..1f2f3eba6f40 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,7 @@
 2019-03-21  Thomas Schwinge  
 
+	* gfortran.dg/goacc/goacc.exp (dg-compile-aux-modules): New proc.
+
 	PR fortran/56408
 	* gcc.target/powerpc/ppc-fortran/ppc-fortran.exp
 	(dg-compile-aux-modules): Fix diagnostic.
diff --git a/gcc/testsuite/gfortran.dg/goacc/goacc.exp b/gcc/testsuite/gfortran.dg/goacc/goacc.exp
index f1adb186a1e4..1b093f6d41f5 100644
--- a/gcc/testsuite/gfortran.dg/goacc/goacc.exp
+++ b/gcc/testsuite/gfortran.dg/goacc/goacc.exp
@@ -28,6 +28,31 @@ if ![check_effective_target_fopenacc] {
 # Initialize `dg'.
 dg-init
 
+global gfortran_test_path
+global gfortran_aux_module_flags
+set gfortran_test_path $srcdir/$subdir
+set gfortran_aux_module_flags "-fopenacc"
+proc dg-compile-aux-modules { args } {
+global gfortran_test_path
+global gfortran_aux_module_flags
+if { [llength $args] != 2 } {
+	error "dg-compile-aux-modules: needs one argument"
+	return
+}
+
+set level [info level]
+if { [info procs dg-save-unknown] != [list] } {
+	rename dg-save-unknown dg-save-unknown-level-$level
+}
+
+dg-test $gfortran_test_path/[lindex $args 1] "" $gfortran_aux_module_flags
+# cleanup-modules is intentionally not invoked here.
+
+if { [info procs dg-save-unknown-level-$level] != [list] } {
+	rename dg-save-unknown-level-$level dg-save-unknown
+}
+}
+
 # Main loop.
 gfortran-dg-runtest [lsort \
[find $srcdir/$subdir *.\[fF\]{,90,95,03,08} ] ] "" "-fopenacc"
-- 
2.17.1

From 44ff4c8d4b40c8d9969066b8c38b3df6b76acf17 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 19:44:45 +
Subject: [PATCH 2/2] [PR72741] Encode OpenACC 'routine' directive's level of
 parallelism inside Fortran module files

If 'use'ing with an old GCC a new module file (with OpenACC 'routine'
directive's level of parallelism encoded), then that expectedly fails as
follows:

f951: Fatal Error: Reading module 'routine_module_mod_1' at line 27 column 21: find_enum(): Enum not found

If 'use'ing with a new GCC an old module file (without OpenACC 'routine'
directive's level of parallelism encoded), then that (silently) continues to
accept the module file, and will proceed with the previous, erroneous behavior.

These seem to be acceptable compromises, instead of incrementing 'MOD_VERSION'.

	gcc/fortran/
	PR fortran/72741
	* module.c (verify_OACC_ROUTINE_LOP_NONE): New function.
	(enum ab_attribute): Add AB_OACC_ROUTINE_LOP_GANG,
	AB_OACC_ROUTINE_LOP_WORKER, AB_OACC_ROUTINE_LOP_VECTOR,
	A

Re: [PR72741, PR89433] Repeated use of the Fortran OpenACC 'routine' directive

2019-03-21 Thread Thomas Schwinge
Hi!

On Thu, 28 Feb 2019 21:37:21 +0100, I wrote:
> On Mon, 15 Aug 2016 18:54:49 -0700, Cesar Philippidis 
>  wrote:
> > [...]
> > 
> > Note that besides for checking for multiple acc routine directives, this
> > patch also handles the case where the optional name argument in 'acc
> > routine (NAME)' is the name of the current procedure. This was a TODO
> > item in gomp4.
> 
> > --- a/gcc/fortran/openmp.c
> > +++ b/gcc/fortran/openmp.c
> 
> > @@ -1969,6 +1971,13 @@ gfc_match_oacc_routine (void)
> >   gfc_current_locus = old_loc;
> >   return MATCH_ERROR;
> > }
> > +
> > + /* Set sym to NULL if it matches the current procedure's
> > +name.  This will simplify the check for duplicate ACC
> > +ROUTINE attributes.  */
> > + if (gfc_current_ns->proc_name
> > + && !strcmp (buffer, gfc_current_ns->proc_name->name))
> > +   sym = NULL;
> > }
> >else
> >  {
> 
> I re-worked the code a bit, didn't find this necessary.

Specifically, a very similar check has already been present, comparing to
'sym->name' instead of 'buffer'.  (Not sure, if one is to be preferred
over the other, when/if they would ever be different.  It feels like
instead of these strings, we should be comparing some kind of symbolic
"resolved" handle, "sym".  And, as it should turn out, I have a cleanup
patch for next GCC development stage 1 to clean up that and other stuff
in 'gfc_match_oacc_routine'.)

Anyway, to clarify, I committed to trunk r269856 "[PR72741] The name in a
Fortran OpenACC 'routine' directive refers to the containing subroutine
or function", see attached.


Grüße
 Thomas


From 467b1bdb6c33711416a3ca270ac51b2b99f2f85b Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 19:54:51 +
Subject: [PATCH] [PR72741] The name in a Fortran OpenACC 'routine' directive
 refers to the containing subroutine or function

	gcc/fortran/
	PR fortran/72741
	* openmp.c (gfc_match_oacc_routine): Clarify.
	gcc/testsuite/
	PR fortran/72741
	* gfortran.dg/goacc/routine-module-mod-1.f90: Update.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269856 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog| 3 +++
 gcc/fortran/openmp.c | 3 +++
 gcc/testsuite/ChangeLog  | 3 +++
 gcc/testsuite/gfortran.dg/goacc/routine-module-mod-1.f90 | 4 ++--
 4 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index 2afab3920bda..111e3a266e9b 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,5 +1,8 @@
 2019-03-21  Thomas Schwinge  
 
+	PR fortran/72741
+	* openmp.c (gfc_match_oacc_routine): Clarify.
+
 	PR fortran/72741
 	* module.c (verify_OACC_ROUTINE_LOP_NONE): New function.
 	(enum ab_attribute): Add AB_OACC_ROUTINE_LOP_GANG,
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 7a06eb58f5cf..1b1a0b4108fd 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -2314,6 +2314,9 @@ gfc_match_oacc_routine (void)
 	  if (st)
 	{
 	  sym = st->n.sym;
+	  /* If the name in a 'routine' directive refers to the containing
+		 subroutine or function, then make sure that we'll later handle
+		 this accordingly.  */
 	  if (gfc_current_ns->proc_name != NULL
 		  && strcmp (sym->name, gfc_current_ns->proc_name->name) == 0)
 	sym = NULL;
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 8afdf3e980e9..0c94f6bcacf8 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,8 @@
 2019-03-21  Thomas Schwinge  
 
+	PR fortran/72741
+	* gfortran.dg/goacc/routine-module-mod-1.f90: Update.
+
 	PR fortran/72741
 	* gfortran.dg/goacc/routine-module-1.f90: New file.
 	* gfortran.dg/goacc/routine-module-2.f90: Likewise.
diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-module-mod-1.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-module-mod-1.f90
index 3855b8c88596..23c673fe3bd1 100644
--- a/gcc/testsuite/gfortran.dg/goacc/routine-module-mod-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/routine-module-mod-1.f90
@@ -18,7 +18,7 @@ contains
 
   subroutine s_2
 implicit none
-!$acc routine seq
+!$acc routine (s_2) seq
 
 integer :: i
 
@@ -41,7 +41,7 @@ contains
 
   subroutine w_1
 implicit none
-!$acc routine worker
+!$acc routine (w_1) worker
 
 integer :: i
 
-- 
2.17.1



signature.asc
Description: PGP signature


Re: C++ PATCH for c++/89612 - ICE with member friend template with noexcept

2019-03-21 Thread Jason Merrill

On 3/19/19 11:45 AM, Marek Polacek wrote:

On Thu, Mar 14, 2019 at 04:22:41PM -0400, Jason Merrill wrote:

On 3/7/19 4:52 PM, Marek Polacek wrote:

This was one of those PRs where the more you poke, the more ICEs turn up.
This patch fixes the ones I could find.  The original problem was that
maybe_instantiate_noexcept got a TEMPLATE_DECL created for the member
friend template in do_friend.  Its noexcept-specification was deferred,
so we went to the block with push_access_scope, but that crashes on a
TEMPLATE_DECL.  One approach could be to somehow not defer noexcept-specs
for friend templates, I guess, but I didn't want to do that.



How does it make sense to instantiate the noexcept-specifier of a template?
We should only get there for fully-instantiated function decls.


Hmm, but duplicate_decls calls check_redeclaration_exception_specification even
for DECL_FUNCTION_TEMPLATE_Ps.
...
Note the crash happens in tsubst_friend_function.  I wouldn't know when to
check the noexcept-specifier of such a TEMPLATE_DECL for a template friend
if not there.


Hmm, true, I guess we do need to do a partial instantiation of the 
noexcept-specifier in order to compare it.



That broke in register_parameter_specializations but we don't need this
code anyway, so let's do away with it -- the current_class_{ref,ptr}
code is enough to fix the PR that register_parameter_specializations was
introduced for.


What about uses of non-'this' parameters in the noexcept-specification?

template 
struct C {
  template 
  friend void foo(T t) noexcept(sizeof(decltype(t)) > 1);
};

template 
void foo(int i) noexcept { }

C c;


Lastly, I found an invalid testcase that was breaking because a template code
leaked to constexpr functions.  This I fixed similarly to the recent explicit
PR fix (r269131).


This spot should probably also use build_converted_constant_expr.


Ok, I'll address this.


I'm finding this repeated pattern awkward.  Earlier you changed 
check_narrowing to use maybe_constant_value instead of 
fold_non_dependent_expr, but perhaps whatever that fixed should have 
been fixed instead with a processing_template_decl_sentinel in the 
enclosing code that already instantiated the expression.  That ought to 
avoid any need to change this spot or r269131.


Jason


Re: [PATCH] [PR89773] Fortran OpenACC 'routine' directive refuses procedures with implicit EXTERNAL attribute

2019-03-21 Thread Thomas Schwinge
Hi!

On Wed, 20 Mar 2019 11:07:31 +0100, I wrote:
> On Fri, 26 Aug 2016 08:16:43 -0700, Cesar Philippidis 
>  wrote:
> > While working on [...], I noticed
> 
> If only all such issues would end up in their own PRs, instead of mixing
> them with other changes...
> 
> > that the fortran FE wasn't permitting
> > named functions inside acc routine directives. E.g.
> > 
> >   integer :: foo
> >   !$acc routine(foo) gang
> > 
> >   ... = foo ()
> 
> ACK.  Perhaps not the most pretty style, but gfortran does accept this.
> 
> Do I understand right that there exists no equivalent syntax in Fortran
> to declare a subroutine (instead of a function) with implicit EXTERNAL
> attribute?  (See also the new 'gfortran.dg/goacc/pr89773.f90' test case
> I'm adding.)

(Still interested if there's a way to do that.)

> > This patch also fixes this issue. But to do that, I had to add a
> > gfc_resolve_oacc_routines pass in order to identify if a variable is a
> > function or variable because that information isn't available during
> > matching.
> 
> OK to fix this as in the attached patch?

I convinced myself that my proposed changes are the right thing to do,
and committed to trunk r269857 "[PR89773] Fortran OpenACC 'routine'
directive refuses procedures with implicit EXTERNAL attribute", see
attached.


Grüße
 Thomas


From cbfb10ec630fc5829bdfebbf59ba40d22874c9e2 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 20:02:42 +
Subject: [PATCH] [PR89773] Fortran OpenACC 'routine' directive refuses
 procedures with implicit EXTERNAL attribute

	gcc/fortran/
	PR fortran/89773
	* gfortran.h (gfc_oacc_routine_name): Add loc member.
	(gfc_resolve_oacc_routines): Declare.
	* openmp.c (gfc_match_oacc_routine): Move some error checking
	into...
	(gfc_resolve_oacc_routines): ... this new function.
	* resolve.c (resolve_codes): Call it.
	gcc/testsuite/
	PR fortran/89773
	* gfortran.dg/goacc/pr89773.f90: New file.
	* gfortran.dg/goacc/pr77765.f90: Adjust.
	* gfortran.dg/goacc/routine-6.f90: Adjust, and extend.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269857 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog |  8 +
 gcc/fortran/gfortran.h|  2 ++
 gcc/fortran/openmp.c  | 33 -
 gcc/fortran/resolve.c |  1 +
 gcc/testsuite/ChangeLog   |  5 +++
 gcc/testsuite/gfortran.dg/goacc/pr77765.f90   |  2 +-
 gcc/testsuite/gfortran.dg/goacc/pr89773.f90   | 36 +++
 gcc/testsuite/gfortran.dg/goacc/routine-6.f90 | 21 +--
 8 files changed, 96 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/pr89773.f90

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index 111e3a266e9b..7ce67eb46fe7 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,5 +1,13 @@
 2019-03-21  Thomas Schwinge  
 
+	PR fortran/89773
+	* gfortran.h (gfc_oacc_routine_name): Add loc member.
+	(gfc_resolve_oacc_routines): Declare.
+	* openmp.c (gfc_match_oacc_routine): Move some error checking
+	into...
+	(gfc_resolve_oacc_routines): ... this new function.
+	* resolve.c (resolve_codes): Call it.
+
 	PR fortran/72741
 	* openmp.c (gfc_match_oacc_routine): Clarify.
 
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 2f55b9c387a6..caf5e528c7e0 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1739,6 +1739,7 @@ typedef struct gfc_oacc_routine_name
   struct gfc_symbol *sym;
   struct gfc_omp_clauses *clauses;
   struct gfc_oacc_routine_name *next;
+  locus loc;
 }
 gfc_oacc_routine_name;
 
@@ -3210,6 +3211,7 @@ void gfc_resolve_oacc_directive (gfc_code *, gfc_namespace *);
 void gfc_resolve_oacc_declare (gfc_namespace *);
 void gfc_resolve_oacc_parallel_loop_blocks (gfc_code *, gfc_namespace *);
 void gfc_resolve_oacc_blocks (gfc_code *, gfc_namespace *);
+void gfc_resolve_oacc_routines (gfc_namespace *);
 
 /* expr.c */
 void gfc_free_actual_arglist (gfc_actual_arglist *);
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 1b1a0b4108fd..983b83db4a7b 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -2322,15 +2322,10 @@ gfc_match_oacc_routine (void)
 	sym = NULL;
 	}
 
-	  if ((isym == NULL && st == NULL)
-	  || (sym
-		  && !sym->attr.external
-		  && !sym->attr.function
-		  && !sym->attr.subroutine))
+	  if (isym == NULL && st == NULL)
 	{
-	  gfc_error ("Syntax error in !$ACC ROUTINE ( NAME ) at %C, "
-			 "invalid function name %s",
-			 (sym) ? sym->name : buffer);
+	  gfc_error ("Invalid NAME %qs in !$ACC ROUTINE ( NAME ) at %C",
+			 buffer);
 	  gfc_current_locus = old_loc;
 	  return MATCH_ERROR;
 	}
@@ -2400,6 +2395,7 @@ gfc_match_oacc_routine (void)
 	  n->sym = sym;
 	  n->clauses = c;
 	  n->next = gfc_current_ns->oacc_routine_names;
+	  n->loc = old_loc;
 	  gfc_current_ns->oacc_routine_names = n;
 	}
 }
@@ -6072,6 +6068,27 @@ g

Re: C++ PATCH for c++/89705 - ICE with reference binding with conversion function

2019-03-21 Thread Jason Merrill

On 3/16/19 4:53 PM, Marek Polacek wrote:

Here we have code like

   struct X { operator const int(); };
   int&& rri = X();

which I think is invalid, because [dcl.init.ref] says that if types T1 and T2
are reference-related, no qualifiers can be dropped, and if the reference is an
rvalue reference, the initializer expression can't be an lvalue.  And here the
result of the conversion is "const int", so the "const" would be dropped.  A
similar ill-formed test from the standard is

   struct X { operator int&(); };
   int&& rri = X();

where the result of the conversion is an lvalue of related type.  All the
compilers I've tried actually accept the first test, but I think that's wrong.


I don't think it is.  g++ and clang++ reject the first test if you 
change int to a class type, but prvalues of scalar type have no 
cv-qualifiers, so the result of the conversion is a prvalue of type int, 
which is a perfectly good initializer for int&&.


This is OK for the same reason:

  int&& r = (const int)42;

Jason


[PR72741] Properly handle clauses specifying the level of parallelism for 'external' Fortran OpenACC routines (was: [gomp4] check for sufficient parallelism when calling acc routines in fortran)

2019-03-21 Thread Thomas Schwinge
Hi!

On Fri, 26 Aug 2016 08:16:43 -0700, Cesar Philippidis  
wrote:
> This patch

(..., variants of which got re-submitted a few times, later on...)

> teaches the fortran FE how to verify that there is sufficient
> parallelism when calling acc routines inside acc loop. E.g. the fortran
> FE will now error if you call a gang routine from a vector loop, because
> there's no way for vector partitioned code to spawn new gangs with the
> OpenACC current execution model.

These proposed Fortran front end changes seemed strange to me: the
generic middle end OMP code is already doing such checking, and works for
other Fortran test cases.  This should also work for the 'external' case
discussed here; see also the 'c-c++-common/goacc/routine-3-extern.c',
'c-c++-common/goacc/routine-4-extern.c' test cases I'm adding.  Now that
I looked into these proposed changes in more detail, I found that indeed
the error is a different one than what/how it's getting addressed there,
and the solution is much simpler one.

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/routine-nested-parallelism.f
> @@ -0,0 +1,340 @@
> +! Validate calls to ACC ROUTINES.  Ensure that the loop containing the
> +! call has sufficient parallelism to for the routine.
> +
> +  subroutine sub
> +  implicit none
> +  integer, parameter :: n = 100
> +  integer :: a(n), i, j
> +  external gangr, workerr, vectorr, seqr
> +!$acc routine (gangr) gang
> +!$acc routine (workerr) worker
> +!$acc routine (vectorr) vector
> +!$acc routine (seqr) seq
> +
> +!
> +! Test subroutine calls inside nested loops.
> +!
> +
> +!$acc parallel loop
> +  do i = 1, n
> + !$acc loop
> + do j = 1, n

That "!$acc loop" directive is not considered; needs to be in the first
column.

> +call workerr (a, n)
> + end do
> +  end do
> +!$acc end parallel loop
> +
> +!$acc parallel loop
> +  do i = 1, n
> +!$acc loop gang
> + do j = 1, n

If this loop is using OpenACC 'gang' parallelism, then no parallelism is
available for the outer loop; the generic middle end OMP code diagnoses:
"warning: insufficient partitioning available to parallelize loop".
Similar in other such places.

> +call gangr (a, n) ! { dg-error "Insufficient ..ACC LOOP 
> parallelism" }

The generic middle end OMP code diagnoses: "error: routine call uses same
OpenACC parallelism as containing loop".  Similar in other such places.

> +[...]
> +!$acc parallel loop seq
> +  do i = 1, n
> + call gangr (a, n) ! { dg-error "Insufficient ..ACC LOOP 
> parallelism" }
> +  end do
> +!$acc end parallel loop

That's not correct.  The outer loop is tagged 'seq', so not parallelized,
and thus 'gang' parallelism is still available for the 'gangr' routine
call.  Similar in other such places.

> +[...]

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/routine-nested-parallelism.f90

That's the same as 'gfortran.dg/goacc/routine-nested-parallelism.f', just
differing in whitespace; file removed.

For good measure I also created a corresponding test case to "Check valid
calls to 'external' OpenACC routines", and also added
'-fopt-info-optimized-omp' scanning to both these new Fortran test cases.

Committed to trunk r269858 "[PR72741] Properly handle clauses specifying
the level of parallelism for 'external' Fortran OpenACC routines", see
attached.


Grüße
 Thomas


>From 33718c02f44cf560c5725ed1681ae5981acbfc69 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Thu, 21 Mar 2019 20:13:44 +
Subject: [PATCH] [PR72741] Properly handle clauses specifying the level of
 parallelism for 'external' Fortran OpenACC routines

..., so as to also for these enable the generic middle end OMP code to verify
proper nesting of loops/routines regarding their levels of parallelism.

	gcc/fortran/
	PR fortran/72741
	* openmp.c (gfc_match_oacc_routine): Set the level of parallelism
	for all variants.
	(gfc_resolve_oacc_routines): Call gfc_add_omp_declare_target.
	gcc/testsuite/
	PR fortran/72741
	* c-c++-common/goacc/routine-3-extern.c: New file.
	* c-c++-common/goacc/routine-3.c: Adjust.
	* c-c++-common/goacc/routine-4-extern.c: New file.
	* c-c++-common/goacc/routine-4.c: Adjust.
	* gfortran.dg/goacc/routine-module-3.f90: New file.
	* gfortran.dg/goacc/routine-external-level-of-parallelism-1.f: New
	file.
	* gfortran.dg/goacc/routine-external-level-of-parallelism-2.f:
	Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269858 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog |   5 +
 gcc/fortran/openmp.c  |   8 +
 gcc/testsuite/ChangeLog   |  16 +
 .../c-c++-common/goacc/routine-3-extern.c |  89 +
 gcc/testsuite/c-c++-common/goacc/routine-3.c  |   1 +
 .../c-c++-common/goacc/routine-4-extern.c | 124 ++
 gcc/testsuite/c-c++-common/goacc/routine-4.c  |   1 +
 .../routine-external-level-of-parallelism-1.f | 347 +
 .../routi

Re: [PR72741] Encode OpenACC 'routine' directive's level of parallelism inside Fortran module files

2019-03-21 Thread Thomas Koenig

Hi Thomas,


Are there any further questions, or am I good to commit my patch as
posted?


Problem is, I have never looked into module writing / reading in any
detail, so it will take a few days for me to get up to speed so I can
really review your patch.

If, in the meantime, maybe somebody else with more knowledge in that
area (Janne?) could step in.

Regards

Thomas


C++ PATCH for c++/89214 - ICE when initializing aggregates with bases

2019-03-21 Thread Marek Polacek
This is a crash in digest_init_r -- we encounter

  /* "If T is a class type and the initializer list has a single
 element of type cv U, where U is T or a class derived from T,
 the object is initialized from that element."  */
  if (flag_checking
  && cxx_dialect >= cxx11
  && BRACE_ENCLOSED_INITIALIZER_P (stripped_init)
  && CONSTRUCTOR_NELTS (stripped_init) == 1
  && ((CLASS_TYPE_P (type) && !CLASSTYPE_NON_AGGREGATE (type))
  || VECTOR_TYPE_P (type)))
{
  tree elt = CONSTRUCTOR_ELT (stripped_init, 0)->value;
  if (reference_related_p (type, TREE_TYPE (elt)))
/* We should have fixed this in reshape_init.  */
gcc_unreachable ();
}

As the comment suggests, we have code to fix this up in reshape_init_r:

  /* "If T is a class type and the initializer list has a single element of
 type cv U, where U is T or a class derived from T, the object is
 initialized from that element."  Even if T is an aggregate.  */
  if (cxx_dialect >= cxx11 && (CLASS_TYPE_P (type) || VECTOR_TYPE_P (type))
  && first_initializer_p
  && d->end - d->cur == 1
  && reference_related_p (type, TREE_TYPE (init)))
{
  d->cur++;
  return init; 
}

but in this case this didn't work, because reshape_init_class always creates
a fresh CONSTRUCTOR.  As of C++17, aggregates can have bases, so we called
reshape_init_class on the initializer for the public base, and returned a
constructor whose single element was of a reference-related type.  We can
check for this case and fix it up, as in the below.

I came up with testcases that test both cases of "where U is T *or* a class
derived from T".

Bootstrapped/regtested on x86_64-linux, ok for trunk/8?

2019-03-21  Marek Polacek  

PR c++/89214 - ICE when initializing aggregates with bases.
* decl.c (reshape_init_class): Adjust commentary.  If the initializer
list has a single element of the same or derived type, return the
element itself, not wrapped in a constructor.

* g++.dg/cpp1z/aggr-base8.C: New test.
* g++.dg/cpp1z/aggr-base9.C: New test.

diff --git gcc/cp/decl.c gcc/cp/decl.c
index c8435e29491..817e33e40d2 100644
--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -5873,7 +5873,7 @@ reshape_init_vector (tree type, reshape_iter *d, 
tsubst_flags_t complain)
 }
 
 /* Subroutine of reshape_init_r, processes the initializers for classes
-   or union. Parameters are the same of reshape_init_r.  */
+   or union.  Parameters are the same of reshape_init_r.  */
 
 static tree
 reshape_init_class (tree type, reshape_iter *d, bool first_initializer_p,
@@ -5884,7 +5884,8 @@ reshape_init_class (tree type, reshape_iter *d, bool 
first_initializer_p,
 
   gcc_assert (CLASS_TYPE_P (type));
 
-  /* The initializer for a class is always a CONSTRUCTOR.  */
+  /* The initializer for a class is always a CONSTRUCTOR.  Unless it
+ has a single element of the same or derived type, see below.  */
   new_init = build_constructor (init_list_type_node, NULL);
   field = next_initializable_field (TYPE_FIELDS (type));
 
@@ -5978,6 +5979,19 @@ reshape_init_class (tree type, reshape_iter *d, bool 
first_initializer_p,
   field = next_initializable_field (DECL_CHAIN (field));
 }
 
+  /* "If T is a class type and the initializer list has a single
+ element of type cv U, where U is T or a class derived from T,
+ the object is initialized from that element."  But this function
+ always builds up a new CONSTRUCTOR, so undo that here.  */
+  if (cxx_dialect >= cxx11
+  && BRACE_ENCLOSED_INITIALIZER_P (new_init)
+  && CONSTRUCTOR_NELTS (new_init) == 1)
+{
+  tree elt = CONSTRUCTOR_ELT (new_init, 0)->value;
+  if (reference_related_p (type, TREE_TYPE (elt)))
+   return elt;
+}
+
   return new_init;
 }
 
diff --git gcc/testsuite/g++.dg/cpp1z/aggr-base8.C 
gcc/testsuite/g++.dg/cpp1z/aggr-base8.C
new file mode 100644
index 000..10cff027c39
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp1z/aggr-base8.C
@@ -0,0 +1,48 @@
+// PR c++/89214
+// { dg-do compile { target c++17 } }
+
+struct A
+{
+  A (int);
+};
+
+struct BB
+{
+  A a;
+};
+
+struct B : BB
+{
+};
+
+void
+foo ()
+{
+  B b1 = {42};
+  B b2 = {{42}};
+  B b3 = {{{42}}};
+
+  B b4 = B{42};
+  B b5 = B{{42}};
+  B b6 = B{{{42}}};
+
+  B b7 = {B{42}};
+  B b8 = {B{{42}}};
+  B b9 = {B{{{42;
+
+  B b10 = {{B{42}}};
+  B b11 = {{B{{42;
+  B b12 = {{B{{{42};
+
+  B bb1{42};
+  B bb2{{42}};
+  B bb3{{{42}}};
+
+  B bb7{B{42}};
+  B bb8{B{{42}}};
+  B bb9{B{{{42;
+
+  B bb10{{B{42}}};
+  B bb11{{B{{42;
+  B bb12{{B{{{42};
+}
diff --git gcc/testsuite/g++.dg/cpp1z/aggr-base9.C 
gcc/testsuite/g++.dg/cpp1z/aggr-base9.C
new file mode 100644
index 000..a3fa747efa4
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp1z/aggr-base9.C
@@ -0,0 +1,40 @@
+// PR c++/89214
+// { dg-do compile { target c++17 } }
+
+struct B {
+  int c;
+};
+
+struct D : B { };
+
+void
+foo ()
+{
+  D d1 = {4

Re: C++ PATCH for c++/89705 - ICE with reference binding with conversion function

2019-03-21 Thread Marek Polacek
On Thu, Mar 21, 2019 at 04:13:29PM -0400, Jason Merrill wrote:
> On 3/16/19 4:53 PM, Marek Polacek wrote:
> > Here we have code like
> > 
> >struct X { operator const int(); };
> >int&& rri = X();
> > 
> > which I think is invalid, because [dcl.init.ref] says that if types T1 and 
> > T2
> > are reference-related, no qualifiers can be dropped, and if the reference 
> > is an
> > rvalue reference, the initializer expression can't be an lvalue.  And here 
> > the
> > result of the conversion is "const int", so the "const" would be dropped.  A
> > similar ill-formed test from the standard is
> > 
> >struct X { operator int&(); };
> >int&& rri = X();
> > 
> > where the result of the conversion is an lvalue of related type.  All the
> > compilers I've tried actually accept the first test, but I think that's 
> > wrong.
> 
> I don't think it is.  g++ and clang++ reject the first test if you change
> int to a class type, but prvalues of scalar type have no cv-qualifiers, so
> the result of the conversion is a prvalue of type int, which is a perfectly
> good initializer for int&&.
> 
> This is OK for the same reason:
> 
>   int&& r = (const int)42;

Oop, this is embarassing, sorry.  So I guess we're not handling the (5.3.2)
case in [dcl.init.ref] properly: "If the converted initializer is a prvalue,
its type T4 is adjusted to type “cv1 T4” ([conv.qual]) and the temporary
materialization conversion ([conv.rval]) is applied.

Marek


Re: [PATCH] [PR89773] Fortran OpenACC 'routine' directive refuses procedures with implicit EXTERNAL attribute

2019-03-21 Thread Steve Kargl
On Thu, Mar 21, 2019 at 09:04:21PM +0100, Thomas Schwinge wrote:
> Hi!
> 
> On Wed, 20 Mar 2019 11:07:31 +0100, I wrote:
> > On Fri, 26 Aug 2016 08:16:43 -0700, Cesar Philippidis 
> >  wrote:
> > > While working on [...], I noticed
> > 
> > If only all such issues would end up in their own PRs, instead of mixing
> > them with other changes...
> > 
> > > that the fortran FE wasn't permitting
> > > named functions inside acc routine directives. E.g.
> > > 
> > >   integer :: foo
> > >   !$acc routine(foo) gang
> > > 
> > >   ... = foo ()
> > 
> > ACK.  Perhaps not the most pretty style, but gfortran does accept this.
> > 
> > Do I understand right that there exists no equivalent syntax in Fortran
> > to declare a subroutine (instead of a function) with implicit EXTERNAL
> > attribute?  (See also the new 'gfortran.dg/goacc/pr89773.f90' test case
> > I'm adding.)
> 
> (Still interested if there's a way to do that.)
> 

You're question make so sense to me as it seems you're 
conflating IMPLICIT type with implicit interface.  The
F2018 standard has

  15.4.2.1 Interfaces and scopes

  The interface of a procedure is either explicit or implicit.
  It is explicit if it is 
. an internal procedure, module procedure, or intrinsic procedure,
· a subroutine, or a function with a separate result name, within
  the scoping unit that defines it, or
· a procedure declared by a procedure declaration statement that
  specifies an explicit interface, or by an interface body.

  Otherwise, the interface of the identifier is implicit.  The interface
  of a statement function is always implicit.

These two program are equivalent

program foo
  external bar
  external bah
  call bar(a,i)
  x = bah()
end program

program foo
  call bar(a,i)
  x = bah()
end program

except the former explicitly tells the compiler that bar and bah
are external.  The "return type" with respect to C of bar is always
void.  All subroutines with either an explicit or implicit interface
have a void "return type".  The return type for bah() is *implicitly*
determined by the 'b' in the function function name.  In this case,
bar() has an implcit type of REAL.

-- 
Steve


Re: [C++ PATCH] Fix thread_local initialization (PR c++/60702)

2019-03-21 Thread Jason Merrill

On 3/15/19 4:53 PM, Jakub Jelinek wrote:

Hi!

As the testcase shows, we replace TLS vars that need initialization
in finish_id_expression_1 and in tsubst_copy_and_build of a VAR_DECL
with a _ZTW* call, but miss it in other cases where finish_id_expression
is not actually called.  In particular build_class_member_access_expr
when we do object.static_tls_data_member access doesn't wrap it, and
when using some_class::static_tls_data_member in a teplate, we go
through finish_qualified_id_expr and not finish_id_expression either
(and tsubst_copy in that case, so it doesn't go through
tsubst_copy_and_build).

The following patch fixes that.  While it is not a regression, it is a
serious wrong-code issue that just gained 6th reporter of the same issue.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-15  Jakub Jelinek  

PR c++/60702
* semantics.c (finish_qualified_id_expr): Handle accesses to TLS
variables.
* typeck.c (build_class_member_access_expr): Likewise.

* g++.dg/tls/thread_local11.C: New test.
* g++.dg/tls/thread_local11.h: New test.
* g++.dg/tls/thread_local12a.C: New test.
* g++.dg/tls/thread_local12b.C: New test.
* g++.dg/tls/thread_local12c.C: New test.
* g++.dg/tls/thread_local12d.C: New test.
* g++.dg/tls/thread_local12e.C: New test.
* g++.dg/tls/thread_local12f.C: New test.
* g++.dg/tls/thread_local12g.C: New test.
* g++.dg/tls/thread_local12h.C: New test.
* g++.dg/tls/thread_local12i.C: New test.
* g++.dg/tls/thread_local12j.C: New test.
* g++.dg/tls/thread_local12k.C: New test.
* g++.dg/tls/thread_local12l.C: New test.

--- gcc/cp/semantics.c.jj   2019-03-14 09:14:16.718012031 +0100
+++ gcc/cp/semantics.c  2019-03-15 16:53:14.270384477 +0100
@@ -2135,6 +2135,17 @@ finish_qualified_id_expr (tree qualifyin
expr = build_qualified_name (TREE_TYPE (expr),
 qualifying_class, expr,
 template_p);
+  else if (VAR_P (expr)
+  && !processing_template_decl
+  && !cp_unevaluated_operand
+  && (TREE_STATIC (expr) || DECL_EXTERNAL (expr))
+  && CP_DECL_THREAD_LOCAL_P (expr))
+   {
+ if (tree wrap = get_tls_wrapper_fn (expr))
+   /* Replace an evaluated use of the thread_local variable with
+  a call to its wrapper.  */
+   expr = build_cxx_call (wrap, 0, NULL, tf_warning_or_error);
+   }


Let's factor this pattern out into another function.  And we don't need 
to check TREE_STATIC || DECL_EXTERNAL anymore.


Jason


[PATCH] rs6000: Fix typo in mmintrin.h

2019-03-21 Thread Bill Schmidt
Hi,

It was recently pointed out that there's a pasto in mmintrin.h for the
_mm_sub_pi32 function, so that it performs an addition rather than a
subtraction.  This won't do.

This patch corrects the problem, and adds a test case to verify it.
Installed and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this ok for trunk, and backport to GCC 8?

Thanks!
Bill


[gcc]

2019-03-21  Bill Schmidt  

* config/rs6000/mmintrin.h (_mm_sub_pi32): Fix typo.

[gcc/testsuite]

2019-03-21  Bill Schmidt  

* gcc.target/powerpc/mmx-psubd-2.c: Test _m_psubd.


Index: gcc/config/rs6000/mmintrin.h
===
--- gcc/config/rs6000/mmintrin.h(revision 269843)
+++ gcc/config/rs6000/mmintrin.h(working copy)
@@ -597,7 +597,7 @@ _mm_sub_pi32 (__m64 __m1, __m64 __m2)
 extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _m_psubd (__m64 __m1, __m64 __m2)
 {
-  return _mm_add_pi32 (__m1, __m2);
+  return _mm_sub_pi32 (__m1, __m2);
 }
 
 extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
Index: gcc/testsuite/gcc.target/powerpc/mmx-psubd-2.c
===
--- gcc/testsuite/gcc.target/powerpc/mmx-psubd-2.c  (revision 269843)
+++ gcc/testsuite/gcc.target/powerpc/mmx-psubd-2.c  (working copy)
@@ -22,20 +22,28 @@ test (__m64 s1, __m64 s2)
   return _mm_sub_pi32 (s1, s2);
 }
 
+static __m64
+__attribute__((noinline, unused))
+test_alias (__m64 s1, __m64 s2)
+{
+  return _m_psubd (s1, s2);
+}
+
 static void
 TEST (void)
 {
   __m64_union u, s1, s2;
-  __m64_union e;
+  __m64_union e, v;
   int i;

   s1.as_m64 = _mm_setr_pi32 (30, 90);
   s2.as_m64 = _mm_setr_pi32 (76, -100);
   u.as_m64 = test (s1.as_m64, s2.as_m64);
-   
+  v.as_m64 = test_alias (s1.as_m64, s2.as_m64);
+
   for (i = 0; i < 2; i++)
  e.as_int[i] = s1.as_int[i] - s2.as_int[i];
 
-  if (u.as_m64 != e.as_m64)
+  if (u.as_m64 != e.as_m64 || u.as_m64 != v.as_m64)
 abort ();
 }



Re: PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)

2019-03-21 Thread Martin Sebor

On 3/19/19 9:33 PM, Jeff Law wrote:

On 3/19/19 8:22 PM, Joseph Myers wrote:

On Tue, 19 Mar 2019, Jeff Law wrote:


I'll note that our documentation clearly states that attributes can be
applied to functions, variables, labels, enums, statements and types.


A key thing here is that they can be applied to fields - that is, they can
be properties of a FIELD_DECL.  Referring to a field either requires an
expression referring to that field, or some other custom syntax that uses
both a type name and a field name (like in __builtin_offsetof).


Thanks for chiming in, your opinions on this kind of thing are greatly
appreciated.

Understood WRT applying to fields, and I think that's consistent with
some of what Jakub has expressed elsewhere -- specifically that we
should consider allowing COMPONENT_REFs as the exception, returning the
attributes of the associated FIELD_DECL in that case.

Is there a need to call out BIT_FIELD_REFs where we can't actually get
to the underlying DECL?  And if so, how do we do that in the end user
documentation that is clear and consistent?


Is it possible to see a BIT_FIELD_REF here?  I couldn't find a way.



One of the big worries I've got here is that documenting when an
expression as an argument to __builtin_has_attribute (or any attribute
query) can be expected to work.  It's always easier on our end users to
loosen semantics of extensions over time than it is to tighten them.


I wonder if a part of the issue isn't due to a mismatch between
the C terminology and what GCC uses internally.  Every argument
to the built-in that's not a type (and every argument to attribute
copy which doesn't accept types) must be a C expression:

1) either an identifier naming a function or variable, or
2) some other expression like a member reference via . or ->,
   an array subscript, or the indirection expression *.

But GCC distinguishes three kinds of arguments:

1) a DECL,
2) some sort of a reference like ARRAY_REF, COMPONENT_REF or
   INDIRECT_REF
3) an expression that satisfies the EXPR_P() predicate (e.g.,
   (struct S*)0, or (struct S){ 1 })

Jeff, you seem to want the built-in to accept just (1) on the GCC
list above and reject (3) (and seem to be waffling on (2)).

How would such an argument be described in a way that users
unfamiliar with GCC internals could easily understand?

All sorts of expressions can be used to refer to fields.  Given
the type definition 'struct A { int b[2]; };' besides
FUNCTION_DECL, PARM_DECL, and VAR_DECL we might expect to commonly
see arguments like:

COMPONENT_REF:
  ((struct A*)0)->b
  (*((struct A*)0)).b
  ((struct A*)0)[0].b

INDIRECT_REF:
  *((struct A*)0)[0].b

ARRAY_REF:
  ((struct A*)0)[0].b[0]

To users, they're all just expressions.

Fields aside, the expressions below seem quite plausible to me
too, especially when the built-in argument is the result of macro
expansion:

CALL_EXPR:
  foo ()

COMPOUND_LITERAL_EXPR:
  (struct A){ ... }

COND_EXPR:
  (0 ? a0 : a1)

COMPOUND_EXPR:
  ((void)0, a)

NON_LVALUE_EXPR:
  (struct A)a

etc.  In these cases the built-in could be passed the result of
__typeof__ instead, but if the built-in itself is part of a macro
that can be invoked either with one of the arguments on the _REF
list or with one of these expressions, it would only work as
expected for half of them.

Using __typeof__ doesn't work for the copy attribute because it
only takes expressions as arguments and not types.

Martin


Re: [PATCH] Integration of parallel standard algorithms for c++17

2019-03-21 Thread Thomas Rodgers


20190321-pstl-integration.patch.bz2
Description: pstl integration patch

Jonathan Wakely writes:

> On 20/03/19 14:05 -0700, Thomas Rodgers wrote:
>
>>
>>Fixed a failing test.
>
> Thanks. Apart from the changelog issue I mentioned on IRC, the only
> other required change I see is in include/Makefile.am:
>
> @@ -1480,7 +1512,9 @@ install-headers:
>   $(mkinstalldirs) $(DESTDIR)${host_installdir}/../ext
>   for file in ${ext_host_headers}; do \
> $(INSTALL_DATA) $${file} $(DESTDIR)${host_installdir}/../ext; done
> -
> + $(mkinstalldirs) ($DESTDIR)${gxx_include_dir}/${pstl_builddir}
>
> This needs to be $(DESTDIR) not ($DESTDIR).
>
> ***
>
> I also see a couple of things that we probably can't fix for now, but
> I'll mention them anyway for later consideration.
>
> In  there's a local include using ""
> instead of <> (see PR libstdc++/88066 for a similar case I changed
> recently):
>
> #include "execution_defs.h"
>
> We could fix that one, but then *all* the relative includes in
> include/pstl/*  use "" too. Do you think upstream would change to use
> #include  instead of #include "blah.h" ? It looks like
> upstream uses  instead so maybe that's something to
> change in your script to import the upstream code.
>
> Even if we fix it, TBB headers use "" instead of <>.  And it might not
> be a problem for any compiler except GCC, because only we support the
> silly -I- option that makes this matter.
>
>
> I also noticed that simply including  (even if you don't
> use anything from it) will cause a link-time dependency on TBB:
>
> /usr/bin/ld: /tmp/ccd7w5Oo.o: in function 
> `tbb::interface7::task_arena::current_thread_index()':
> /usr/include/tbb/task_arena.h:358: undefined reference to 
> `tbb::interface7::internal::task_arena_base::internal_current_slot()'
> collect2: error: ld returned 1 exit status
>
> This dependency comes from  which is included by
>  which is included by .
>
> I think that's OK for now, as no existing code using GCC includes the
>  header (because it doesn't exist yet). It would be good if
> we can find a way to solve that eventually though. Maybe the eventual
> solution will just be a non-TBB backend using OpenMP.
>
> Exporting something like this from libstdc++.so would work, but is
> disgusting (and presumably "tbb::interface7" changes from release to
> release?"):
>
> extern "C" [[__gnu__::__weak__]]
> int
> _ZN3tbb10interface78internal15task_arena_base21internal_current_slotEv()
> { return -1; }



Re: PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 03:59:55PM -0600, Martin Sebor wrote:
> 1) either an identifier naming a function or variable, or
> 2) some other expression like a member reference via . or ->,
>an array subscript, or the indirection expression *.
> 
> But GCC distinguishes three kinds of arguments:
> 
> 1) a DECL,
> 2) some sort of a reference like ARRAY_REF, COMPONENT_REF or
>INDIRECT_REF
> 3) an expression that satisfies the EXPR_P() predicate (e.g.,
>(struct S*)0, or (struct S){ 1 })
> 
> Jeff, you seem to want the built-in to accept just (1) on the GCC
> list above and reject (3) (and seem to be waffling on (2)).
> 
> How would such an argument be described in a way that users
> unfamiliar with GCC internals could easily understand?

Say that the argument is either a type or an expression that is
either an identifier (for C++ id-expression) to cover 1) and
a postfix expression with . or -> operator (to cover COMPONENT_REF)?
We do not want to allow INDIRECT_REF, ARRAY_REF, etc.

Jakub


Re: [PATCH] Integration of parallel standard algorithms for c++17

2019-03-21 Thread Thomas Rodgers

Fixed up change log.


20190321-1-pstl-integration.patch.bz2
Description: pstl integration patch

Thomas Rodgers writes:

> Jonathan Wakely writes:
>
>> On 20/03/19 14:05 -0700, Thomas Rodgers wrote:
>>
>>>
>>>Fixed a failing test.
>>
>> Thanks. Apart from the changelog issue I mentioned on IRC, the only
>> other required change I see is in include/Makefile.am:
>>
>> @@ -1480,7 +1512,9 @@ install-headers:
>>  $(mkinstalldirs) $(DESTDIR)${host_installdir}/../ext
>>  for file in ${ext_host_headers}; do \
>>$(INSTALL_DATA) $${file} $(DESTDIR)${host_installdir}/../ext; done
>> -
>> +$(mkinstalldirs) ($DESTDIR)${gxx_include_dir}/${pstl_builddir}
>>
>> This needs to be $(DESTDIR) not ($DESTDIR).
>>
>> ***
>>
>> I also see a couple of things that we probably can't fix for now, but
>> I'll mention them anyway for later consideration.
>>
>> In  there's a local include using ""
>> instead of <> (see PR libstdc++/88066 for a similar case I changed
>> recently):
>>
>> #include "execution_defs.h"
>>
>> We could fix that one, but then *all* the relative includes in
>> include/pstl/*  use "" too. Do you think upstream would change to use
>> #include  instead of #include "blah.h" ? It looks like
>> upstream uses  instead so maybe that's something to
>> change in your script to import the upstream code.
>>
>> Even if we fix it, TBB headers use "" instead of <>.  And it might not
>> be a problem for any compiler except GCC, because only we support the
>> silly -I- option that makes this matter.
>>
>>
>> I also noticed that simply including  (even if you don't
>> use anything from it) will cause a link-time dependency on TBB:
>>
>> /usr/bin/ld: /tmp/ccd7w5Oo.o: in function 
>> `tbb::interface7::task_arena::current_thread_index()':
>> /usr/include/tbb/task_arena.h:358: undefined reference to 
>> `tbb::interface7::internal::task_arena_base::internal_current_slot()'
>> collect2: error: ld returned 1 exit status
>>
>> This dependency comes from  which is included by
>>  which is included by .
>>
>> I think that's OK for now, as no existing code using GCC includes the
>>  header (because it doesn't exist yet). It would be good if
>> we can find a way to solve that eventually though. Maybe the eventual
>> solution will just be a non-TBB backend using OpenMP.
>>
>> Exporting something like this from libstdc++.so would work, but is
>> disgusting (and presumably "tbb::interface7" changes from release to
>> release?"):
>>
>> extern "C" [[__gnu__::__weak__]]
>> int
>> _ZN3tbb10interface78internal15task_arena_base21internal_current_slotEv()
>> { return -1; }



Re: PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)

2019-03-21 Thread Martin Sebor

On 3/21/19 4:13 PM, Jakub Jelinek wrote:

On Thu, Mar 21, 2019 at 03:59:55PM -0600, Martin Sebor wrote:

1) either an identifier naming a function or variable, or
2) some other expression like a member reference via . or ->,
an array subscript, or the indirection expression *.

But GCC distinguishes three kinds of arguments:

1) a DECL,
2) some sort of a reference like ARRAY_REF, COMPONENT_REF or
INDIRECT_REF
3) an expression that satisfies the EXPR_P() predicate (e.g.,
(struct S*)0, or (struct S){ 1 })

Jeff, you seem to want the built-in to accept just (1) on the GCC
list above and reject (3) (and seem to be waffling on (2)).

How would such an argument be described in a way that users
unfamiliar with GCC internals could easily understand?


Say that the argument is either a type or an expression that is
either an identifier (for C++ id-expression) to cover 1) and
a postfix expression with . or -> operator (to cover COMPONENT_REF)?


That doesn't look easy to understand.


We do not want to allow INDIRECT_REF, ARRAY_REF, etc.


Why not?  What exactly is the concern?

Martin


Re: PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 04:19:37PM -0600, Martin Sebor wrote:
> > Say that the argument is either a type or an expression that is
> > either an identifier (for C++ id-expression) to cover 1) and
> > a postfix expression with . or -> operator (to cover COMPONENT_REF)?
> 
> That doesn't look easy to understand.

Why?  Those users that don't read corresponding language standards
will just see the compiler diagnosing any uses but those 3 kinds
and can then just read the documentation which will show in examples what is
accepted and what it will do.

> > We do not want to allow INDIRECT_REF, ARRAY_REF, etc.
> 
> Why not?  What exactly is the concern?

Because only the . and -> operators are needed to get at the non-static
member declaration.  INDIRECT_REF or ARRAY_REF aren't useful for that,
and then it raises the question what exactly is supposed to be the behavior
when you use *expr or expr[0] or expr->field[0].  Those expression don't
have any decl, so we'd be back at your suggestion to pretty randomly
sometimes return DECL_ATTRIBUTES, sometimes TYPE_ATTRIBUTES, sometimes both.
That is just a bad design.  If people want type attributes, they should use
__typeof or decltype or just type itself in the argument.

Jakub


Re: [PATCH] Integration of parallel standard algorithms for c++17

2019-03-21 Thread Jonathan Wakely

On 21/03/19 15:19 -0700, Thomas Rodgers wrote:


Fixed up change log.


Thanks, this version seems to have addressed everything.

As this was one of our big ticket features for stage 1 (but was
delayed due to the necessary legal steps that Intel took to kindly
transfer this to the LLVM project) we still want to go ahead and
commit this, even though it's late in stage 4.

There's some risk in adding this much code this late, but the
additions to existing headers are all guarded with
#if __cplusplus > 201402L
so it only affects C++17 mode (not the C++14 defaults) and will be a
no-op for most users.

So I'm going to approve this for trunk (and Thomas and I are committed
to dealing with any problems that might show up as soon as they
happen).




Re: [C++ PATCH] Fix thread_local initialization (PR c++/60702)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 05:03:11PM -0400, Jason Merrill wrote:
> > --- gcc/cp/semantics.c.jj   2019-03-14 09:14:16.718012031 +0100
> > +++ gcc/cp/semantics.c  2019-03-15 16:53:14.270384477 +0100
> > @@ -2135,6 +2135,17 @@ finish_qualified_id_expr (tree qualifyin
> > expr = build_qualified_name (TREE_TYPE (expr),
> >  qualifying_class, expr,
> >  template_p);
> > +  else if (VAR_P (expr)
> > +  && !processing_template_decl
> > +  && !cp_unevaluated_operand
> > +  && (TREE_STATIC (expr) || DECL_EXTERNAL (expr))
> > +  && CP_DECL_THREAD_LOCAL_P (expr))
> > +   {
> > + if (tree wrap = get_tls_wrapper_fn (expr))
> > +   /* Replace an evaluated use of the thread_local variable with
> > +  a call to its wrapper.  */
> > +   expr = build_cxx_call (wrap, 0, NULL, tf_warning_or_error);
> > +   }
> 
> Let's factor this pattern out into another function.  And we don't need to
> check TREE_STATIC || DECL_EXTERNAL anymore.

So, do you want say:
tree
maybe_get_tls_wrapper_fn (tree expr)
{
  if (VAR_P (expr)
  && !processing_template_decl
  && !cp_unevaluated_operand
  && CP_DECL_THREAD_LOCAL_P (expr))
return get_tls_wrapper_fn (expr);
  return NULL;
}
and use it like:
  else if (tree wrap = maybe_get_tls_wrapper_fn (expr))
/* Replace an evaluated use of the thread_local variable with
   a call to its wrapper.  */
expr = build_cxx_call (wrap, 0, NULL, tf_warning_or_error);
  else
...
or even move the build_cxx_call into that function (then I don't know how to
call it, but what's worse, many of the spots where it is used want to bypass
further processing if we've replaced the TLS var with the call, so if the
build_cxx_call was in there, we'd need a function that returns two values,
so perhaps returning a bool and take the expr as reference, but I believe
current coding guidelines are against doing something like that.

For the TREE_STATIC || DECL_EXTERNAL, I assume it is safe to drop it in the
current two spots (pt.c and semantics.c) too, I've looked where we set
CP_DECL_THREAD_LOCAL_P flags and seems it should always have one of those
other flags set at that point too.

Jakub


Re: PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)

2019-03-21 Thread Martin Sebor

On 3/21/19 4:25 PM, Jakub Jelinek wrote:

On Thu, Mar 21, 2019 at 04:19:37PM -0600, Martin Sebor wrote:

Say that the argument is either a type or an expression that is
either an identifier (for C++ id-expression) to cover 1) and
a postfix expression with . or -> operator (to cover COMPONENT_REF)?


That doesn't look easy to understand.


Why?  Those users that don't read corresponding language standards
will just see the compiler diagnosing any uses but those 3 kinds
and can then just read the documentation which will show in examples what is
accepted and what it will do.


Because to most users it suggests that . and -> are the only operators
that are accepted in the expression and so something like p[0].a is not
a valid argument.

Those who don't read the manual will be puzzled when, after having had
p[0].a accepted, p->a[0] will trigger an error.


We do not want to allow INDIRECT_REF, ARRAY_REF, etc.


Why not?  What exactly is the concern?


Because only the . and -> operators are needed to get at the non-static
member declaration.  INDIRECT_REF or ARRAY_REF aren't useful for that,
and then it raises the question what exactly is supposed to be the behavior
when you use *expr or expr[0] or expr->field[0].  Those expression don't
have any decl, so we'd be back at your suggestion to pretty randomly
sometimes return DECL_ATTRIBUTES, sometimes TYPE_ATTRIBUTES, sometimes both.
That is just a bad design.  If people want type attributes, they should use
__typeof or decltype or just type itself in the argument.


I never suggested to "pretty randomly sometimes return
DECL_ATTRIBUTES, sometimes TYPE_ATTRIBUTES, or sometimes both."
That's a silly mischaracterization, just as it would be to say
the same thing about __alignof__, and it works much the same way.
It operates on expressions and returns the alignment either
explicitly speciified for the variable (or function) or that
specified for its type.

I modeled the built-in on __alignof__.  There may bugs where it
behaves differently, or subtle deviations where I chose a different
approach, but it's certainly not random.  I'd like to fix the bugs,
and I'm open to reconsidering the deviations, but using either as
a justification for rejecting those kinds of expressions would
result in hard-to-use API.  And _that_ would be bad design.

Martin


Re: [C++ PATCH] Add -fconstexpr-ops-limit= option (PR c++/87481)

2019-03-21 Thread Jason Merrill

On 3/15/19 4:07 PM, Jakub Jelinek wrote:

+/* Number of cxx_eval_constant_expression calls (except skipped ones,
+   on simple constants or location wrappers) encountered during current
+   cxx_eval_outermost_constant_expr call.  */
+static HOST_WIDE_INT constexpr_ops_count;


Hmm, a global static variable is non-reentrant.  This may not be an 
issue in practice because of instantiate_constexpr_fns, but still seems 
questionable.


Jason


Re: [C++ PATCH] Fix thread_local initialization (PR c++/60702)

2019-03-21 Thread Jason Merrill

On 3/21/19 6:55 PM, Jakub Jelinek wrote:

On Thu, Mar 21, 2019 at 05:03:11PM -0400, Jason Merrill wrote:

--- gcc/cp/semantics.c.jj   2019-03-14 09:14:16.718012031 +0100
+++ gcc/cp/semantics.c  2019-03-15 16:53:14.270384477 +0100
@@ -2135,6 +2135,17 @@ finish_qualified_id_expr (tree qualifyin
expr = build_qualified_name (TREE_TYPE (expr),
 qualifying_class, expr,
 template_p);
+  else if (VAR_P (expr)
+  && !processing_template_decl
+  && !cp_unevaluated_operand
+  && (TREE_STATIC (expr) || DECL_EXTERNAL (expr))
+  && CP_DECL_THREAD_LOCAL_P (expr))
+   {
+ if (tree wrap = get_tls_wrapper_fn (expr))
+   /* Replace an evaluated use of the thread_local variable with
+  a call to its wrapper.  */
+   expr = build_cxx_call (wrap, 0, NULL, tf_warning_or_error);
+   }


Let's factor this pattern out into another function.  And we don't need to
check TREE_STATIC || DECL_EXTERNAL anymore.


So, do you want say:
tree
maybe_get_tls_wrapper_fn (tree expr)
{
   if (VAR_P (expr)
   && !processing_template_decl
   && !cp_unevaluated_operand
   && CP_DECL_THREAD_LOCAL_P (expr))
 return get_tls_wrapper_fn (expr);
   return NULL;
}
and use it like:
   else if (tree wrap = maybe_get_tls_wrapper_fn (expr))
 /* Replace an evaluated use of the thread_local variable with
a call to its wrapper.  */
 expr = build_cxx_call (wrap, 0, NULL, tf_warning_or_error);
   else
 ...
or even move the build_cxx_call into that function (then I don't know how to
call it, but what's worse, many of the spots where it is used want to bypass
further processing if we've replaced the TLS var with the call, so if the
build_cxx_call was in there, we'd need a function that returns two values,
so perhaps returning a bool and take the expr as reference, but I believe
current coding guidelines are against doing something like that.


How about

else if (tree wrap = maybe_call_tls_wrapper_fn (expr))
  expr = wrap;

?


For the TREE_STATIC || DECL_EXTERNAL, I assume it is safe to drop it in the
current two spots (pt.c and semantics.c) too


Yes.  That check was from back before the addition of 
CP_DECL_THREAD_LOCAL_P, which doesn't need this guard.  Nor does 
DECL_THREAD_LOCAL_P any more, it seems.


Jason


Re: C++ PATCH for c++/89705 - ICE with reference binding with conversion function

2019-03-21 Thread Jason Merrill

On 3/21/19 4:55 PM, Marek Polacek wrote:

On Thu, Mar 21, 2019 at 04:13:29PM -0400, Jason Merrill wrote:

On 3/16/19 4:53 PM, Marek Polacek wrote:

Here we have code like

struct X { operator const int(); };
int&& rri = X();

which I think is invalid, because [dcl.init.ref] says that if types T1 and T2
are reference-related, no qualifiers can be dropped, and if the reference is an
rvalue reference, the initializer expression can't be an lvalue.  And here the
result of the conversion is "const int", so the "const" would be dropped.  A
similar ill-formed test from the standard is

struct X { operator int&(); };
int&& rri = X();

where the result of the conversion is an lvalue of related type.  All the
compilers I've tried actually accept the first test, but I think that's wrong.


I don't think it is.  g++ and clang++ reject the first test if you change
int to a class type, but prvalues of scalar type have no cv-qualifiers, so
the result of the conversion is a prvalue of type int, which is a perfectly
good initializer for int&&.

This is OK for the same reason:

   int&& r = (const int)42;


Oop, this is embarassing, sorry.  So I guess we're not handling the (5.3.2)
case in [dcl.init.ref] properly: "If the converted initializer is a prvalue,
its type T4 is adjusted to type “cv1 T4” ([conv.qual]) and the temporary
materialization conversion ([conv.rval]) is applied.


Sure, though I would think it's an issue of the bullet just above:

* has a class type (i.e., T2 is a class type), where T1 is not 
reference-related to T2, and can be converted to an rvalue or function 
lvalue of type “cv3 T3”, where “cv1 T1” is reference-compatible with 
“cv3 T3” (see 11.3.1.6),


where X can be converted to an rvalue of type int, which is 
reference-compatible with int.


Jason


Re: [C++ PATCH] Add -fconstexpr-ops-limit= option (PR c++/87481)

2019-03-21 Thread Jakub Jelinek
On Thu, Mar 21, 2019 at 07:27:03PM -0400, Jason Merrill wrote:
> On 3/15/19 4:07 PM, Jakub Jelinek wrote:
> > +/* Number of cxx_eval_constant_expression calls (except skipped ones,
> > +   on simple constants or location wrappers) encountered during current
> > +   cxx_eval_outermost_constant_expr call.  */
> > +static HOST_WIDE_INT constexpr_ops_count;
> 
> Hmm, a global static variable is non-reentrant.  This may not be an issue in
> practice because of instantiate_constexpr_fns, but still seems questionable.

One option would be to add HOST_WIDE_INT *constexpr_ops_count; into
constexpr_ctx structure (pointer, not the counter itself, because we
  constexpr_ctx new_ctx = *ctx;
  ... (&new_ctx, ...)
).

Jakub


Re: [PATCH] Integration of parallel standard algorithms for c++17

2019-03-21 Thread Jonathan Wakely

On 21/03/19 22:32 +, Jonathan Wakely wrote:

On 21/03/19 15:19 -0700, Thomas Rodgers wrote:


Fixed up change log.


Thanks, this version seems to have addressed everything.

As this was one of our big ticket features for stage 1 (but was
delayed due to the necessary legal steps that Intel took to kindly
transfer this to the LLVM project) we still want to go ahead and
commit this, even though it's late in stage 4.

There's some risk in adding this much code this late, but the
additions to existing headers are all guarded with
#if __cplusplus > 201402L
so it only affects C++17 mode (not the C++14 defaults) and will be a
no-op for most users.

So I'm going to approve this for trunk (and Thomas and I are committed
to dealing with any problems that might show up as soon as they
happen).



This has now been committed to trunk (by me, because of some git-svn
weirdness at Thomas's end and at mine, so despite our best efforts it
ended up using my login credentials).

Thanks Thomas!

And thanks Alexey and Mikhail and everyone else at Intel for making
this code available in the first place. I'm very pleased to have this
in libstdc++ in time for the next GCC release.






Re: [PATCH] Integration of parallel standard algorithms for c++17

2019-03-21 Thread Jonathan Wakely

On 21/03/19 23:56 +, Jonathan Wakely wrote:

On 21/03/19 22:32 +, Jonathan Wakely wrote:

On 21/03/19 15:19 -0700, Thomas Rodgers wrote:


Fixed up change log.


Thanks, this version seems to have addressed everything.

As this was one of our big ticket features for stage 1 (but was
delayed due to the necessary legal steps that Intel took to kindly
transfer this to the LLVM project) we still want to go ahead and
commit this, even though it's late in stage 4.

There's some risk in adding this much code this late, but the
additions to existing headers are all guarded with
#if __cplusplus > 201402L
so it only affects C++17 mode (not the C++14 defaults) and will be a
no-op for most users.

So I'm going to approve this for trunk (and Thomas and I are committed
to dealing with any problems that might show up as soon as they
happen).



This has now been committed to trunk (by me, because of some git-svn
weirdness at Thomas's end and at mine, so despite our best efforts it
ended up using my login credentials).

Thanks Thomas!

And thanks Alexey and Mikhail and everyone else at Intel for making
this code available in the first place. I'm very pleased to have this
in libstdc++ in time for the next GCC release.


I keep forgetting to add that docs for this stuff will be coming some
time next week, describing the TBB dependency that's needed to use
these parallel algos.





Re: [PATCH] rs6000: Fix typo in mmintrin.h

2019-03-21 Thread Segher Boessenkool
On Thu, Mar 21, 2019 at 04:15:09PM -0500, Bill Schmidt wrote:
> It was recently pointed out that there's a pasto in mmintrin.h for the
> _mm_sub_pi32 function, so that it performs an addition rather than a
> subtraction.  This won't do.
> 
> This patch corrects the problem, and adds a test case to verify it.
> Installed and tested on powerpc64le-unknown-linux-gnu with no regressions.
> Is this ok for trunk, and backport to GCC 8?

Yes, okay for all.  Thanks!


Segher


> 2019-03-21  Bill Schmidt  
> 
>   * config/rs6000/mmintrin.h (_mm_sub_pi32): Fix typo.
> 
> [gcc/testsuite]
> 
> 2019-03-21  Bill Schmidt  
> 
>   * gcc.target/powerpc/mmx-psubd-2.c: Test _m_psubd.


Re: PING [PATCH] fix ICE in __builtin_has_attribute (PR 88383 and 89288)

2019-03-21 Thread Martin Sebor

On 3/21/19 5:05 PM, Martin Sebor wrote:

On 3/21/19 4:25 PM, Jakub Jelinek wrote:

On Thu, Mar 21, 2019 at 04:19:37PM -0600, Martin Sebor wrote:

Say that the argument is either a type or an expression that is
either an identifier (for C++ id-expression) to cover 1) and
a postfix expression with . or -> operator (to cover COMPONENT_REF)?


That doesn't look easy to understand.


Why?  Those users that don't read corresponding language standards
will just see the compiler diagnosing any uses but those 3 kinds
and can then just read the documentation which will show in examples 
what is

accepted and what it will do.


Because to most users it suggests that . and -> are the only operators
that are accepted in the expression and so something like p[0].a is not
a valid argument.

Those who don't read the manual will be puzzled when, after having had
p[0].a accepted, p->a[0] will trigger an error.


We do not want to allow INDIRECT_REF, ARRAY_REF, etc.


Why not?  What exactly is the concern?


Because only the . and -> operators are needed to get at the non-static
member declaration.  INDIRECT_REF or ARRAY_REF aren't useful for that,
and then it raises the question what exactly is supposed to be the 
behavior

when you use *expr or expr[0] or expr->field[0].  Those expression don't
have any decl, so we'd be back at your suggestion to pretty randomly
sometimes return DECL_ATTRIBUTES, sometimes TYPE_ATTRIBUTES, sometimes 
both.
That is just a bad design.  If people want type attributes, they 
should use

__typeof or decltype or just type itself in the argument.


I never suggested to "pretty randomly sometimes return
DECL_ATTRIBUTES, sometimes TYPE_ATTRIBUTES, or sometimes both."
That's a silly mischaracterization, just as it would be to say
the same thing about __alignof__, and it works much the same way.
It operates on expressions and returns the alignment either
explicitly speciified for the variable (or function) or that
specified for its type.

I modeled the built-in on __alignof__.  There may bugs where it
behaves differently, or subtle deviations where I chose a different
approach, but it's certainly not random.  I'd like to fix the bugs,
and I'm open to reconsidering the deviations, but using either as
a justification for rejecting those kinds of expressions would
result in hard-to-use API.  And _that_ would be bad design.


An example of a deviation from __alignof__ that comes to mind
is this:

  __attribute__ ((aligned (8))) int a[2];

  _Static_assert (__alignof__ (a) == 8);
  _Static_assert (__builtin_has_attribute (a, aligned));

  _Static_assert (a[0]) == 8);   // fails
  _Static_assert (__builtin_has_attribute (a[0], aligned));

I made the ARRAY_REF case the same as the VAR_DECL case because
the alignment of a whole array also implies the alignment of
the array's elements.  Ditto for attribute packed. (And with
the patch the INDIRECT_REF has the same effect.)

I'm guessing that what you're objecting to is that this decision
conflates the results of the builtin for an array declared with
an attribute with that of the array's elements, as in:

  __attribute__ ((vector_size (8))) int a[2];

  _Static_assert (__builtin_has_attribute (a, vector_size));
  _Static_assert (__builtin_has_attribute (a[0], vector_size));

where both assertions pass, even though just the array element
is a vector, while a itself is an ordinary array.  I can see
that is not quite right and might warrant reconsidering
the decision for ARRAY_REF.

To your point about using __typeof__: GCC is inconsistent about
what it applies variable and function attributes to, whether
the decl or its type.  For example, vector_size is documented
as a variable attribute but it applies to a type.  Similarly,
attributes alloc_size and malloc are both documented as
function attributes but the former applies to the function
type (and so can be applied to a pointer to function), while
the latter to the function decl (and is ignored with
a warning on pointers to functions).  Until this is cleaned
up, either by correcting the documentation, or preferably by
making these attributes uniformly apply to types, it should
be obvious that suggesting that "if people want type attributes,
they should use __typeof or decltype" does not offer a viable
solution.

Martin


Re: [PATCH] libgomp: Add master thread to thread pool

2019-03-21 Thread Kevin Buettner
Ping.

On Fri, 22 Feb 2019 18:11:44 -0700
Kevin Buettner  wrote:

> For debugging purposes, I need to be able to find the master thread
> in the thread pool.
> 
> Without this patch, I see over 20 failures in the tests that I've
> written for GDB.
> 
> I've also tested this in the gcc tree - no regressions.
> 
> libgomp/ChangeLog:
> 
>   * team.c (gomp_team_start): Initialize pool->threads[0].
> ---
>  libgomp/team.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/libgomp/team.c b/libgomp/team.c
> index 2b2e9750da5..e331c9b72c0 100644
> --- a/libgomp/team.c
> +++ b/libgomp/team.c
> @@ -477,11 +477,17 @@ gomp_team_start (void (*fn) (void *), void *data, 
> unsigned nthreads,
>make no effort to expand gomp_threads_size geometrically.  */
>if (nthreads >= pool->threads_size)
>   {
> +   bool need_init = (pool->threads == NULL);
> pool->threads_size = nthreads + 1;
> pool->threads
>   = gomp_realloc (pool->threads,
>   pool->threads_size
>   * sizeof (struct gomp_thread *));
> +   if (need_init)
> + {
> +   /* Add current (master) thread to threads[].  */
> +   pool->threads[0] = thr;
> + }
>   }
>  
>/* Release existing idle threads.  */
> 


Re: [PATCH PR81740]Enforce dependence check for outer loop vectorization

2019-03-21 Thread Bin.Cheng
On Thu, Mar 21, 2019 at 8:24 PM Richard Biener
 wrote:
>
> On Mon, Dec 18, 2017 at 1:37 PM Richard Biener
>  wrote:
> >
> > On Fri, Dec 15, 2017 at 7:39 PM, Bin.Cheng  wrote:
> > > On Fri, Dec 15, 2017 at 1:19 PM, Richard Biener
> > >  wrote:
> > >> On Fri, Dec 15, 2017 at 1:35 PM, Bin.Cheng  wrote:
> > >>> On Fri, Dec 15, 2017 at 12:09 PM, Bin.Cheng  
> > >>> wrote:
> >  On Fri, Dec 15, 2017 at 11:55 AM, Richard Biener
> >   wrote:
> > > On Fri, Dec 15, 2017 at 12:30 PM, Bin Cheng  wrote:
> > >> Hi,
> > >> As explained in the PR, given below test case:
> > >> int a[8][10] = { [2][5] = 4 }, c;
> > >>
> > >> int
> > >> main ()
> > >> {
> > >>   short b;
> > >>   int i, d;
> > >>   for (b = 4; b >= 0; b--)
> > >> for (c = 0; c <= 6; c++)
> > >>   a[c + 1][b + 2] = a[c][b + 1];
> > >>   for (i = 0; i < 8; i++)
> > >> for (d = 0; d < 10; d++)
> > >>   if (a[i][d] != (i == 3 && d == 6) * 4)
> > >> __builtin_abort ();
> > >>   return 0;
> > >>
> > >> the loop nest is illegal for vectorization without reversing inner 
> > >> loop.  The issue
> > >> is in data dependence checking of vectorizer, I believe the 
> > >> mentioned revision just
> > >> exposed this.  Previously the vectorization is skipped because of 
> > >> unsupported memory
> > >> operation.  The outer loop vectorization unrolls the outer loop into:
> > >>
> > >>   for (b = 4; b > 0; b -= 4)
> > >>   {
> > >> for (c = 0; c <= 6; c++)
> > >>   a[c + 1][6] = a[c][5];
> > >> for (c = 0; c <= 6; c++)
> > >>   a[c + 1][5] = a[c][4];
> > >> for (c = 0; c <= 6; c++)
> > >>   a[c + 1][4] = a[c][3];
> > >> for (c = 0; c <= 6; c++)
> > >>   a[c + 1][3] = a[c][2];
> > >>   }
> > >> Then four inner loops are fused into:
> > >>   for (b = 4; b > 0; b -= 4)
> > >>   {
> > >> for (c = 0; c <= 6; c++)
> > >> {
> > >>   a[c + 1][6] = a[c][5];  // S1
> > >>   a[c + 1][5] = a[c][4];  // S2
> > >>   a[c + 1][4] = a[c][3];
> > >>   a[c + 1][3] = a[c][2];
> > >> }
> > >>   }
> > >
> > > Note that they are not really "fused" but they are interleaved.  With
> > > GIMPLE in mind
> > > that makes a difference, you should get the equivalent of
> > >
> > >for (c = 0; c <= 6; c++)
> > >  {
> > >tem1 = a[c][5];
> > >tem2 = a[c][4];
> > >tem3 = a[c][3];
> > >tem4 = a[c][2];
> > >a[c+1][6] = tem1;
> > >a[c +1][5] = tem2;
> > > a[c+1][4] = tem3;
> > > a[c+1][3] = tem4;
> > >  }
> >  Yeah, I will double check if this abstract breaks the patch and how.
> > >>> Hmm, I think this doesn't break it, well at least for part of the
> > >>> analysis, because it is loop carried (backward) dependence goes wrong,
> > >>> interleaving or not with the same iteration doesn't matter here.
> > >>
> > >> I think the idea is that forward dependences are always fine (negative 
> > >> distance)
> > >> to vectorize.  But with backward dependences we have to adhere to max_vf.
> > >>
> > >> It looks like for outer loop vectorization we only look at the distances 
> > >> in the
> > >> outer loop but never at inner ones.  But here the same applies but isn't 
> > >> that
> > >> independend on the distances with respect to the outer loop?
> > >>
> > >> But maybe I'm misunderstanding how "distances" work here.
> > > Hmm, I am not sure I understand "distance" correctly.  With
> > > description as in book like "Optimizing compilers for Modern
> > > Architectures", distance is "# of iteration of sink ref - # of
> > > iteration of source ref".  Given below example:
> > >   for (i = 0; i < N; ++i)
> > > {
> > >   x = arr[idx_1];  // S1
> > >   arr[idx_2] = x;  // S2
> > > }
> > > if S1 is source ref, distance = idx_2 - idx_1, and distance > 0.  Also
> > > this is forward dependence.  For example, idx_1 is i + 1 and idx_2 is
> > > i;
> > > If S2 is source ref, distance = idx_1 - idx_2, and distance < 0.  Also
> > > this is backward dependence.  For example idx_1 is i and idx_2 is i +
> > > 1;
> > >
> > > In GCC, we always try to subtract idx_2 from idx_1 first in computing
> > > classic distance, we could result in negative distance in case of
> > > backward dependence.  When this happens at dependence carried level,
> > > we manually reverse it.  When this happens at inner level loop, we
> > > simply keep the negative distance.  And it's meaningless to talk about
> > > forward/backward given dependence is carried at outer level loop.
> > >
> > > Outer loop vectorization is interesting.  The problematic case has
> > > backward dependence carried by outer loop.  Because we don't check
> > > dist vs. max_vf for such dep, the unrolled references could have outer
> > > loop index eq