[PATCH, testsuite]: Fix gcc.dg/unroll-7.c FAIL for x32 target

2017-04-07 Thread Uros Bizjak
Hello!

Attached patch mitigates:

FAIL: gcc.dg/unroll-7.c scan-rtl-dump loop2_unroll "number of
iterations: .const_int 99"

testsuite failure for x32 target. Loop analysis determines:

Loop 1 is simple:
  simple exit 3 -> 4
  infinite if: (expr_list:REG_DEP_TRUE (ne:SI (and:SI (minus:SI
(reg:SI 106 [ _13 ])
(reg:SI 105 [ ivtmp.10 ]))
(const_int 3 [0x3]))
(const_int 0 [0]))
(nil))
  number of iterations: (lshiftrt:SI (plus:SI (minus:SI (reg:SI 106 [ _13 ])
(reg:SI 105 [ ivtmp.10 ]))
(const_int -4 [0xfffc]))
(const_int 2 [0x2]))
  upper bound: 99
  likely upper bound: 99
  realistic bound: 99

but it doesn't figure out that with (reg 105) and (reg 106), defined as:

(insn 22 19 23 2 (set (reg:SI 105 [ ivtmp.10 ])
(subreg/s/v:SI (reg/v/f:DI 110 [ a ]) 0)) 82 {*movsi_internal}
 (nil))
(insn 23 22 28 2 (parallel [
(set (reg:SI 106 [ _13 ])
(plus:SI (subreg/s/v:SI (reg/v/f:DI 110 [ a ]) 0)
(const_int 400 [0x3d0900])))
(clobber (reg:CC 17 flags))
]) 217 {*addsi_1}

(minus:SI (reg:SI 106) ( reg:SI 105)) evaluates to (const_int
400). Probably, because subregs are involved.

Short of enhancing loop analysis to properly handle subregs, I propose
following testsuite patch that avoids particularities of argument
passing.

2017-04-07  Uros Bizjak  

* gcc.dg/unroll-7.c: Declare "a" as pointer to external array.

Patch was tested on x86_64 {,-m32,-mx32}.

OK for mainline?

Uros.
diff --git a/gcc/testsuite/gcc.dg/unroll-7.c b/gcc/testsuite/gcc.dg/unroll-7.c
index 70b92ba..e76d4fa 100644
--- a/gcc/testsuite/gcc.dg/unroll-7.c
+++ b/gcc/testsuite/gcc.dg/unroll-7.c
@@ -2,7 +2,9 @@
 /* { dg-options "-O2 -fdump-rtl-loop2_unroll -funroll-loops" } */
 /* { dg-require-effective-target int32plus } */
 
-int t(int *a)
+extern int *a;
+
+int t(void)
 {
   int i;
   for (i=0;i<100;i++)


Re: [PATCH] PR80101: Fix ICE in store_data_bypass_p

2017-04-07 Thread Segher Boessenkool
On Fri, Apr 07, 2017 at 08:54:01AM +0200, Eric Botcazou wrote:
> > The only straightforward way I see is to use a rs6000_store_data_bypass_p
> > instead, which would be doing the same thing.  :-(
> 
> Why not just change the type of the blockage instruction as you suggested?

That works for this case, sure, but it won't solve the underlying
problem.

The default instruction type for rs6000 is "integer".  Many instructions
have this type: some bswap, register moves, load address/immediate,
mtvrsave, nop, and many things that are split; and that is just those
that have type "*" (so are easy to search for), not all those that do
no explicitly have a type attribute (including blockage).

Now, the power6 scheduler defines a bypass from power6-integer (which is
types "integer" as well as "add" and "logical") to stores.  And then
store_data_bypass_p ICEs because not all "integer" insns are a SET or
a PARALLEL of SETs.

So, changing the type involves changing the default to something else,
changing all pipeline descriptions to do the same with that type as it
does with "integer", checking all patterns that are type "integer" to
see if they should be that new type or not.  Or we could just change
"blockage" and wait for the next bug report.

Alternatively, we can arrange for the bypass functions to not ICE.  We
can do that specific to these rs6000 pipeline descriptions, by having
our own version of store_data_bypass_p; or we can make that function
work for all insns (its definition works fine for insn pairs where
not both the producer and consumer are SETs).  That's what Kelvin's
patch does.  What is the value in ICEing here?


Segher


[PATCH] omp-low: fix lastprivate/linear lowering for SIMT

2017-04-07 Thread Alexander Monakov
Ping.

> I've noticed while re-reading that this patch incorrectly handled SIMT case
> in lower_lastprivate_clauses.  The code was changed to look for variables
> with "omp simt private" attribute, and was left under
> 'simduid && DECL_HAS_VALUE_EXPR_P (new_var)' condition.  This effectively
> constrained processing to privatized (i.e. addressable) variables, as
> non-addressable variables now have neither the value-expr nor the attribute.
> 
> This wasn't caught in testing, as apparently all testcases that have target
> simd loops with linear/lastprivate clauses also have the corresponding 
> variables
> mentioned in target map clause, which makes them addressable (is that 
> necessary?),
> and I didn't think to check something like that manually.
> 
> The following patch fixes it by adjusting the if's in 
> lower_lastprivate_clauses;
> alternatively it may be possible to keep that code as-is, and instead set up
> value-expr and "omp simt private" attributes for all formally private 
> variables,
> not just addressable ones, but to me that sounds less preferable.  OK for 
> trunk?
> 
> I think it's possible to improve test coverage for NVPTX by running all OpenMP
> testcases via nvptx-none-run (it'll need some changes, but shouldn't be hard).
> 
> gcc/
>   * omp-low.c (lower_lastprivate_clauses): Correct handling of linear and
>   lastprivate clauses in SIMT case.
> 
> libgomp/
>   * testsuite/libgomp.c/target-36.c: New testcase.
> 
> Thanks.
> Alexander
>   
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index 253dc85..02b681c 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -4768,11 +4768,10 @@ lower_lastprivate_clauses (tree clauses, tree 
> predicate, gimple_seq *stmt_list,
>   TREE_NO_WARNING (new_var) = 1;
>   }
>  
> -   if (simduid && DECL_HAS_VALUE_EXPR_P (new_var))
> +   if (!maybe_simt && simduid && DECL_HAS_VALUE_EXPR_P (new_var))
>   {
> tree val = DECL_VALUE_EXPR (new_var);
> -   if (!maybe_simt
> -   && TREE_CODE (val) == ARRAY_REF
> +   if (TREE_CODE (val) == ARRAY_REF
> && VAR_P (TREE_OPERAND (val, 0))
> && lookup_attribute ("omp simd array",
>  DECL_ATTRIBUTES (TREE_OPERAND (val,
> @@ -4792,26 +4791,26 @@ lower_lastprivate_clauses (tree clauses, tree 
> predicate, gimple_seq *stmt_list,
>   TREE_OPERAND (val, 0), lastlane,
>   NULL_TREE, NULL_TREE);
>   }
> -   else if (maybe_simt
> -&& VAR_P (val)
> -&& lookup_attribute ("omp simt private",
> - DECL_ATTRIBUTES (val)))
> + }
> +   else if (maybe_simt)
> + {
> +   tree val = (DECL_HAS_VALUE_EXPR_P (new_var)
> +   ? DECL_VALUE_EXPR (new_var)
> +   : new_var);
> +   if (simtlast == NULL)
>   {
> -   if (simtlast == NULL)
> - {
> -   simtlast = create_tmp_var (unsigned_type_node);
> -   gcall *g = gimple_build_call_internal
> - (IFN_GOMP_SIMT_LAST_LANE, 1, simtcond);
> -   gimple_call_set_lhs (g, simtlast);
> -   gimple_seq_add_stmt (stmt_list, g);
> - }
> -   x = build_call_expr_internal_loc
> - (UNKNOWN_LOCATION, IFN_GOMP_SIMT_XCHG_IDX,
> -  TREE_TYPE (val), 2, val, simtlast);
> -   new_var = unshare_expr (new_var);
> -   gimplify_assign (new_var, x, stmt_list);
> -   new_var = unshare_expr (new_var);
> +   simtlast = create_tmp_var (unsigned_type_node);
> +   gcall *g = gimple_build_call_internal
> + (IFN_GOMP_SIMT_LAST_LANE, 1, simtcond);
> +   gimple_call_set_lhs (g, simtlast);
> +   gimple_seq_add_stmt (stmt_list, g);
>   }
> +   x = build_call_expr_internal_loc
> + (UNKNOWN_LOCATION, IFN_GOMP_SIMT_XCHG_IDX,
> +  TREE_TYPE (val), 2, val, simtlast);
> +   new_var = unshare_expr (new_var);
> +   gimplify_assign (new_var, x, stmt_list);
> +   new_var = unshare_expr (new_var);
>   }
>  
> if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
> diff --git a/libgomp/testsuite/libgomp.c/target-36.c 
> b/libgomp/testsuite/libgomp.c/target-36.c
> new file mode 100644
> index 000..6925a2a
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c/target-36.c
> @@ -0,0 +1,18 @@
> +int
> +main ()
> +{
> +  int ah, bh, n = 1024;
> +#pragma omp target map(from: ah, bh)
> +  {
> +int a, b;
> +#pragma omp simd lastprivate(b)
> +for (a = 0; a < n; a++)
> +  {
> + b = a + n + 1;
> + asm volatile ("" : "+r"(b));
> +  }
> +ah = a, bh = b;
> +  }
> +  if (ah != n || bh != 2 *

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-07 Thread Richard Biener
On Fri, 7 Apr 2017, Florian Weimer wrote:

> On 04/06/2017 09:16 PM, Richard Biener wrote:
> > On April 6, 2017 8:12:29 PM GMT+02:00, Bernd Edlinger
> >  wrote:
> > > But isn't the effective type changed by the assignment b[1] = 0;
> > > as described in 6.5(6):
> > > "If a value is stored into an object having no declared type through an
> > > lvalue having a type that is not a character type, then the type of the
> > > lvalue becomes the effective type of the object for that access and for
> > > subsequent accesses that do not modify the stored value."
> > 
> > Yes.  I think the example is valid.  At least GCCs memory model makes it so.
> 
> As far as I understand the standard, C does not permit changing the effective
> type of an object if it has a declared type (at least not without a union).
> If GCC supports it, that's an undocumented GCC extension.

The GCC middle-end supports it because C++ supports it and there is no
way for the C FE to tell the middle-end that this is not valid.

Richard.


Re: [PATCH] Fix PR80334

2017-04-07 Thread Rainer Orth
Hi Richard,

> On Thu, 6 Apr 2017, Rainer Orth wrote:
>
>> Hi Richard,
>> 
>> > The following patch makes sure to preserve (mis-)alignment of memory
>> > references when IVOPTs generates TARGET_MEM_REFs for them.
>> >
>> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>> >
>> > Richard.
>> >
>> > 2017-04-06  Richard Biener  
>> >
>> >PR tree-optimization/80334
>> >* tree-ssa-loop-ivopts.c (rewrite_use_address): Properly
>> >preserve alignment of accesses.
>> >
>> >* g++.dg/torture/pr80334.C: New testcase.
>> 
>> the new testcase FAILs on 32-bit Solaris/SPARC:
>> 
>> +FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)
>> +FAIL: g++.dg/torture/pr80334.C   -O1  (test for excess errors)
>> +FAIL: g++.dg/torture/pr80334.C   -O2  (test for excess errors)
>> +FAIL: g++.dg/torture/pr80334.C   -O2 -flto  (test for excess errors)
>> +FAIL: g++.dg/torture/pr80334.C -O2 -flto -flto-partition=none (test for
>> exce
>> ss errors)
>> +FAIL: g++.dg/torture/pr80334.C -O3 -fomit-frame-pointer -funroll-loops
>> -fpeel
>> -loops -ftracer -finline-functions  (test for excess errors)
>> +FAIL: g++.dg/torture/pr80334.C   -O3 -g  (test for excess errors)
>> +FAIL: g++.dg/torture/pr80334.C   -Os  (test for excess errors)
>> 
>> Excess errors:
>> /vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/torture/pr80334.C:11:20:
>> warning: requested alignment 16 is larger than 8 [-Wattributes]
>
> Any suggestion how to mitigate that?  Possible solution includes
> adding { target { ! ... } } to dg-do run.

No idea.  However, according to gcc-testresults there are other
failures: s390-ibm-inux-gnu and s390x-ibm-linux-gnu so far.

This might argue against just excluding a random list of failing targets.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix PR80334

2017-04-07 Thread Richard Biener
On Fri, 7 Apr 2017, Rainer Orth wrote:

> Hi Richard,
> 
> > On Thu, 6 Apr 2017, Rainer Orth wrote:
> >
> >> Hi Richard,
> >> 
> >> > The following patch makes sure to preserve (mis-)alignment of memory
> >> > references when IVOPTs generates TARGET_MEM_REFs for them.
> >> >
> >> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> >> >
> >> > Richard.
> >> >
> >> > 2017-04-06  Richard Biener  
> >> >
> >> >  PR tree-optimization/80334
> >> >  * tree-ssa-loop-ivopts.c (rewrite_use_address): Properly
> >> >  preserve alignment of accesses.
> >> >
> >> >  * g++.dg/torture/pr80334.C: New testcase.
> >> 
> >> the new testcase FAILs on 32-bit Solaris/SPARC:
> >> 
> >> +FAIL: g++.dg/torture/pr80334.C   -O0  (test for excess errors)
> >> +FAIL: g++.dg/torture/pr80334.C   -O1  (test for excess errors)
> >> +FAIL: g++.dg/torture/pr80334.C   -O2  (test for excess errors)
> >> +FAIL: g++.dg/torture/pr80334.C   -O2 -flto  (test for excess errors)
> >> +FAIL: g++.dg/torture/pr80334.C -O2 -flto -flto-partition=none (test for
> >> exce
> >> ss errors)
> >> +FAIL: g++.dg/torture/pr80334.C -O3 -fomit-frame-pointer -funroll-loops
> >> -fpeel
> >> -loops -ftracer -finline-functions  (test for excess errors)
> >> +FAIL: g++.dg/torture/pr80334.C   -O3 -g  (test for excess errors)
> >> +FAIL: g++.dg/torture/pr80334.C   -Os  (test for excess errors)
> >> 
> >> Excess errors:
> >> /vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/torture/pr80334.C:11:20:
> >> warning: requested alignment 16 is larger than 8 [-Wattributes]
> >
> > Any suggestion how to mitigate that?  Possible solution includes
> > adding { target { ! ... } } to dg-do run.
> 
> No idea.  However, according to gcc-testresults there are other
> failures: s390-ibm-inux-gnu and s390x-ibm-linux-gnu so far.
> 
> This might argue against just excluding a random list of failing targets.

Hmm.  Does using __BIGGEST_ALIGNMENT__, thus

int
main()
{
  alignas(__BIGGEST_ALIGNMENT__) B b[3];
...

work for you?

Richard.


Re: [PATCH] PR80101: Fix ICE in store_data_bypass_p

2017-04-07 Thread Eric Botcazou
> Or we could just change "blockage" and wait for the next bug report.

That's my suggestion, yes.

> Alternatively, we can arrange for the bypass functions to not ICE.  We
> can do that specific to these rs6000 pipeline descriptions, by having
> our own version of store_data_bypass_p; or we can make that function
> work for all insns (its definition works fine for insn pairs where
> not both the producer and consumer are SETs).  That's what Kelvin's
> patch does.  What is the value in ICEing here?

Telling the back-end writer that something may be wrong somewhere instead of 
silently accepting nonsense?  How long have all the assertions been there?

-- 
Eric Botcazou


Re: [PATCH] Fix PR80334

2017-04-07 Thread Rainer Orth
Hi Richard,

>> > Any suggestion how to mitigate that?  Possible solution includes
>> > adding { target { ! ... } } to dg-do run.
>> 
>> No idea.  However, according to gcc-testresults there are other
>> failures: s390-ibm-inux-gnu and s390x-ibm-linux-gnu so far.
>> 
>> This might argue against just excluding a random list of failing targets.
>
> Hmm.  Does using __BIGGEST_ALIGNMENT__, thus
>
> int
> main()
> {
>   alignas(__BIGGEST_ALIGNMENT__) B b[3];
> ...
>
> work for you?

it does: the test now PASSes on sparc-sun-solaris2.12 and continues to
do so in i386-pc-solaris2.12, both 32 and 64-bit.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix PR80334

2017-04-07 Thread Richard Biener
On Fri, 7 Apr 2017, Rainer Orth wrote:

> Hi Richard,
> 
> >> > Any suggestion how to mitigate that?  Possible solution includes
> >> > adding { target { ! ... } } to dg-do run.
> >> 
> >> No idea.  However, according to gcc-testresults there are other
> >> failures: s390-ibm-inux-gnu and s390x-ibm-linux-gnu so far.
> >> 
> >> This might argue against just excluding a random list of failing targets.
> >
> > Hmm.  Does using __BIGGEST_ALIGNMENT__, thus
> >
> > int
> > main()
> > {
> >   alignas(__BIGGEST_ALIGNMENT__) B b[3];
> > ...
> >
> > work for you?
> 
> it does: the test now PASSes on sparc-sun-solaris2.12 and continues to
> do so in i386-pc-solaris2.12, both 32 and 64-bit.

Tested on x86_64-unknwon-linux-gnu, applied.

Richard.

2017-04-07  Richard Biener  

PR tree-optimization/80334
* g++.dg/torture/pr80334.C: Use __BIGGEST_ALIGNMENT__ for
alignas on stack.

Index: gcc/testsuite/g++.dg/torture/pr80334.C
===
--- gcc/testsuite/g++.dg/torture/pr80334.C  (revision 246752)
+++ gcc/testsuite/g++.dg/torture/pr80334.C  (working copy)
@@ -8,7 +8,7 @@ char x;
 int
 main()
 {
-  alignas(16) B b[3];
+  alignas(__BIGGEST_ALIGNMENT__) B b[3];
   for (int i = 0; i < 3; i++) b[i].unpacked.c = 'a' + i;
   for (int i = 0; i < 3; i++)
 {


Re: [PATCH] PR80101: Fix ICE in store_data_bypass_p

2017-04-07 Thread Segher Boessenkool
On Fri, Apr 07, 2017 at 10:39:03AM +0200, Eric Botcazou wrote:
> > Or we could just change "blockage" and wait for the next bug report.
> 
> That's my suggestion, yes.
> 
> > Alternatively, we can arrange for the bypass functions to not ICE.  We
> > can do that specific to these rs6000 pipeline descriptions, by having
> > our own version of store_data_bypass_p; or we can make that function
> > work for all insns (its definition works fine for insn pairs where
> > not both the producer and consumer are SETs).  That's what Kelvin's
> > patch does.  What is the value in ICEing here?
> 
> Telling the back-end writer that something may be wrong somewhere instead of 
> silently accepting nonsense?

Why is it nonsense?  The predicate gives the answer to the question
"given these insns A and B, does A feed data that B stores in memory".
That is a perfectly valid question to ask of any two insns.

> How long have all the assertions been there?

There are workarounds to this problem as well: mips_store_data_bypass_p,
added in 2006.  mep_store_data_bypass_p, added in 2009 (the port has
been removed since then, of course).


Segher


[wwwdocs] Document C++ null pointer constant changes in gcc-7/porting_to.html

2017-04-07 Thread Jonathan Wakely

This issue caused a lot of build failures during the GCC mass rebuilds
for Fedora, but isn't in the porting to guide yet.

Is this accurate and clear enough for casual readers?


Index: htdocs/gcc-7/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/porting_to.html,v
retrieving revision 1.13
diff -u -r1.13 porting_to.html
--- htdocs/gcc-7/porting_to.html	6 Apr 2017 17:12:16 -	1.13
+++ htdocs/gcc-7/porting_to.html	7 Apr 2017 09:25:06 -
@@ -118,6 +118,39 @@
 with GCC 7 and some are compiled with older releases.
 
 
+Null pointer constants
+
+
+When compiling as C++11 or later, GCC 7 follows the revised definition of a
+null pointer constant. This means conversions to pointers from other
+types of constant (such as character literals and boolean literals) will now
+be rejected.
+
+
+void* f() {
+  char* p = '\0';
+  return false;
+}
+
+
+  
+error: invalid conversion from 'char' to 'char*' [-fpermissive]
+   char* p = '\0';
+ ^~~~
+error: cannot convert 'bool' to 'void*' in return
+   return false;
+  ^
+
+
+
+Such code should be fixed to use a valid null pointer constant such as
+0 or nullptr. Care should be taken when fixing
+invalid uses of '\0' as a pointer, as it may not be clear whether
+the intention was to create a null pointer (p = 0;), to create an
+empty string (p = "";), or to write a null terminator to the
+string (*p = '\0';).
+
+
 Header dependency changes
 
 


Re: [wwwdocs] Document C++ null pointer constant changes in gcc-7/porting_to.html

2017-04-07 Thread Marek Polacek
On Fri, Apr 07, 2017 at 10:25:21AM +0100, Jonathan Wakely wrote:
> This issue caused a lot of build failures during the GCC mass rebuilds
> for Fedora, but isn't in the porting to guide yet.
> 
> Is this accurate and clear enough for casual readers?

Looks fine to me.

> Index: htdocs/gcc-7/porting_to.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/porting_to.html,v
> retrieving revision 1.13
> diff -u -r1.13 porting_to.html
> --- htdocs/gcc-7/porting_to.html  6 Apr 2017 17:12:16 -   1.13
> +++ htdocs/gcc-7/porting_to.html  7 Apr 2017 09:25:06 -
> @@ -118,6 +118,39 @@
>  with GCC 7 and some are compiled with older releases.
>  
>  
> +Null pointer constants
> +
> +
> +When compiling as C++11 or later, GCC 7 follows the revised definition of a
> +null pointer constant. This means conversions to pointers from other
> +types of constant (such as character literals and boolean literals) will now
> +be rejected.
> +
> +
> +void* f() {
> +  char* p = '\0';
> +  return false;
> +}
> +
> +
> +  
> +error: invalid conversion from 'char' to 
> 'char*' [-fpermissive]
> +   char* p = '\0';
> + ^~~~
> +error: cannot convert 'bool' to 
> 'void*' in return
> +   return false;
> +  ^
> +
> +
> +
> +Such code should be fixed to use a valid null pointer constant such as
> +0 or nullptr. Care should be taken when fixing
> +invalid uses of '\0' as a pointer, as it may not be clear 
> whether
> +the intention was to create a null pointer (p = 0;), to create 
> an
> +empty string (p = "";), or to write a null terminator to the
> +string (*p = '\0';).
> +
> +
>  Header dependency changes
>  
>  


Marek


[PATCH, GCC/testsuite] Require c99_runtime for pr79800.c

2017-04-07 Thread Thomas Preudhomme

Hi,

Testcase gcc.dg/tree-ssa/pr79800.c is run unconditionally. Yet, it uses
the 'a' conversion specifier in sprintf which is only available in C99
or later version of the C standard. This commit adds a c99_runtime
requirement.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-04-07  Thomas Preud'homme  

* gcc.dg/tree-ssa/pr79800.c: Require c99_runtime.


Testing: test fails with newlib without c99 specifier support and is
marked UNSUPPORTED with this patch.

Committed as obvious.

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr79800.c b/gcc/testsuite/gcc.dg/tree-ssa/pr79800.c
index 180a6e7863141cae15958df6fff5c7410ce9e39f..030e276698d105dd1ab1a7b775fb5fef48e02c59 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr79800.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr79800.c
@@ -1,6 +1,6 @@
 /* PR 79800 - wrong snprintf result range with precision in a narrow
negative-positive range
-   { dg-do "run" }
+   { dg-do "run" { target c99_runtime } }
{ dg-options "-O2 -Wall" } */
 
 #define FMT "%.*a"


Re: [PATCH] Fix PR80029

2017-04-07 Thread Thomas Schwinge
Hi Cesar!

On Wed, 22 Mar 2017 06:40:03 -0700, Cesar Philippidis  
wrote:
> In addition to resolving the memory leak involving OpenACC delare
> clauses, this patch also corrects an ICE involving VLA variables as data
> clause arguments used in acc declare constructs. More details can be
> found here .
> 
> Is this OK for trunk?

(Got committed in r246381.)

>   PR c++/80029
> 
>   gcc/
>   * gimplify.c (is_oacc_declared): New function.
>   (oacc_default_clause): Use it to set default flags for acc declared
>   variables inside parallel regions.
>   (gimplify_scan_omp_clauses): Strip firstprivate pointers for acc
>   declared variables.
>   (gimplify_oacc_declare): Gimplify the declare clauses.  Add the
>   declare attribute to any decl as necessary.

> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -6787,6 +6787,16 @@ device_resident_p (tree decl)
>return false;
>  }
>  
> +/* Return true if DECL has an ACC DECLARE attribute.  */
> +
> +static bool
> +is_oacc_declared (tree decl)
> +{

Adding here a "gcc_assert (TREE_CODE (decl) != MEM_REF);", that one never
triggers (at least given the current test coverage), so...

> +  tree t = TREE_CODE (decl) == MEM_REF ? TREE_OPERAND (decl, 0) : decl;

... it would seem that this code can be simplified -- or under which
circumstances should we get a MEM_REF here?  The call from
gimplify_oacc_declare itself does the tree operand 0 indirection, and the
call from oacc_default_clause shouldn't be a MEM_REF, I would think.

> +  tree declared = lookup_attribute ("oacc declare target", DECL_ATTRIBUTES 
> (t));
> +  return declared != NULL_TREE;
> +}

> @@ -6887,6 +6897,7 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
> decl, unsigned flags)
>  {
>const char *rkind;
>bool on_device = false;
> +  bool declared = is_oacc_declared (decl);
>tree type = TREE_TYPE (decl);
>  
>if (lang_hooks.decls.omp_privatize_by_reference (decl))
> @@ -6917,7 +6928,7 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
> decl, unsigned flags)
>  
>  case ORT_ACC_PARALLEL:
>{
> - if (on_device || AGGREGATE_TYPE_P (type))
> + if (on_device || AGGREGATE_TYPE_P (type) || declared)
> /* Aggregates default to 'present_or_copy'.  */
> flags |= GOVD_MAP;
>   else

Isn't this new "declared" logic doing a thing very similar to the
existing "on_device" logic?  Should that be combined/cleaned up?

Why doesn't the same "declared"/"on_device" logic also apply to
ORT_ACC_KERNELS?


Grüße
 Thomas


> @@ -7346,6 +7357,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
> *pre_p,
>case OMP_TARGET_DATA:
>case OMP_TARGET_ENTER_DATA:
>case OMP_TARGET_EXIT_DATA:
> +  case OACC_DECLARE:
>case OACC_HOST_DATA:
>   ctx->target_firstprivatize_array_bases = true;
>default:
> @@ -9231,18 +9243,26 @@ gimplify_oacc_declare (tree *expr_p, gimple_seq 
> *pre_p)
>  {
>tree expr = *expr_p;
>gomp_target *stmt;
> -  tree clauses, t;
> +  tree clauses, t, decl;
>  
>clauses = OACC_DECLARE_CLAUSES (expr);
>  
>gimplify_scan_omp_clauses (&clauses, pre_p, ORT_TARGET_DATA, OACC_DECLARE);
> +  gimplify_adjust_omp_clauses (pre_p, NULL, &clauses, OACC_DECLARE);
>  
>for (t = clauses; t; t = OMP_CLAUSE_CHAIN (t))
>  {
> -  tree decl = OMP_CLAUSE_DECL (t);
> +  decl = OMP_CLAUSE_DECL (t);
>  
>if (TREE_CODE (decl) == MEM_REF)
> - continue;
> + decl = TREE_OPERAND (decl, 0);
> +
> +  if (VAR_P (decl) && !is_oacc_declared (decl))
> + {
> +   tree attr = get_identifier ("oacc declare target");
> +   DECL_ATTRIBUTES (decl) = tree_cons (attr, NULL_TREE,
> +   DECL_ATTRIBUTES (decl));
> + }
>  
>if (VAR_P (decl)
> && !is_global_var (decl)
> @@ -9258,7 +9278,8 @@ gimplify_oacc_declare (tree *expr_p, gimple_seq *pre_p)
>   }
>   }
>  
> -  omp_add_variable (gimplify_omp_ctxp, decl, GOVD_SEEN);
> +  if (gimplify_omp_ctxp)
> + omp_add_variable (gimplify_omp_ctxp, decl, GOVD_SEEN);
>  }
>  
>stmt = gimple_build_omp_target (NULL, GF_OMP_TARGET_KIND_OACC_DECLARE,
> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-vla.c 
> b/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-vla.c
> new file mode 100644
> index 000..3ea148e
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-vla.c
> @@ -0,0 +1,25 @@
> +/* Verify that acc declare accept VLA variables.  */
> +
> +#include 
> +
> +int
> +main ()
> +{
> +  int N = 1000;
> +  int i, A[N];
> +#pragma acc declare copy(A)
> +
> +  for (i = 0; i < N; i++)
> +A[i] = -i;
> +
> +#pragma acc kernels
> +  for (i = 0; i < N; i++)
> +A[i] = i;
> +
> +#pragma acc update host(A)
> +
> +  for (i = 0; i < N; i++)
> +assert (A[i] == i);
> +
> +  return 0;
> +}


Re: [PATCH 0/3] Introduce internal_error_cont and exclude it from pot files

2017-04-07 Thread Martin Liška
On 04/06/2017 06:44 PM, Jeff Law wrote:
> On 03/24/2017 03:29 AM, Martin Liška wrote:
>> I would like to ping that. I'm not sure what's agreement after I read
>> discussion in: https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html
>>
>> Martin Sebor may know, CC'ing him.
> Not sure if you're pinging the internal_error_cont stuff, or the ODR 
> diagnostics changes.
> 
> WRT the ODR diagnostics, I'd say let's go with the C++17 style (all-lowercase 
> with a hyphen).
> 
> If you've got a pointer to the internal_err_cont changes, please pass it 
> along.

Yep, I was interested in internal_err_cont. It's next stage1 material, but I'm 
willing to have
a feedback whether separation of internal messages is desired to be excluded 
from translation or not?

Martin

> 
> jeff
> 



[PATCH] Add function part to a same comdat group (PR ipa/80212).

2017-04-07 Thread Martin Liška
Hello.

Following patch is pre-approved by Honza, where IPA split should properly set
a comdat gruop if an original function lives in any.
Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Martin
>From b876a766c2e5ffcf78fdc56e2ff5b87c1e99d30b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 6 Apr 2017 16:31:13 +0200
Subject: [PATCH] Add function part to a same comdat group (PR ipa/80212).

gcc/testsuite/ChangeLog:

2017-04-06  Martin Liska  

	PR ipa/80212
	* g++.dg/ipa/pr80212.C: New test.

gcc/ChangeLog:

2017-04-06  Martin Liska  

	PR ipa/80212
	* ipa-split.c (split_function): Add function part to a same comdat
	group.
---
 gcc/ipa-split.c|  3 +++
 gcc/testsuite/g++.dg/ipa/pr80212.C | 18 ++
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr80212.C

diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index da3c2c62344..ae1de6f1e63 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -1363,6 +1363,9 @@ split_function (basic_block return_bb, struct split_point *split_point,
   /* Let's take a time profile for splitted function.  */
   node->tp_first_run = cur_node->tp_first_run + 1;
 
+  if (cur_node->same_comdat_group)
+node->add_to_same_comdat_group (cur_node);
+
   /* For usual cloning it is enough to clear builtin only when signature
  changes.  For partial inlining we however can not expect the part
  of builtin implementation to have same semantic as the whole.  */
diff --git a/gcc/testsuite/g++.dg/ipa/pr80212.C b/gcc/testsuite/g++.dg/ipa/pr80212.C
new file mode 100644
index 000..60d3b613035
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr80212.C
@@ -0,0 +1,18 @@
+// PR ipa/80212
+// { dg-options "-O2 --param partial-inlining-entry-probability=403796683 -fno-early-inlining" }
+
+struct b
+{
+  virtual b *c () const;
+};
+struct d : virtual b
+{
+};
+struct e : d
+{
+  e *
+  c () const
+  {
+  }
+};
+main () { e a; }
-- 
2.12.2



Re: [PATCH 2/3] Error message on target attribute on aarch64 target (PR target/79889).

2017-04-07 Thread James Greenhalgh
On Mon, Apr 03, 2017 at 09:26:16AM +0200, Martin Liška wrote:
> Thanks Kyrill for review. I'm pinging the patch.

OK.

Thanks,
James

> 
> Martin
> 
> On 03/21/2017 10:24 AM, Kyrill Tkachov wrote:
> > 
> > 
> > Hi Martin,
> > 
> > On 13/03/17 08:25, marxin wrote:
> >> gcc/testsuite/ChangeLog:
> >>
> >> 2017-03-13  Martin Liska  
> >>
> >> * g++.dg/ext/mv8.C: Add aarch64* targets.
> >>
> >> gcc/ChangeLog:
> >>
> >> 2017-03-13  Martin Liska  
> >>
> >> * config/aarch64/aarch64.c (aarch64_process_target_attr):
> >> Show error message instead of an ICE.
> > 
 


[PATCH] Define std::to_chars and std::from_chars for C++17 (P0067R5, partial)

2017-04-07 Thread Jonathan Wakely

This adds another piece of the C++17 library, the std::to_chars and
std::from_chars functions for converting numbers to/from strings. This
only adds the integer support, floating point types will require a lot
more work.

This has only been lightly optimised, so it beats printf on average,
but there's probably more opportunity for improvement. I didn't
investigate whether doing all work with unsigned long long would be
faster, instead of using function templates specialized for unsigned
int, unsigned long and unsigned long long. It should help code size,
but I don't know if it would be faster. I also didn't investigate
whether doing from_chars in two stages would be better, i.e. finding
all the characters that are valid digits in the given base first, and
then converting them to an integer in a second step. I'm sure there's
lots of tuning that could be done, but I hope this is good enough to
start with.

Although the new functions are only defined by  for C++17,
the implementation doesn't require C++17. so directly including
 makes them available in C++14. Non-portably, of
course.

* include/Makefile.am: Add .
* include/Makefile.in: Regenerate.
* include/bits/string_conv.h: New file.
(to_chars_result, to_chars, from_chars_result, from_chars): Define.
* include/std/utility: Include .
* testsuite/20_util/from_chars/1.cc: New test.
* testsuite/20_util/from_chars/1_neg.cc: New test.
* testsuite/20_util/from_chars/2.cc: New test.
* testsuite/20_util/from_chars/requirements.cc: New test.
* testsuite/20_util/to_chars/1.cc: New test.
* testsuite/20_util/to_chars/1_neg.cc: New test.
* testsuite/20_util/to_chars/2.cc: New test.
* testsuite/20_util/to_chars/requirements.cc: New test.

Any suggestions for immediate improvement?

Any reason not to commit this to trunk?


commit ac6a4efed8de8f4b397ca27bee8d88123ca05d22
Author: Jonathan Wakely 
Date:   Fri Apr 7 12:11:12 2017 +0100

Define std::to_chars and std::from_chars for C++17 (P0067R5, partial)

	* include/Makefile.am: Add .
	* include/Makefile.in: Regenerate.
	* include/bits/string_conv.h: New file.
	(to_chars_result, to_chars, from_chars_result, from_chars): Define.
	* include/std/utility: Include .
	* testsuite/20_util/from_chars/1.cc: New test.
	* testsuite/20_util/from_chars/1_neg.cc: New test.
	* testsuite/20_util/from_chars/2.cc: New test.
	* testsuite/20_util/from_chars/requirements.cc: New test.
	* testsuite/20_util/to_chars/1.cc: New test.
	* testsuite/20_util/to_chars/1_neg.cc: New test.
	* testsuite/20_util/to_chars/2.cc: New test.
	* testsuite/20_util/to_chars/requirements.cc: New test.

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 3703bd1..c8c073c 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -190,6 +190,7 @@ bits_headers = \
 	${bits_srcdir}/streambuf_iterator.h \
 	${bits_srcdir}/streambuf.tcc \
 	${bits_srcdir}/stringfwd.h \
+	${bits_srcdir}/string_conv.h \
 	${bits_srcdir}/string_view.tcc \
 	${bits_srcdir}/uniform_int_dist.h \
 	${bits_srcdir}/unique_ptr.h \
diff --git a/libstdc++-v3/include/bits/string_conv.h b/libstdc++-v3/include/bits/string_conv.h
new file mode 100644
index 000..330c0a8
--- /dev/null
+++ b/libstdc++-v3/include/bits/string_conv.h
@@ -0,0 +1,420 @@
+// Implementation of std::to_chars and std::from_chars -*- C++ -*-
+
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file include/bits/string_conv.h
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{utility}
+ */
+
+#ifndef _GLIBCXX_STRING_CONV_H
+#define _GLIBCXX_STRING_CONV_H 1
+
+#pragma GCC system_header
+
+#if __cplusplus < 201402L
+# include 
+#else
+
+#include 
+#include 
+#include 
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_V

[PATCH] Speedup gt_ggc_m_S

2017-04-07 Thread Richard Biener

Currently it looks up the page table entry for the string twice, once
in the ggc_allocated_p test (for strings allocated by libcpp?) and
once for the final lookup.

The following simply turns ggc_allocated_p into a 
safe_lookup_page_table_entry.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for GCC 8.

Richard.

2017-04-07  Richard Biener  

* ggc-page.c (ggc_allocated_p): Rename to ...
(safe_lookup_page_table_entry): ... this and return the lookup
result.
(gt_ggc_m_S): Use safe_lookup_page_table_entry.

Index: gcc/ggc-page.c
===
--- gcc/ggc-page.c  (revision 246758)
+++ gcc/ggc-page.c  (working copy)
@@ -522,7 +522,6 @@ static bool in_gc = false;
 /* Initial guess as to how many page table entries we might need.  */
 #define INITIAL_PTE_COUNT 128
 
-static int ggc_allocated_p (const void *);
 static page_entry *lookup_page_table_entry (const void *);
 static void set_page_table_entry (void *, page_entry *);
 #ifdef USING_MMAP
@@ -587,10 +586,11 @@ push_by_depth (page_entry *p, unsigned l
 #define save_in_use_p(__p) \
   (save_in_use_p_i (__p->index_by_depth))
 
-/* Returns nonzero if P was allocated in GC'able memory.  */
+/* Traverse the page table and find the entry for a page.
+   If the object wasn't allocated in GC return NULL.  */
 
-static inline int
-ggc_allocated_p (const void *p)
+static inline page_entry *
+safe_lookup_page_table_entry (const void *p)
 {
   page_entry ***base;
   size_t L1, L2;
@@ -603,7 +603,7 @@ ggc_allocated_p (const void *p)
   while (1)
 {
   if (table == NULL)
-   return 0;
+   return NULL;
   if (table->high_bits == high_bits)
break;
   table = table->next;
@@ -614,8 +614,10 @@ ggc_allocated_p (const void *p)
   /* Extract the level 1 and 2 indices.  */
   L1 = LOOKUP_L1 (p);
   L2 = LOOKUP_L2 (p);
+  if (! base[L1])
+return NULL;
 
-  return base[L1] && base[L1][L2];
+  return base[L1][L2];
 }
 
 /* Traverse the page table and find the entry for a page.
@@ -1455,12 +1457,14 @@ gt_ggc_m_S (const void *p)
   unsigned long mask;
   unsigned long offset;
 
-  if (!p || !ggc_allocated_p (p))
+  if (!p)
 return;
 
-  /* Look up the page on which the object is alloced.  .  */
-  entry = lookup_page_table_entry (p);
-  gcc_assert (entry);
+  /* Look up the page on which the object is alloced.  If it was not
+ GC allocated, gracefully bail out.  */
+  entry = safe_lookup_page_table_entry (p);
+  if (!entry)
+return;
 
   /* Calculate the index of the object on the page; this is its bit
  position in the in_use_p bitmap.  Note that because a char* might


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-07 Thread Bernd Edlinger
On 04/07/17 08:47, Richard Biener wrote:
> On Thu, 6 Apr 2017, Bernd Edlinger wrote:
>
>> On 04/06/17 21:05, Florian Weimer wrote:
>>> On 04/06/2017 08:49 PM, Bernd Edlinger wrote:
>>>
 For instance how do you "declare an object without a declared type"?
>>>
>>> malloc and other allocation functions return pointers to objects without
>>> a declared type.
>>>
>>
>> Thanks Florian,
>>
>> this discussion is very helpful.
>>
>> How about this for the documentation:
>>
>> @item typeless_storage
>> @cindex @code{typeless_storage} type attribute
>> In the context of section 6.5 paragraph 6 of the C11 standard,
>> an object of this type behaves as if it has no declared type.
>> In the context of section 6.5 paragraph 7 of the C11 standard,
>> an object or a pointer if this type behaves as if it were a
>> character type.
>> This is attribute is similar to the @code{may_alias} attribute,
>> except that it is not restricted to pointers.
>>
>> Example of use:
>>
>> @smallexample
>> typedef int __attribute__((__typeless_storage__)) int_a;
>>
>> int
>> main (void)
>> @{
>>int_a a = 0x12345678;
>>short *b = (short *) &a;
>>
>>b[1] = 0;
>>
>>if (a == 0x12345678)
>>  abort();
>>
>>exit(0);
>> @}
>> @end smallexample
>
> Seriously, do not suggest such broken case.  There's a union to
> do this example portably.
>

Well, it is just a mod of the other example above in
the documentation of may_alias:

typedef short __attribute__((__may_alias__)) short_a;

int
main (void)
@{
   int a = 0x12345678;
   short_a *b = (short_a *) &a;

   b[1] = 0;

   if (a == 0x12345678)
 abort();

   exit(0);
@}
@end smallexample

I just moved the attribute from "b" to "a", and that
is what the C++ people want to do as well, just they
call it a "class int_a { std::byte x[4]; }".

I personally like to have the symmetry between the
two concepts here, because it helps to understand the
differences.


Bernd.

> Richard.
>


Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-07 Thread Bernd Edlinger
On 04/07/17 08:54, Richard Biener wrote:
> On Thu, 6 Apr 2017, Bernd Edlinger wrote:
>> I think get_alias_set(t) will return 0 for typeless_storage
>> types, and therefore has_zero_child will be set anyway.
>> I think both mean the same thing in the end, but it depends on
>> what typeless_storage should actually mean, and we have
>> not yet the same idea about it.
>
> But has_zero_child does not do what we like it to because otherwise
> in the PR using the char[] array member would have worked!
>
> has_zero_child doesn't do that on purpose of course, but this means
> returing alias-set zero for the typeless storage _member_ doesn't
> suffice.
>

I see you have a certain idea how to solve the C++17 issue.
And yes, I apologize, if I tried to pee on your tree :)

What you propose is I think the following:
The C++ FE sets TYPE_TYPELESS_STORAGE a std::byte
and on "unsigned char" if the language dialect is cxx17
and the TBAA makes all the rest.

What I propose is as follows:
The TYPE_TYPELESS_STORAGE is a generic attribute, it
can be set on any type, and in the TBAA the attribute
does not squirrel around at all.  If it is on a type,
then all DECLs with this type get the alias set 0.
If it is on a member of a struct that does not mean
more than if the struct has a char member this it
sets has_zero_child, which I do not want to mean
anything else than before.

The C++ FE does the business logic here, in deciding
where to distribute the TYPE_TYPELESS_STORAGE flags.

in this example
class A {
   class B {
 std::byte x[5];
   } b;
};

std::byte, class B, and class A would get the
TYPE_TYPELESS_STORAGE flag set by the C++FE if
the language dialect is cxx17 or above,
so that you can place anything into any object
of class A and class B, and of type std::byte.

but in this example
class B {
   std::byte x;
};

only std::byte would get the TYPE_TYPELESS_STORAGE
flag, so you can not put anyting into an object
of class B, just on an object of std::byte.



>>
>> I wanted to be able to declare a int __attribute__((typeless_storage))
>> as in the test case, and the sample in the spec.  And that
>> information is not in the TYPE_MAIN_VARIANT.  Therefore I look for
>> typeless_storage before "t = TYPE_MAIN_VARIANT (t)".
>
> As I said I believe this is a useless feature.  If you want something
> typeless then the underlying type doesn't matter so we can as well
> force it to be an array of char.  Makes our live simpler.  And
> even makes the code portable to compilers that treat arrays of char
> conservatively.
>

I just learned that the C11 standard does not guarantee that, and also
an array of char does not provide the necessary alignment per se, at
least without alignment attributes.

>>
>> See cxx_type_contains_byte_buffer: this function looks recursively into
>> structures and unions, and returns the information if the beast
>> contains an array of unsigned char or std::byte.
>
> But with a properly designed middle-end feature that's not needed.
>
> There's technically no reason to pessimize TBAA for anything but
> the typeless storage member of a structure.
>

Yes, it is just a matter of taste.  And if you want the middle
end to be flexible here or if everything should work without user
intervention.


>>>
>>> @@ -1491,6 +1491,7 @@ struct GTY(()) tree_type_common {
>>>unsigned needs_constructing_flag : 1;
>>>unsigned transparent_aggr_flag : 1;
>>>unsigned restrict_flag : 1;
>>> +  unsigned typeless_storage_flag : 1;
>>>unsigned contains_placeholder_bits : 2;
>>>
>>>ENUM_BITFIELD(machine_mode) mode : 8;
>>>
>>> bits are grouped in groups of 8 bits, this breaks it.
>>>
>>
>> Oh..., does this explain the problems that I had with this version???
>
> No, just "cosmetics".
>
>>> @@ -8041,7 +8041,8 @@ build_pointer_type_for_mode (tree to_type, machine
>>>
>>>/* If the pointed-to type has the may_alias attribute set, force
>>>   a TYPE_REF_CAN_ALIAS_ALL pointer to be generated.  */
>>> -  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type)))
>>> +  if (TYPE_TYPELESS_STORAGE (to_type)
>>> +  || lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type)))
>>>  can_alias_all = true;
>>>
>>>/* In some cases, languages will have things that aren't a POINTER_TYPE
>>> @@ -8110,7 +8111,8 @@ build_reference_type_for_mode (tree to_type, machi
>>>
>>>/* If the pointed-to type has the may_alias attribute set, force
>>>   a TYPE_REF_CAN_ALIAS_ALL pointer to be generated.  */
>>> -  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type)))
>>> +  if (TYPE_TYPELESS_STORAGE (to_type)
>>> +  || lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type)))
>>>  can_alias_all = true;
>>>
>>>/* In some cases, languages will have things that aren't a
>>>
>>> not needed.
>>>
>>
>> You mean, because the get_alias_set (to_type) will be 0 anyways,
>> and can_alias_all wont change the semantic?
>
> Well, typeless_storage and may_alias are something different.  If
> you require the ab

Re: [PATCH] Destroy arguments for _Cilk_spawn calling in the child (PR 80038)

2017-04-07 Thread Xi Ruoyao
On 2017-04-06 11:12 -0600, Jeff Law wrote:

> With the likely deprecation in mind, I've only done a cursory review of 
> the changes -- mostly to verify that they hit Cilk+ paths only.

> What's the purpose behind changing when we set the in_lto_p flag?

Without that change, GCC with my patch ICEed with _Cilk_spawn and
-flto -O3 -fcilkplus since __cilkrts_stack_frame.ctx's type (array of void *)
was not TYPE_STRUCTURAL_EQUALITY_P in lto stage.

If this change is not proper, I'll work on modifying my patch to work
without touching in_lto_p.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



Re: C++ PATCH to fix bogus maybe-uninitialized warning (PR c++/80119)

2017-04-07 Thread Jason Merrill
On Tue, Mar 21, 2017 at 6:00 PM, Marek Polacek  wrote:
> On Tue, Mar 21, 2017 at 08:41:01PM +0100, Jakub Jelinek wrote:
>> On Tue, Mar 21, 2017 at 03:27:02PM -0400, Jason Merrill wrote:
>> > OK.
>> >
>> > On Tue, Mar 21, 2017 at 11:38 AM, Marek Polacek  wrote:
>> > > This patch fixes a bogus maybe-uninitialized warning reported in the PR.
>> > > The issue is that we're not able to fold away useless 
>> > > CLEANUP_POINT_EXPRs,
>> > > as e.g. in
>> > >   if (<>)
>> > >// bogus warning
>> > > Here, the cleanup_point was built as <>,
>> > > which cp_fold_r reduces to <>, but leaves it as that and
>> > > passes it to the gimplifier.
>> > >
>> > > Jakub suggested handling this in cp_fold.  fold_build_cleanup_point_expr 
>> > > says
>> > > that "if the expression does not have side effects then we don't have to 
>> > > wrap
>> > > it with a cleanup point expression", so I think the following should be 
>> > > safe.
>> > >
>> > > Bootstrapped/regtested on x86_64-linux, ok for trunk?
>> > >
>> > > 2017-03-21  Marek Polacek  
>> > >
>> > > PR c++/80119
>> > > * cp-gimplify.c (cp_fold): Strip CLEANUP_POINT_EXPR if the 
>> > > expression
>> > > doesn't have side effects.
>> > >
>> > > * g++.dg/warn/Wuninitialized-9.C: New test.
>> > >
>> > > diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c
>> > > index ebb5da9..b4319ca 100644
>> > > --- gcc/cp/cp-gimplify.c
>> > > +++ gcc/cp/cp-gimplify.c
>> > > @@ -2056,6 +2056,14 @@ cp_fold (tree x)
>> > >code = TREE_CODE (x);
>> > >switch (code)
>> > >  {
>> > > +case CLEANUP_POINT_EXPR:
>> > > +  /* Strip CLEANUP_POINT_EXPR if the expression doesn't have side
>> > > +effects.  */
>> > > +  r = cp_fold (TREE_OPERAND (x, 0));
>>
>> Can CLEANUP_POINT_EXPR be an lvalue?  If not, maybe cp_fold_rvalue instead?
>
> I ran the testing with some lvalue_p checks added and it seems a 
> CLEANUP_POINT_EXPR
> is never an lvalue, so I guess we can call cp_fold_rvalue.  Jason, is this 
> still
> ok?

Yes.


Re: [PATCH] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-07 Thread Dominik Vogt
On Wed, Apr 05, 2017 at 02:52:00PM +0100, Dominik Vogt wrote:
> On Mon, Mar 27, 2017 at 09:27:35PM +0100, Dominik Vogt wrote:
> > The attached patch optimizes the atomic_exchange and
> > atomic_compare patterns on s390 and s390x (mostly limited to
> > SImode and DImode).  Among general optimizaation, the changes fix
> > most of the problems reported in PR 80080:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080
> > 
> > Bootstrapped and regression tested on a zEC12 with s390 and s390x
> > biarch.

New version attached.

v4:

  * Remoce CCZZ1 iterator. 
  * Remove duplicates of CS patterns. 
  * Move the skip_cs_label so that output is moved to vtarget even 
if the CS instruction was not used. 
  * Removed leftover from "sne" (from an earlier version of the
  * patch). 

Bootstrapped and regression tested on a zEC12 with s390 and s390x
biarch.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog-dv-atomic-gcc7

* s390-protos.h (s390_expand_cs_hqi): Removed.
(s390_expand_cs, s390_expand_atomic_exchange_tdsi): New prototypes.
* config/s390/s390.c (s390_emit_compare_and_swap): Handle all integer
modes as well as CCZ1mode and CCZmode.
(s390_expand_atomic_exchange_tdsi, s390_expand_atomic): Adapt to new
signature of s390_emit_compare_and_swap.
(s390_expand_cs_hqi): Likewise, make static.
(s390_expand_cs_tdsi): Generate an explicit compare before trying
compare-and-swap, in some cases.
(s390_expand_cs): Wrapper function.
(s390_expand_atomic_exchange_tdsi): New backend specific expander for
atomic_exchange.
(s390_match_ccmode_set): Allow CCZmode <-> CCZ1 mode.
* config/s390/s390.md (define_peephole2): New peephole to help
combining the load-and-test pattern with volatile memory.
("cstorecc4"): Use load-on-condition and deal with CCZmode for
TARGET_Z196.
("atomic_compare_and_swap"): Merge the patterns for small and
large integers.  Forbid symref memory operands.  Move expander to
s390.c.  Require cc register.
("atomic_compare_and_swap_internal")
("*atomic_compare_and_swap_1")
("*atomic_compare_and_swapdi_2")
("*atomic_compare_and_swapsi_3"): Use s_operand to forbid
symref memory operands.  Remove CC mode and call s390_match_ccmode
instead.
("atomic_exchange"): Allow and implement all integer modes.
gcc/testsuite/ChangeLog-dv-atomic-gcc7

* gcc.target/s390/md/atomic_compare_exchange-1.c: New test.
* gcc.target/s390/md/atomic_compare_exchange-1.inc: New test.
* gcc.target/s390/md/atomic_exchange-1.inc: New test.
>From 0dbcab9152b3d1b7c3a6e72f6d45b8eb56ab40ae Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 23 Feb 2017 17:23:11 +0100
Subject: [PATCH] S/390: Optimize atomic_compare_exchange and
 atomic_compare builtins.

1) Use the load-and-test instructions for atomic_exchange if the value is 0.
2) If IS_WEAK is true, compare the memory contents before a compare-and-swap
   and skip the CS instructions if the value is not the expected one.
---
 gcc/config/s390/s390-protos.h  |   4 +-
 gcc/config/s390/s390.c | 171 ++-
 gcc/config/s390/s390.md| 150 +
 .../gcc.target/s390/md/atomic_compare_exchange-1.c |  84 ++
 .../s390/md/atomic_compare_exchange-1.inc  | 336 +
 .../gcc.target/s390/md/atomic_exchange-1.c | 309 +++
 6 files changed, 977 insertions(+), 77 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/atomic_compare_exchange-1.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/md/atomic_compare_exchange-1.inc
 create mode 100644 gcc/testsuite/gcc.target/s390/md/atomic_exchange-1.c

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 7f06a20..3fdb320 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -112,8 +112,8 @@ extern void s390_expand_vec_strlen (rtx, rtx, rtx);
 extern void s390_expand_vec_movstr (rtx, rtx, rtx);
 extern bool s390_expand_addcc (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 extern bool s390_expand_insv (rtx, rtx, rtx, rtx);
-extern void s390_expand_cs_hqi (machine_mode, rtx, rtx, rtx,
-   rtx, rtx, bool);
+extern void s390_expand_cs (machine_mode, rtx, rtx, rtx, rtx, rtx, bool);
+extern void s390_expand_atomic_exchange_tdsi (rtx, rtx, rtx);
 extern void s390_expand_atomic (machine_mode, enum rtx_code,
rtx, rtx, rtx, bool);
 extern void s390_expand_tbegin (rtx, rtx, rtx, bool);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 2cb8947..fe16647 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -1306,6 +1306,7 @@ s390_match_ccmode_set (rtx set, machine_mode req_mode)
   set_mode = GET_MODE (SET_DEST (set));
   switch (set_

[PATCH] Evaluate a SAVE_EXPR before an UBSAN check (PR sanitizer/80350).

2017-04-07 Thread Martin Liška
Hello.

Similar to what was done in Marek's r202113, when op1 is a SAVE_EXPR it must
be evaluated before condition, in order to be able to deliver the operand
to real shifting. And not just to a BB where ubsan report function is called.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
Apart from that make check RUNTESTFLAGS="ubsan.exp" works on x86_64-linux-gnu.

Ready to be installed?
Martin
>From 2ff2e17d82ee85b09cb5f83afbee70f8b1a84f4f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 7 Apr 2017 12:21:44 +0200
Subject: [PATCH] Evaluate a SAVE_EXPR before an UBSAN check (PR
 sanitizer/80350).

gcc/c-family/ChangeLog:

2017-04-07  Martin Liska  

	PR sanitizer/80350
	* c-ubsan.c (ubsan_instrument_shift): Evaluate RHS before
	doing an UBSAN check.

gcc/testsuite/ChangeLog:

2017-04-07  Martin Liska  

	PR sanitizer/80350
	* c-c++-common/ubsan/pr80350.c: New test.
---
 gcc/c-family/c-ubsan.c |  4 +++-
 gcc/testsuite/c-c++-common/ubsan/pr80350.c | 17 +
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/pr80350.c

diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c
index 91bdef88320..ef45abdd19e 100644
--- a/gcc/c-family/c-ubsan.c
+++ b/gcc/c-family/c-ubsan.c
@@ -171,7 +171,9 @@ ubsan_instrument_shift (location_t loc, enum tree_code code,
 
   /* In case we have a SAVE_EXPR in a conditional context, we need to
  make sure it gets evaluated before the condition.  */
-  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op0), t);
+  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t),
+		   fold_build2 (COMPOUND_EXPR, TREE_TYPE (op1),
+unshare_expr (op0), unshare_expr (op1)), t);
 
   enum sanitize_code recover_kind = SANITIZE_SHIFT_EXPONENT;
   tree else_t = void_node;
diff --git a/gcc/testsuite/c-c++-common/ubsan/pr80350.c b/gcc/testsuite/c-c++-common/ubsan/pr80350.c
new file mode 100644
index 000..317d2facaf7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/pr80350.c
@@ -0,0 +1,17 @@
+/* PR target/80310 */
+/* { dg-do run } */
+/* { dg-options "-O0 -fsanitize=shift -fno-sanitize-recover=shift" } */
+
+unsigned int x = 1;
+unsigned int y = 0;
+
+void foo() {
+  y = 1 >> (!x * );
+}
+
+int main () {
+foo ();
+if (y != 1)
+  __builtin_abort ();
+return 0;
+}
-- 
2.12.2



[PATCH] Fix iq2000-elf building libgcc

2017-04-07 Thread Jeff Law


The iq2000-elf port will not currently build libgcc due to triggering 
this assert in dwarf2out.c::dwarf2out_var_location


 gcc_assert (prev
  && (CALL_P (prev)
  || (NONJUMP_INSN_P (prev)
  && GET_CODE (PATTERN (prev)) == SEQUENCE
  && CALL_P (XVECEXP (PATTERN (prev), 0, 0);


We've seen a NOTE_INSN_CALL_ARG_LOCATION and we expect the previous real 
insn to be an actual CALL_INSN (or CALL_INSN with a delay slot wrapped 
in a SEQUENCE).


The problem is the iq2000 defines its own final_prescan_insn which emits 
a NOP insn between the CALL and the NOTE_INSN_CALL_ARG_LOCATION, 
breaking the assumptions for NOTE_INSN_CALL_ARG_LOCATION.


This patch ensures that the iq2000 does not split up the CALL and note 
by emitting the NOP after the note.


With this patch the iq2000 port will build libgcc as well as newlib 
successfully.


Given the port is broken badly enough to not build libgcc, this is 
almost certainly a regression relative to prior releases.


Installing on the trunk.

JEff
commit 5c5e23e8b258c9efb96b3574a4d52cf59b862b84
Author: law 
Date:   Fri Apr 7 14:26:05 2017 +

* config/iq2000/iq2000.c (final_prescan_insn): Do not separate a
CALL and NOTE_INSN_CALL_ARG_LOCATION.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@246761 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 099805d..d68f161 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2017-04-07  Jeff Law  
+
+   * config/iq2000/iq2000.c (final_prescan_insn): Do not separate a
+   CALL and NOTE_INSN_CALL_ARG_LOCATION.
+
 2017-04-07  Martin Liska  
 
PR target/79889
diff --git a/gcc/config/iq2000/iq2000.c b/gcc/config/iq2000/iq2000.c
index 7e1ba00..99abd76 100644
--- a/gcc/config/iq2000/iq2000.c
+++ b/gcc/config/iq2000/iq2000.c
@@ -1540,8 +1540,13 @@ final_prescan_insn (rtx_insn *insn, rtx opvec[] 
ATTRIBUTE_UNUSED,
|| (GET_CODE (PATTERN (insn)) == RETURN))
   && NEXT_INSN (PREV_INSN (insn)) == insn)
 {
-  rtx_insn *nop_insn = emit_insn_after (gen_nop (), insn);
+  rtx_insn *tmp = insn;
+  while (NEXT_INSN (tmp)
+&& NOTE_P (NEXT_INSN (tmp))
+&& NOTE_KIND (NEXT_INSN (tmp)) == NOTE_INSN_CALL_ARG_LOCATION)
+   tmp = NEXT_INSN (tmp);
 
+  rtx_insn *nop_insn = emit_insn_after (gen_nop (), tmp);
   INSN_ADDRESSES_NEW (nop_insn, -1);
 }
   


Re: [PATCH] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-07 Thread Ulrich Weigand
Dominik Vogt wrote:

> v4:
> 
>   * Remoce CCZZ1 iterator. 
>   * Remove duplicates of CS patterns. 
>   * Move the skip_cs_label so that output is moved to vtarget even 
> if the CS instruction was not used. 
>   * Removed leftover from "sne" (from an earlier version of the
>   * patch). 

Thanks, this looks quite good to me now.  I do still have two questions:

> +; Peephole to combine a load-and-test from volatile memory which combine does
> +; not do.
> +(define_peephole2
> +  [(set (match_operand:GPR 0 "register_operand")
> + (match_operand:GPR 2 "memory_operand"))
> +   (set (reg CC_REGNUM)
> + (compare (match_dup 0) (match_operand:GPR 1 "const0_operand")))]
> +  "s390_match_ccmode(insn, CCSmode) && TARGET_EXTIMM
> +   && GENERAL_REG_P (operands[0])
> +   && satisfies_constraint_T (operands[2])"
> +  [(parallel
> +[(set (reg:CCS CC_REGNUM)
> +   (compare:CCS (match_dup 2) (match_dup 1)))
> + (set (match_dup 0) (match_dup 2))])])

Still wondering why this is necessary.  On the other hand, I guess it
cannot hurt to have the peephole either ...

> @@ -6518,13 +6533,30 @@
>[(parallel
>  [(set (match_operand:SI 0 "register_operand" "")
> (match_operator:SI 1 "s390_eqne_operator"
> -   [(match_operand:CCZ1 2 "register_operand")
> +   [(match_operand 2 "cc_reg_operand")
>   (match_operand 3 "const0_operand")]))
>   (clobber (reg:CC CC_REGNUM))])]
>""
> -  "emit_insn (gen_sne (operands[0], operands[2]));
> -   if (GET_CODE (operands[1]) == EQ)
> - emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> +  "machine_mode mode = GET_MODE (operands[2]);
> +   if (TARGET_Z196)
> + {
> +   rtx cond, ite;
> +
> +   if (GET_CODE (operands[1]) == NE)
> +  cond = gen_rtx_NE (VOIDmode, operands[2], const0_rtx);
> +   else
> +  cond = gen_rtx_EQ (VOIDmode, operands[2], const0_rtx);
> +   ite = gen_rtx_IF_THEN_ELSE (SImode, cond, const1_rtx, const0_rtx);
> +   emit_insn (gen_rtx_SET (operands[0], ite));
> + }
> +   else
> + {
> +   if (mode != CCZ1mode)
> +  FAIL;
> +   emit_insn (gen_sne (operands[0], operands[2]));
> +   if (GET_CODE (operands[1]) == EQ)
> +  emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> + }
> DONE;")

>From what I can see in the rest of the patch, none of the CS changes now
actually *rely* on this change to cstorecc4 ... s390_expand_cs_tdsi only
calls cstorecc4 on !TARGET_Z196, where the above change is a no-op, and
in the TARGET_Z196 case it deliberates does *not* use cstorecc4.

Now, in general this improvement to cstorecc4 is of course valuable
in itself.  But I think at this point it might be better to separate
this out into an independent patch (and measure its effect separately).

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



[PATCH] Add some missing AVX and AVX512F intrinsics (PR target/8032{2,3,5,6})

2017-04-07 Thread Jakub Jelinek
Hi!

Apparently while we weren't closely watching, Intel has added into
ICC various new intrinsics and they have been added into Clang
last fall as well.

Tested with
make -j272 -k check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} i386.exp'
on KNL, will bootstrap/regtest on my Haswell-E next, ok for trunk
if that passes?

It is not a regression, on the other side it really shouldn't affect any
code that is not using those intrinsics.

2017-04-07  Jakub Jelinek  

PR target/80322
PR target/80323
PR target/80325
PR target/80326
* config/i386/avxintrin.h (_mm256_cvtsd_f64, _mm256_cvtss_f32): New
intrinsics.
* config/i386/avx512fintrin.h (_mm512_int2mask, _mm512_mask2int,
_mm512_abs_ps, _mm512_mask_abs_ps, _mm512_abs_pd, _mm512_mask_abs_pd,
_mm512_cvtsd_f64, _mm512_cvtss_f32): Likewise.

* gcc.target/i386/avx512f-undefined-1.c: New test.
* gcc.target/i386/avx512f-cvtsd-1.c: New test.
* gcc.target/i386/avx-cvtsd-1.c: New test.
* gcc.target/i386/avx512f-cvtss-1.c: New test.
* gcc.target/i386/avx512f-abspd-1.c: New test.
* gcc.target/i386/avx-cvtss-1.c: New test.
* gcc.target/i386/avx512f-absps-1.c: New test.
* gcc.target/i386/avx512f-int2mask-1.c: New test.
* gcc.target/i386/avx512f-mask2int-1.c: New test.

--- gcc/config/i386/avxintrin.h.jj  2017-01-01 12:45:42.0 +0100
+++ gcc/config/i386/avxintrin.h 2017-04-06 12:13:42.250717878 +0200
@@ -491,6 +491,20 @@ _mm256_cvttps_epi32 (__m256 __A)
   return (__m256i)__builtin_ia32_cvttps2dq256 ((__v8sf) __A);
 }
 
+extern __inline double
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_cvtsd_f64 (__m256d __A)
+{
+  return __A[0];
+}
+
+extern __inline float
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_cvtss_f32 (__m256 __A)
+{
+  return __A[0];
+}
+
 #ifdef __OPTIMIZE__
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm256_extractf128_pd (__m256d __X, const int __N)
--- gcc/config/i386/avx512fintrin.h.jj  2017-01-26 13:22:55.0 +0100
+++ gcc/config/i386/avx512fintrin.h 2017-04-06 15:25:03.941949154 +0200
@@ -60,6 +60,20 @@ typedef double __m512d_u __attribute__ (
 typedef unsigned char  __mmask8;
 typedef unsigned short __mmask16;
 
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_int2mask (int __M)
+{
+  return (__mmask16) __M;
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask2int (__mmask16 __M)
+{
+  return (int) __M;
+}
+
 extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_set_epi64 (long long __A, long long __B, long long __C,
@@ -125,6 +139,8 @@ _mm512_undefined_ps (void)
   return __Y;
 }
 
+#define _mm512_undefined _mm512_undefined_ps
+
 extern __inline __m512d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_undefined_pd (void)
@@ -7264,6 +7280,39 @@ _mm512_mask_testn_epi64_mask (__mmask8 _
(__v8di) __B, __U);
 }
 
+extern __inline __m512
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_abs_ps (__m512 __A)
+{
+  return (__m512) _mm512_and_epi32 ((__m512i) __A,
+   _mm512_set1_epi32 (0x7fff));
+}
+
+extern __inline __m512
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_abs_ps (__m512 __W, __mmask16 __U, __m512 __A)
+{
+  return (__m512) _mm512_mask_and_epi32 ((__m512i) __W, __U, (__m512i) __A,
+_mm512_set1_epi32 (0x7fff));
+}
+
+extern __inline __m512d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_abs_pd (__m512 __A)
+{
+  return (__m512d) _mm512_and_epi64 ((__m512i) __A,
+_mm512_set1_epi64 (0x7fffLL));
+}
+
+extern __inline __m512d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_abs_pd (__m512d __W, __mmask8 __U, __m512 __A)
+{
+  return (__m512d)
+_mm512_mask_and_epi64 ((__m512i) __W, __U, (__m512i) __A,
+   _mm512_set1_epi64 (0x7fffLL));
+}
+
 extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_unpackhi_epi32 (__m512i __A, __m512i __B)
@@ -12011,6 +12060,20 @@ _mm512_maskz_cvtps_epu32 (__mmask16 __U,
 _MM_FROUND_CUR_DIRECTION);
 }
 
+extern __inline double
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cvtsd_f64 (__m512d __A)
+{
+  return __A[0];
+}
+
+extern __inline float
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cvtss_f32 (__m512 __A)
+{
+  return __A[0];
+}
+
 #ifdef __x86_64__
 extern

[PATCH] Add _mm512_{,mask_}reduce_*_* intrinsics (PR target/80324)

2017-04-07 Thread Jakub Jelinek
Hi!

This patch is slightly larger, so I haven't included it in the patch I've
sent a few minutes ago.

I've looked at godbolt for what ICC generates for these and picked sequences
that generate approx. as good code as that.  For
min_epi64/max_epi64/min_epu64/max_epu64 there is a slight complication that
in AVX512F there is only _mm512_{min,max}_ep{i,u}64 but not the _mm256_ or
_mm_ ones, so we need to perform 512-bit operations all the time rather than
perform extractions, 256-bit operation, further extractions and then 128-bit
operations.

Seems we need to teach our permutation code further instructions, e.g.
typedef long long V __attribute__((vector_size (64)));
typedef int W __attribute__((vector_size (64)));
W f0 (W x) {
  return __builtin_shuffle (x, (W) { 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 
4, 5, 6, 7 });
}
V f1 (V x) {
  return __builtin_shuffle (x, (V) { 4, 5, 6, 7, 0, 1, 2, 3 });
}
generate unnecessarily bad code (could use vpshufi64x2 instruction),
guess that can be resolved for GCC8.

Tested with 
   
make -j272 -k check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
i386.exp'   
on KNL, will bootstrap/regtest on my Haswell-E next, ok for trunk   
   
if that passes? 
   

   
It is not a regression, on the other side it really shouldn't affect any
   
code that is not using those intrinsics.
   

2017-04-07  Jakub Jelinek  

PR target/80324
* config/i386/avx512fintrin.h (_mm512_reduce_add_epi32,
_mm512_reduce_mul_epi32, _mm512_reduce_and_epi32,
_mm512_reduce_or_epi32, _mm512_mask_reduce_add_epi32,
_mm512_mask_reduce_mul_epi32, _mm512_mask_reduce_and_epi32,
_mm512_mask_reduce_or_epi32, _mm512_reduce_min_epi32,
_mm512_reduce_max_epi32, _mm512_reduce_min_epu32,
_mm512_reduce_max_epu32, _mm512_mask_reduce_min_epi32,
_mm512_mask_reduce_max_epi32, _mm512_mask_reduce_min_epu32,
_mm512_mask_reduce_max_epu32, _mm512_reduce_add_ps,
_mm512_reduce_mul_ps, _mm512_mask_reduce_add_ps,
_mm512_mask_reduce_mul_ps, _mm512_reduce_min_ps, _mm512_reduce_max_ps,
_mm512_mask_reduce_min_ps, _mm512_mask_reduce_max_ps,
_mm512_reduce_add_epi64, _mm512_reduce_mul_epi64,
_mm512_reduce_and_epi64, _mm512_reduce_or_epi64,
_mm512_mask_reduce_add_epi64, _mm512_mask_reduce_mul_epi64,
_mm512_mask_reduce_and_epi64, _mm512_mask_reduce_or_epi64,
_mm512_reduce_min_epi64, _mm512_reduce_max_epi64,
_mm512_mask_reduce_min_epi64, _mm512_mask_reduce_max_epi64,
_mm512_reduce_min_epu64, _mm512_reduce_max_epu64,
_mm512_mask_reduce_min_epu64, _mm512_mask_reduce_max_epu64,
_mm512_reduce_add_pd, _mm512_reduce_mul_pd, _mm512_mask_reduce_add_pd,
_mm512_mask_reduce_mul_pd, _mm512_reduce_min_pd, _mm512_reduce_max_pd,
_mm512_mask_reduce_min_pd, _mm512_mask_reduce_max_pd): New intrinsics.

* gcc.target/i386/avx512f-reduce-op-1.c: New test.

--- gcc/config/i386/avx512fintrin.h.jj  2017-04-07 12:25:13.065643755 +0200
+++ gcc/config/i386/avx512fintrin.h 2017-04-07 16:34:37.976974227 +0200
@@ -13282,6 +13282,470 @@ _mm512_cmpgt_epu64_mask (__m512i __A, __
(__mmask8) -1);
 }
 
+#undef __MM512_REDUCE_OP
+#define __MM512_REDUCE_OP(op) \
+  __v8si __T1 = (__v8si) _mm512_extracti64x4_epi64 (__A, 1);   \
+  __v8si __T2 = (__v8si) _mm512_extracti64x4_epi64 (__A, 0);   \
+  __m256i __T3 = (__m256i) (__T1 op __T2); \
+  __v4si __T4 = (__v4si) _mm256_extracti128_si256 (__T3, 1);   \
+  __v4si __T5 = (__v4si) _mm256_extracti128_si256 (__T3, 0);   \
+  __v4si __T6 = __T4 op __T5;  \
+  __v4si __T7 = __builtin_shuffle (__T6, (__v4si) { 2, 3, 0, 1 }); \
+  __v4si __T8 = __T6 op __T7;  \
+  return __T8[0] op __T8[1]
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_reduce_add_epi32 (__m512i __A)
+{
+  __MM512_REDUCE_OP (+);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_reduce_mul_epi32 (__m512i __A)
+{
+  __MM512_REDUCE_OP (*);
+}
+
+extern __inline int
+

[PATCH] Fix avx512f-vgetmantpd-2.c testcase with -m32

2017-04-07 Thread Jakub Jelinek
Hi!

I've noticed the avx512f-vgetmantpd-2.c testcase eats lots of
CPU time on KNL with -m32 and FAILs.  The problem is
excess precision, I've added -mfpmath=sse to it to fix that.

Tested with 
   
make -j272 -k check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
i386.exp'   
on KNL, ok for trunk?

2017-04-07  Jakub Jelinek  

* gcc.target/i386/avx512f-vgetmantps-2.c: Add -mfpmath=sse to
dg-options.
* gcc.target/i386/avx512f-vgetmantpd-2.c: Likewise.

--- gcc/testsuite/gcc.target/i386/avx512f-vgetmantps-2.c.jj 2017-04-07 
05:52:04.0 -0400
+++ gcc/testsuite/gcc.target/i386/avx512f-vgetmantps-2.c2017-04-07 
09:22:13.051209011 -0400
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -mavx512f -std=c99" } */
+/* { dg-options "-O2 -mavx512f -std=c99 -mfpmath=sse" } */
 /* { dg-require-effective-target avx512f } */
 /* { dg-require-effective-target c99_runtime } */
 
--- gcc/testsuite/gcc.target/i386/avx512f-vgetmantpd-2.c.jj 2017-04-07 
05:51:49.0 -0400
+++ gcc/testsuite/gcc.target/i386/avx512f-vgetmantpd-2.c2017-04-07 
09:21:46.648317195 -0400
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -mavx512f -std=c99" } */
+/* { dg-options "-O2 -mavx512f -std=c99 -mfpmath=sse" } */
 /* { dg-require-effective-target avx512f } */
 /* { dg-require-effective-target c99_runtime } */
 

Jakub


Re: [patch,gomp4] add support for fortran common blocks

2017-04-07 Thread Thomas Schwinge
Hi Cesar!

On Wed, 5 Apr 2017 13:37:30 -0700, Cesar Philippidis  
wrote:
> On 04/05/2017 01:22 PM, Thomas Schwinge wrote:
> 
> >> --- a/gcc/gimplify.c
> >> +++ b/gcc/gimplify.c
> >> @@ -6102,14 +6102,19 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, 
> >> tree decl, unsigned flags)
> >>  {
> >>const char *rkind;
> >>bool on_device = false;
> >> +  bool is_private = false;
> > 
> > So the intention here is that by default everything stays the same as
> > before; "is_private == false".  This property is satisfied in the
> > following code.
> 
> Yes.
> 
> >>tree type = TREE_TYPE (decl);
> >>  
> >>if (lang_hooks.decls.omp_privatize_by_reference (decl))
> >>  type = TREE_TYPE (type);
> >>  
> >> +  if (RECORD_OR_UNION_TYPE_P (type))
> >> +is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
> >> +
> >>if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
> >>&& is_global_var (decl)
> >> -  && device_resident_p (decl))
> >> +  && device_resident_p (decl)
> >> +  && !is_private)
> >>  {
> >>on_device = true;
> >>flags |= GOVD_MAP_TO_ONLY;
> > |  }
> > 
> > For "is_private == true" we will not possibly enter this block.

So, is this an invalid scenario (and should thus get an "assert" or
"gcc_unreachable"), or is it correct, if "private", to not set
"on_device" and "GOVD_MAP_TO_ONLY" for "device_resident_p" DECLs?

> > | [ORT_ACC_KERNELS]
> >>/* Scalars are default 'copy' under kernels, non-scalars are default
> >> 'present_or_copy'.  */
> >>flags |= GOVD_MAP;
> >> -  if (!AGGREGATE_TYPE_P (type))
> >> +  if (!AGGREGATE_TYPE_P (type) && !is_private)
> >>flags |= GOVD_MAP_FORCE;
> > 
> > For "is_private == true" we will not possibly enter this block, which
> > means in this case we will map both scalars and aggregates as
> > "present_or_copy".
> 
> Yes. Inside kernels regions, everything is pcopy, unless it's private.

But I'm saying/understanding the code so that inside kernels regions,
"private" stuff is "present_or_copy".  Is that correct or not?

> Some private variables include, e.g., fortran array descriptors.

(I'd prefer if we had tree-dump scanning tests for such things.)

> >>  case ORT_ACC_PARALLEL:
> >>{
> >> -  if (on_device || AGGREGATE_TYPE_P (type))
> >> +  if (!is_private && (on_device || AGGREGATE_TYPE_P (type)))
> >>  /* Aggregates default to 'present_or_copy'.  */
> >>  flags |= GOVD_MAP;
> >>else
> > | /* Scalars default to 'firstprivate'.  */
> > | flags |= GOVD_FIRSTPRIVATE;
> > 
> > For "is_private == true" we will not possibly enter the "if" block, so we
> > will always enter the "else" block, which means in this case we will map
> > both scalars and aggregates as "firstprivate".
> > 
> > Is that all correct?
> 
> Yes. Is there something wrong with that behavior

Again: I'm confused as to why inside kernels regions, "private" stuff is
mapped "present_or_copy", but inside parallel regions, it's
"firstprivate".

> or is it just unclear?

In gomp-4_0-branch r246762, I committed the following patch, to make
better apparent the current behavior:

commit a87af0655eb2f803c330d807cdca698ab597b44e
Author: tschwinge 
Date:   Fri Apr 7 14:55:56 2017 +

Clarify gcc/gimplify.c:oacc_default_clause

gcc/
* gimplify.c (oacc_default_clause): Clarify.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@246762 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  4 
 gcc/gimplify.c | 44 ++--
 2 files changed, 30 insertions(+), 18 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index effcc05..811e190 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2017-04-07  Thomas Schwinge  
+
+   * gimplify.c (oacc_default_clause): Clarify.
+
 2017-04-05  Cesar Philippidis  
 
* omp-low.c (scan_sharing_clauses): Update handling of OpenACC declare
diff --git gcc/gimplify.c gcc/gimplify.c
index 604a6cb..f2bb814 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -6129,30 +6129,38 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
decl, unsigned flags)
 
   switch (ctx->region_type)
 {
-default:
-  gcc_unreachable ();
-
 case ORT_ACC_KERNELS:
-  /* Scalars are default 'copy' under kernels, non-scalars are default
-'present_or_copy'.  */
-  flags |= GOVD_MAP;
-  if (!AGGREGATE_TYPE_P (type) && !is_private)
-   flags |= GOVD_MAP_FORCE;
-
   rkind = "kernels";
+
+  if (is_private)
+   flags |= GOVD_MAP;
+  else if (AGGREGATE_TYPE_P (type))
+   /* Aggregates default to 'present_or_copy'.  */
+   flags |= GOVD_MAP;
+  else
+   /* Scalars default to 'copy'.  */
+   flags |= GOVD_MAP | GOVD_MAP_FORCE;
+
   break;
 
 case ORT_ACC_PARALLEL:
-  {
-   if (!is_private && (on_device || AGGREGATE_TYPE_P (type)

Re: [PATCH] Add some missing AVX and AVX512F intrinsics (PR target/8032{2,3,5,6})

2017-04-07 Thread Uros Bizjak
On Fri, Apr 7, 2017 at 4:44 PM, Jakub Jelinek  wrote:
> Hi!
>
> Apparently while we weren't closely watching, Intel has added into
> ICC various new intrinsics and they have been added into Clang
> last fall as well.
>
> Tested with
> make -j272 -k check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
> i386.exp'
> on KNL, will bootstrap/regtest on my Haswell-E next, ok for trunk
> if that passes?
>
> It is not a regression, on the other side it really shouldn't affect any
> code that is not using those intrinsics.
>
> 2017-04-07  Jakub Jelinek  
>
> PR target/80322
> PR target/80323
> PR target/80325
> PR target/80326
> * config/i386/avxintrin.h (_mm256_cvtsd_f64, _mm256_cvtss_f32): New
> intrinsics.
> * config/i386/avx512fintrin.h (_mm512_int2mask, _mm512_mask2int,
> _mm512_abs_ps, _mm512_mask_abs_ps, _mm512_abs_pd, _mm512_mask_abs_pd,
> _mm512_cvtsd_f64, _mm512_cvtss_f32): Likewise.
>
> * gcc.target/i386/avx512f-undefined-1.c: New test.
> * gcc.target/i386/avx512f-cvtsd-1.c: New test.
> * gcc.target/i386/avx-cvtsd-1.c: New test.
> * gcc.target/i386/avx512f-cvtss-1.c: New test.
> * gcc.target/i386/avx512f-abspd-1.c: New test.
> * gcc.target/i386/avx-cvtss-1.c: New test.
> * gcc.target/i386/avx512f-absps-1.c: New test.
> * gcc.target/i386/avx512f-int2mask-1.c: New test.
> * gcc.target/i386/avx512f-mask2int-1.c: New test.

LGTM.

Thanks,
Uros.

> --- gcc/config/i386/avxintrin.h.jj  2017-01-01 12:45:42.0 +0100
> +++ gcc/config/i386/avxintrin.h 2017-04-06 12:13:42.250717878 +0200
> @@ -491,6 +491,20 @@ _mm256_cvttps_epi32 (__m256 __A)
>return (__m256i)__builtin_ia32_cvttps2dq256 ((__v8sf) __A);
>  }
>
> +extern __inline double
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm256_cvtsd_f64 (__m256d __A)
> +{
> +  return __A[0];
> +}
> +
> +extern __inline float
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm256_cvtss_f32 (__m256 __A)
> +{
> +  return __A[0];
> +}
> +
>  #ifdef __OPTIMIZE__
>  extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm256_extractf128_pd (__m256d __X, const int __N)
> --- gcc/config/i386/avx512fintrin.h.jj  2017-01-26 13:22:55.0 +0100
> +++ gcc/config/i386/avx512fintrin.h 2017-04-06 15:25:03.941949154 +0200
> @@ -60,6 +60,20 @@ typedef double __m512d_u __attribute__ (
>  typedef unsigned char  __mmask8;
>  typedef unsigned short __mmask16;
>
> +extern __inline __mmask16
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_int2mask (int __M)
> +{
> +  return (__mmask16) __M;
> +}
> +
> +extern __inline int
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_mask2int (__mmask16 __M)
> +{
> +  return (int) __M;
> +}
> +
>  extern __inline __m512i
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm512_set_epi64 (long long __A, long long __B, long long __C,
> @@ -125,6 +139,8 @@ _mm512_undefined_ps (void)
>return __Y;
>  }
>
> +#define _mm512_undefined _mm512_undefined_ps
> +
>  extern __inline __m512d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm512_undefined_pd (void)
> @@ -7264,6 +7280,39 @@ _mm512_mask_testn_epi64_mask (__mmask8 _
> (__v8di) __B, __U);
>  }
>
> +extern __inline __m512
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_abs_ps (__m512 __A)
> +{
> +  return (__m512) _mm512_and_epi32 ((__m512i) __A,
> +   _mm512_set1_epi32 (0x7fff));
> +}
> +
> +extern __inline __m512
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_mask_abs_ps (__m512 __W, __mmask16 __U, __m512 __A)
> +{
> +  return (__m512) _mm512_mask_and_epi32 ((__m512i) __W, __U, (__m512i) __A,
> +_mm512_set1_epi32 (0x7fff));
> +}
> +
> +extern __inline __m512d
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_abs_pd (__m512 __A)
> +{
> +  return (__m512d) _mm512_and_epi64 ((__m512i) __A,
> +_mm512_set1_epi64 
> (0x7fffLL));
> +}
> +
> +extern __inline __m512d
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_mask_abs_pd (__m512d __W, __mmask8 __U, __m512 __A)
> +{
> +  return (__m512d)
> +_mm512_mask_and_epi64 ((__m512i) __W, __U, (__m512i) __A,
> +   _mm512_set1_epi64 (0x7fffLL));
> +}
> +
>  extern __inline __m512i
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm512_unpackhi_epi32 (__m512i __A, __m512i __B)
> @@ -12011,6 +12060,20 @@ _mm512_maskz_cvtps_epu32 (__mmask16 __U,
>  
> _MM_FROUND_CUR_DIRECT

Re: [PATCH] Fix avx512f-vgetmantpd-2.c testcase with -m32

2017-04-07 Thread Uros Bizjak
On Fri, Apr 7, 2017 at 4:54 PM, Jakub Jelinek  wrote:
> Hi!
>
> I've noticed the avx512f-vgetmantpd-2.c testcase eats lots of
> CPU time on KNL with -m32 and FAILs.  The problem is
> excess precision, I've added -mfpmath=sse to it to fix that.
>
> Tested with
> make -j272 -k check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
> i386.exp'
> on KNL, ok for trunk?
>
> 2017-04-07  Jakub Jelinek  
>
> * gcc.target/i386/avx512f-vgetmantps-2.c: Add -mfpmath=sse to
> dg-options.
> * gcc.target/i386/avx512f-vgetmantpd-2.c: Likewise.

OK.

Thanks,
Uros.

> --- gcc/testsuite/gcc.target/i386/avx512f-vgetmantps-2.c.jj 2017-04-07 
> 05:52:04.0 -0400
> +++ gcc/testsuite/gcc.target/i386/avx512f-vgetmantps-2.c2017-04-07 
> 09:22:13.051209011 -0400
> @@ -1,5 +1,5 @@
>  /* { dg-do run } */
> -/* { dg-options "-O2 -mavx512f -std=c99" } */
> +/* { dg-options "-O2 -mavx512f -std=c99 -mfpmath=sse" } */
>  /* { dg-require-effective-target avx512f } */
>  /* { dg-require-effective-target c99_runtime } */
>
> --- gcc/testsuite/gcc.target/i386/avx512f-vgetmantpd-2.c.jj 2017-04-07 
> 05:51:49.0 -0400
> +++ gcc/testsuite/gcc.target/i386/avx512f-vgetmantpd-2.c2017-04-07 
> 09:21:46.648317195 -0400
> @@ -1,5 +1,5 @@
>  /* { dg-do run } */
> -/* { dg-options "-O2 -mavx512f -std=c99" } */
> +/* { dg-options "-O2 -mavx512f -std=c99 -mfpmath=sse" } */
>  /* { dg-require-effective-target avx512f } */
>  /* { dg-require-effective-target c99_runtime } */
>
>
> Jakub


Re: [PATCH] Add _mm512_{,mask_}reduce_*_* intrinsics (PR target/80324)

2017-04-07 Thread Uros Bizjak
On Fri, Apr 7, 2017 at 4:52 PM, Jakub Jelinek  wrote:
> Hi!
>
> This patch is slightly larger, so I haven't included it in the patch I've
> sent a few minutes ago.
>
> I've looked at godbolt for what ICC generates for these and picked sequences
> that generate approx. as good code as that.  For
> min_epi64/max_epi64/min_epu64/max_epu64 there is a slight complication that
> in AVX512F there is only _mm512_{min,max}_ep{i,u}64 but not the _mm256_ or
> _mm_ ones, so we need to perform 512-bit operations all the time rather than
> perform extractions, 256-bit operation, further extractions and then 128-bit
> operations.
>
> Seems we need to teach our permutation code further instructions, e.g.
> typedef long long V __attribute__((vector_size (64)));
> typedef int W __attribute__((vector_size (64)));
> W f0 (W x) {
>   return __builtin_shuffle (x, (W) { 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 
> 3, 4, 5, 6, 7 });
> }
> V f1 (V x) {
>   return __builtin_shuffle (x, (V) { 4, 5, 6, 7, 0, 1, 2, 3 });
> }
> generate unnecessarily bad code (could use vpshufi64x2 instruction),
> guess that can be resolved for GCC8.
>
> Tested with
> make -j272 -k check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} 
> i386.exp'
> on KNL, will bootstrap/regtest on my Haswell-E next, ok for trunk
> if that passes?
>
> It is not a regression, on the other side it really shouldn't affect any
> code that is not using those intrinsics.
>
> 2017-04-07  Jakub Jelinek  
>
> PR target/80324
> * config/i386/avx512fintrin.h (_mm512_reduce_add_epi32,
> _mm512_reduce_mul_epi32, _mm512_reduce_and_epi32,
> _mm512_reduce_or_epi32, _mm512_mask_reduce_add_epi32,
> _mm512_mask_reduce_mul_epi32, _mm512_mask_reduce_and_epi32,
> _mm512_mask_reduce_or_epi32, _mm512_reduce_min_epi32,
> _mm512_reduce_max_epi32, _mm512_reduce_min_epu32,
> _mm512_reduce_max_epu32, _mm512_mask_reduce_min_epi32,
> _mm512_mask_reduce_max_epi32, _mm512_mask_reduce_min_epu32,
> _mm512_mask_reduce_max_epu32, _mm512_reduce_add_ps,
> _mm512_reduce_mul_ps, _mm512_mask_reduce_add_ps,
> _mm512_mask_reduce_mul_ps, _mm512_reduce_min_ps, _mm512_reduce_max_ps,
> _mm512_mask_reduce_min_ps, _mm512_mask_reduce_max_ps,
> _mm512_reduce_add_epi64, _mm512_reduce_mul_epi64,
> _mm512_reduce_and_epi64, _mm512_reduce_or_epi64,
> _mm512_mask_reduce_add_epi64, _mm512_mask_reduce_mul_epi64,
> _mm512_mask_reduce_and_epi64, _mm512_mask_reduce_or_epi64,
> _mm512_reduce_min_epi64, _mm512_reduce_max_epi64,
> _mm512_mask_reduce_min_epi64, _mm512_mask_reduce_max_epi64,
> _mm512_reduce_min_epu64, _mm512_reduce_max_epu64,
> _mm512_mask_reduce_min_epu64, _mm512_mask_reduce_max_epu64,
> _mm512_reduce_add_pd, _mm512_reduce_mul_pd, _mm512_mask_reduce_add_pd,
> _mm512_mask_reduce_mul_pd, _mm512_reduce_min_pd, _mm512_reduce_max_pd,
> _mm512_mask_reduce_min_pd, _mm512_mask_reduce_max_pd): New intrinsics.
>
> * gcc.target/i386/avx512f-reduce-op-1.c: New test.

LGTM, but please wait for Kirill's opinion on the implementation.

Thanks,
Uros.

> --- gcc/config/i386/avx512fintrin.h.jj  2017-04-07 12:25:13.065643755 +0200
> +++ gcc/config/i386/avx512fintrin.h 2017-04-07 16:34:37.976974227 +0200
> @@ -13282,6 +13282,470 @@ _mm512_cmpgt_epu64_mask (__m512i __A, __
> (__mmask8) -1);
>  }
>
> +#undef __MM512_REDUCE_OP
> +#define __MM512_REDUCE_OP(op) \
> +  __v8si __T1 = (__v8si) _mm512_extracti64x4_epi64 (__A, 1);   \
> +  __v8si __T2 = (__v8si) _mm512_extracti64x4_epi64 (__A, 0);   \
> +  __m256i __T3 = (__m256i) (__T1 op __T2); \
> +  __v4si __T4 = (__v4si) _mm256_extracti128_si256 (__T3, 1);   \
> +  __v4si __T5 = (__v4si) _mm256_extracti128_si256 (__T3, 0);   \
> +  __v4si __T6 = __T4 op __T5;  \
> +  __v4si __T7 = __builtin_shuffle (__T6, (__v4si) { 2, 3, 0, 1 }); \
> +  __v4si __T8 = __T6 op __T7;  \
> +  return __T8[0] op __T8[1]
> +
> +extern __inline int
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_reduce_add_epi32 (__m512i __A)
> +{
> +  __MM512_REDUCE_OP (+);
> +}
> +
> +extern __inline int
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_reduce_mul_epi32 (__m512i __A)
> +{
> +  __MM512_REDUCE_OP (*);
> +}
> +
> +extern __inline int
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_reduce_and_epi32 (__m512i __A)
> +{
> +  __MM512_REDUCE_OP (&);
> +}
> +
> +extern __inline int
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_reduce_or_epi32 (__m512i __A)
> +{
> +  __MM512_REDUCE_OP (|);
> +}
> +
> +extern __inline int
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_m

OpenACC 2.5 default (present) clause

2017-04-07 Thread Thomas Schwinge
Hi!

OpenACC 2.5 added a default (present) clause, which "causes all arrays or
variables of aggregate data type used in the compute construct that have
implicitly determined data attributes to be treated as if they appeared
in a present clause".  Preceded by the following cleanup patch (see
 for its
origin), OK for trunk in next stage 1?

commit 787fea9e71f693c1b629a699f8476f392c4bc55d
Author: Thomas Schwinge 
Date:   Thu Apr 6 07:58:37 2017 +0200

Clarify gcc/gimplify.c:oacc_default_clause

gcc/
* gimplify.c (oacc_default_clause): Clarify.
---
 gcc/gimplify.c | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git gcc/gimplify.c gcc/gimplify.c
index ff8d56b..3fcf9ab 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -6929,30 +6929,34 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree 
decl, unsigned flags)
 
   switch (ctx->region_type)
 {
-default:
-  gcc_unreachable ();
-
 case ORT_ACC_KERNELS:
-  /* Scalars are default 'copy' under kernels, non-scalars are default
-'present_or_copy'.  */
-  flags |= GOVD_MAP;
-  if (!AGGREGATE_TYPE_P (type))
-   flags |= GOVD_MAP_FORCE;
-
   rkind = "kernels";
+
+  if (AGGREGATE_TYPE_P (type))
+   /* Aggregates default to 'present_or_copy'.  */
+   flags |= GOVD_MAP;
+  else
+   /* Scalars default to 'copy'.  */
+   flags |= GOVD_MAP | GOVD_MAP_FORCE;
+
   break;
 
 case ORT_ACC_PARALLEL:
-  {
-   if (on_device || AGGREGATE_TYPE_P (type) || declared)
- /* Aggregates default to 'present_or_copy'.  */
- flags |= GOVD_MAP;
-   else
- /* Scalars default to 'firstprivate'.  */
- flags |= GOVD_FIRSTPRIVATE;
-   rkind = "parallel";
-  }
+  rkind = "parallel";
+
+  if (on_device || declared)
+   flags |= GOVD_MAP;
+  else if (AGGREGATE_TYPE_P (type))
+   /* Aggregates default to 'present_or_copy'.  */
+   flags |= GOVD_MAP;
+  else
+   /* Scalars default to 'firstprivate'.  */
+   flags |= GOVD_FIRSTPRIVATE;
+
   break;
+
+default:
+  gcc_unreachable ();
 }
 
   if (DECL_ARTIFICIAL (decl))

commit 4f7643dd63177ba9fdb99063e34f8e287c2e30aa
Author: Thomas Schwinge 
Date:   Wed Apr 5 15:33:17 2017 +0200

OpenACC 2.5 default (present) clause

gcc/c/
* c-parser.c (c_parser_omp_clause_default): Handle
"OMP_CLAUSE_DEFAULT_PRESENT".
gcc/cp/
* parser.c (cp_parser_omp_clause_default): Handle
"OMP_CLAUSE_DEFAULT_PRESENT".
gcc/fortran/
* gfortran.h (enum gfc_omp_default_sharing): Add
"OMP_DEFAULT_PRESENT".
* dump-parse-tree.c (show_omp_clauses): Handle it.
* openmp.c (gfc_match_omp_clauses): Likewise.
* trans-openmp.c (gfc_trans_omp_clauses): Likewise.
gcc/
* tree-core.h (enum omp_clause_default_kind): Add
"OMP_CLAUSE_DEFAULT_PRESENT".
* tree-pretty-print.c (dump_omp_clause): Handle it.
* gimplify.c (enum gimplify_omp_var_data): Add
"GOVD_MAP_FORCE_PRESENT".
(gimplify_adjust_omp_clauses_1): Map it to
"GOMP_MAP_FORCE_PRESENT".
(omp_default_clause, oacc_default_clause): Handle
"OMP_CLAUSE_DEFAULT_PRESENT".
gcc/testsuite/
* c-c++-common/goacc/default-1.c: Update.
* c-c++-common/goacc/default-2.c: Likewise.
* c-c++-common/goacc/default-4.c: Likewise.
* gfortran.dg/goacc/default-1.f95: Likewise.
* gfortran.dg/goacc/default-4.f: Likewise.
* c-c++-common/goacc/default-5.c: New file.
* gfortran.dg/goacc/default-5.f: Likewise.
libgomp/
* testsuite/libgomp.oacc-c++/template-reduction.C: Update.
* testsuite/libgomp.oacc-c-c++-common/nested-2.c: Update.
* testsuite/libgomp.oacc-fortran/data-4-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/default-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/non-scalar-data.f90: Likewise.
---
 gcc/c/c-parser.c   | 14 --
 gcc/cp/parser.c| 14 --
 gcc/fortran/dump-parse-tree.c  |  1 +
 gcc/fortran/gfortran.h |  3 +-
 gcc/fortran/openmp.c   | 20 +---
 gcc/fortran/trans-openmp.c |  3 ++
 gcc/gimplify.c | 53 ++
 gcc/testsuite/c-c++-common/goacc/default-1.c   |  5 ++
 gcc/testsuite/c-c++-common/goacc/default-2.c   | 28 ++--
 gcc/testsuite/c-c++-common/goacc/default-4.c   | 21 +
 gcc/testsuite/c-c++-common/goacc/default-5.c   | 20 
 gcc/testsuit

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-07 Thread Richard Biener
On April 7, 2017 3:37:30 PM GMT+02:00, Bernd Edlinger 
 wrote:
>On 04/07/17 08:54, Richard Biener wrote:
>> On Thu, 6 Apr 2017, Bernd Edlinger wrote:
>>> I think get_alias_set(t) will return 0 for typeless_storage
>>> types, and therefore has_zero_child will be set anyway.
>>> I think both mean the same thing in the end, but it depends on
>>> what typeless_storage should actually mean, and we have
>>> not yet the same idea about it.
>>
>> But has_zero_child does not do what we like it to because otherwise
>> in the PR using the char[] array member would have worked!
>>
>> has_zero_child doesn't do that on purpose of course, but this means
>> returing alias-set zero for the typeless storage _member_ doesn't
>> suffice.
>>
>
>I see you have a certain idea how to solve the C++17 issue.
>And yes, I apologize, if I tried to pee on your tree :)

We do have the need to support this part of the C++ standard.  For other user 
code may_alias suffices and I see no reason to haste inventing sth new without 
a single convincing testcase.  GCC/Language extensions should not be added 
without a good reason.

I didn't propose to expose the type flag to users at all.

Richard.

>What you propose is I think the following:
>The C++ FE sets TYPE_TYPELESS_STORAGE a std::byte
>and on "unsigned char" if the language dialect is cxx17
>and the TBAA makes all the rest.
>
>What I propose is as follows:
>The TYPE_TYPELESS_STORAGE is a generic attribute, it
>can be set on any type, and in the TBAA the attribute
>does not squirrel around at all.  If it is on a type,
>then all DECLs with this type get the alias set 0.
>If it is on a member of a struct that does not mean
>more than if the struct has a char member this it
>sets has_zero_child, which I do not want to mean
>anything else than before.
>
>The C++ FE does the business logic here, in deciding
>where to distribute the TYPE_TYPELESS_STORAGE flags.
>
>in this example
>class A {
>   class B {
> std::byte x[5];
>   } b;
>};
>
>std::byte, class B, and class A would get the
>TYPE_TYPELESS_STORAGE flag set by the C++FE if
>the language dialect is cxx17 or above,
>so that you can place anything into any object
>of class A and class B, and of type std::byte.
>
>but in this example
>class B {
>   std::byte x;
>};
>
>only std::byte would get the TYPE_TYPELESS_STORAGE
>flag, so you can not put anyting into an object
>of class B, just on an object of std::byte.
>
>
>
>>>
>>> I wanted to be able to declare a int
>__attribute__((typeless_storage))
>>> as in the test case, and the sample in the spec.  And that
>>> information is not in the TYPE_MAIN_VARIANT.  Therefore I look for
>>> typeless_storage before "t = TYPE_MAIN_VARIANT (t)".
>>
>> As I said I believe this is a useless feature.  If you want something
>> typeless then the underlying type doesn't matter so we can as well
>> force it to be an array of char.  Makes our live simpler.  And
>> even makes the code portable to compilers that treat arrays of char
>> conservatively.
>>
>
>I just learned that the C11 standard does not guarantee that, and also
>an array of char does not provide the necessary alignment per se, at
>least without alignment attributes.
>
>>>
>>> See cxx_type_contains_byte_buffer: this function looks recursively
>into
>>> structures and unions, and returns the information if the beast
>>> contains an array of unsigned char or std::byte.
>>
>> But with a properly designed middle-end feature that's not needed.
>>
>> There's technically no reason to pessimize TBAA for anything but
>> the typeless storage member of a structure.
>>
>
>Yes, it is just a matter of taste.  And if you want the middle
>end to be flexible here or if everything should work without user
>intervention.
>
>

 @@ -1491,6 +1491,7 @@ struct GTY(()) tree_type_common {
unsigned needs_constructing_flag : 1;
unsigned transparent_aggr_flag : 1;
unsigned restrict_flag : 1;
 +  unsigned typeless_storage_flag : 1;
unsigned contains_placeholder_bits : 2;

ENUM_BITFIELD(machine_mode) mode : 8;

 bits are grouped in groups of 8 bits, this breaks it.

>>>
>>> Oh..., does this explain the problems that I had with this
>version???
>>
>> No, just "cosmetics".
>>
 @@ -8041,7 +8041,8 @@ build_pointer_type_for_mode (tree to_type,
>machine

/* If the pointed-to type has the may_alias attribute set, force
   a TYPE_REF_CAN_ALIAS_ALL pointer to be generated.  */
 -  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (to_type)))
 +  if (TYPE_TYPELESS_STORAGE (to_type)
 +  || lookup_attribute ("may_alias", TYPE_ATTRIBUTES
>(to_type)))
  can_alias_all = true;

/* In some cases, languages will have things that aren't a
>POINTER_TYPE
 @@ -8110,7 +8111,8 @@ build_reference_type_for_mode (tree to_type,
>machi

/* If the pointed-to type has the may_alias attribute set, force
   a TYPE_REF_CAN_ALIAS_ALL pointer to be generated.  

Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-07 Thread Bernd Edlinger
On 04/07/17 17:10, Richard Biener wrote:
> On April 7, 2017 3:37:30 PM GMT+02:00, Bernd Edlinger 
>  wrote:
>> On 04/07/17 08:54, Richard Biener wrote:
>>> On Thu, 6 Apr 2017, Bernd Edlinger wrote:
 I think get_alias_set(t) will return 0 for typeless_storage
 types, and therefore has_zero_child will be set anyway.
 I think both mean the same thing in the end, but it depends on
 what typeless_storage should actually mean, and we have
 not yet the same idea about it.
>>>
>>> But has_zero_child does not do what we like it to because otherwise
>>> in the PR using the char[] array member would have worked!
>>>
>>> has_zero_child doesn't do that on purpose of course, but this means
>>> returing alias-set zero for the typeless storage _member_ doesn't
>>> suffice.
>>>
>>
>> I see you have a certain idea how to solve the C++17 issue.
>> And yes, I apologize, if I tried to pee on your tree :)
>
> We do have the need to support this part of the C++ standard.  For other user 
> code may_alias suffices and I see no reason to haste inventing sth new 
> without a single convincing testcase.  GCC/Language extensions should not be 
> added without a good reason.
>
> I didn't propose to expose the type flag to users at all.
>
> Richard.
>

Well, actually you did:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79671#c100

But I won't argue if you prefer to do it as you like,
after all TBAA is your area.  So that is fine for me.

Attached is the latest version of my patch, I fixed
the issue with the type-flag conversion, and it is
already fully functional.


Bernd.
gcc
2017-04-05  Bernd Edlinger  

* doc/extend.texi: Document the typeless_storage type attribute.
* alias.c (get_alias_set): Handle the typeless_storage attribute.
(record_component_aliases): Likewise.
* tree-streamer-in.c (unpack_ts_type_common_value_fields): Likewise.
* tree-streamer-out.c (pack_ts_type_common_value_fields): Likewise.
* tree.h (TYPE_TYPELESS_STORAGE): New access macro.
* tree-core.h: Update comment.

gcc/c-family
2017-04-05  Bernd Edlinger  

* c-attribs.c (c_common_attribute_tab): Add the typeless_storage
attribute.
(handle_typeless_storage_attribute): New function.

gcc/cp
2017-04-05  Bernd Edlinger  

* class.c (fixup_attribute_variants): Handle the typeless_storage
attribute.
(finish_struct_1): Set the typeless_storage attribute if required
by C++17.
* decl.c (start_enum): Likewise.
* pt.c (lookup_template_class_1): Handle the typeless_storage
attribute.
* typeck2.c (cxx_type_contains_byte_buffer): New function.
* cp-tree.h (cxx_type_contains_byte_buffer): Declare.

gcc/lto
2017-04-05  Bernd Edlinger  

* lto.c (compare_tree_sccs_1, hash_tree): Handle the typeless_storage
attribute.

gcc/testsuite
2017-04-05  Bernd Edlinger  

* c-c++-common/attr-typeless-storage-1.c: New test.
* c-c++-common/attr-typeless-storage-2.c: New test.
* gcc.c-torture/execute/typeless-storage-1.c: New test.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 246678)
+++ gcc/doc/extend.texi	(working copy)
@@ -6656,6 +6656,36 @@
 @option{-fstrict-aliasing}, which is on by default at @option{-O2} or
 above.
 
+@item typeless_storage
+@cindex @code{typeless_storage} type attribute
+In the context of section 6.5 paragraph 6 of the C11 standard,
+an object of this type behaves as if it has no declared type.
+In the context of section 6.5 paragraph 7 of the C11 standard,
+an object or a pointer if this type behaves as if it were a
+character type.
+This attribute is similar to the @code{may_alias} attribute,
+except that it is not restricted to pointers.
+
+Example of use:
+
+@smallexample
+typedef int __attribute__((__typeless_storage__)) int_a;
+
+int
+main (void)
+@{
+  int_a a = 0x12345678;
+  short *b = (short *) &a;
+
+  b[1] = 0;
+
+  if (a == 0x12345678)
+abort();
+
+  exit(0);
+@}
+@end smallexample
+
 @item packed
 @cindex @code{packed} type attribute
 This attribute, attached to @code{struct} or @code{union} type
Index: gcc/alias.c
===
--- gcc/alias.c	(revision 246678)
+++ gcc/alias.c	(working copy)
@@ -879,6 +879,10 @@ get_alias_set (tree t)
   t = TREE_TYPE (t);
 }
 
+  /* Honor the typeless_storage type attribute.  */
+  if (TYPE_TYPELESS_STORAGE (t))
+return 0;
+
   /* Variant qualifiers don't affect the alias set, so get the main
  variant.  */
   t = TYPE_MAIN_VARIANT (t);
@@ -1234,7 +1238,8 @@ record_component_aliases (tree type)
 		/* VECTOR_TYPE and ARRAY_TYPE share the alias set with their
 		   element type and that type has to be normalized to void *,
 		   too, in the case it is a pointer. */
-		while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t))
+		while (!canonical_type_used_

Re: [PATCH] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-07 Thread Dominik Vogt
On Fri, Apr 07, 2017 at 04:34:44PM +0200, Ulrich Weigand wrote:
> > +; Peephole to combine a load-and-test from volatile memory which combine 
> > does
> > +; not do.
> > +(define_peephole2
> > +  [(set (match_operand:GPR 0 "register_operand")
> > +   (match_operand:GPR 2 "memory_operand"))
> > +   (set (reg CC_REGNUM)
> > +   (compare (match_dup 0) (match_operand:GPR 1 "const0_operand")))]
> > +  "s390_match_ccmode(insn, CCSmode) && TARGET_EXTIMM
> > +   && GENERAL_REG_P (operands[0])
> > +   && satisfies_constraint_T (operands[2])"
> > +  [(parallel
> > +[(set (reg:CCS CC_REGNUM)
> > + (compare:CCS (match_dup 2) (match_dup 1)))
> > + (set (match_dup 0) (match_dup 2))])])
> 
> Still wondering why this is necessary.

It's necessary vecause Combine refuses to match anything that
contains a volatile memory reference, using a global flag for
Recog.

> > @@ -6518,13 +6533,30 @@
> >[(parallel
> >  [(set (match_operand:SI 0 "register_operand" "")
> >   (match_operator:SI 1 "s390_eqne_operator"
> > -   [(match_operand:CCZ1 2 "register_operand")
> > +   [(match_operand 2 "cc_reg_operand")
> > (match_operand 3 "const0_operand")]))
> >   (clobber (reg:CC CC_REGNUM))])]
> >""
> > -  "emit_insn (gen_sne (operands[0], operands[2]));
> > -   if (GET_CODE (operands[1]) == EQ)
> > - emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> > +  "machine_mode mode = GET_MODE (operands[2]);
> > +   if (TARGET_Z196)
> > + {
> > +   rtx cond, ite;
> > +
> > +   if (GET_CODE (operands[1]) == NE)
> > +cond = gen_rtx_NE (VOIDmode, operands[2], const0_rtx);
> > +   else
> > +cond = gen_rtx_EQ (VOIDmode, operands[2], const0_rtx);
> > +   ite = gen_rtx_IF_THEN_ELSE (SImode, cond, const1_rtx, const0_rtx);
> > +   emit_insn (gen_rtx_SET (operands[0], ite));
> > + }
> > +   else
> > + {
> > +   if (mode != CCZ1mode)
> > +FAIL;
> > +   emit_insn (gen_sne (operands[0], operands[2]));
> > +   if (GET_CODE (operands[1]) == EQ)
> > +emit_insn (gen_xorsi3 (operands[0], operands[0], const1_rtx));
> > + }
> > DONE;")
> 
> >From what I can see in the rest of the patch, none of the CS changes now
> actually *rely* on this change to cstorecc4 ... s390_expand_cs_tdsi only
> calls cstorecc4 on !TARGET_Z196, where the above change is a no-op, and
> in the TARGET_Z196 case it deliberates does *not* use cstorecc4.

You're right.  After all the refactoring, this part of the patch
has become unused.

> Now, in general this improvement to cstorecc4 is of course valuable
> in itself.  But I think at this point it might be better to separate
> this out into an independent patch (and measure its effect separately).

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



patch to fix PR70478

2017-04-07 Thread Vladimir Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478

  The patch was successfully bootstrapped and tested on x86-64/ppc64/arm64.

  Committed as rev. 246764.

Index: ChangeLog
===
--- ChangeLog	(revision 246763)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2017-04-07  Vladimir Makarov  
+
+	PR rtl-optimization/70478
+	* lra-constraints.c (process_alt_operands): Disfavor alternative
+	insn memory operands.
+
 2017-04-07  Jeff Law  
 
 	* config/iq2000/iq2000.c (final_prescan_insn): Do not separate a
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 246763)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2017-04-07  Vladimir Makarov  
+
+	PR rtl-optimization/70478
+	* gcc.target/s390/pr70478.c: New.
+
 2017-04-07  Martin Liska  
 
 	PR target/79889
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 246763)
+++ lra-constraints.c	(working copy)
@@ -2685,6 +2685,21 @@ process_alt_operands (int only_alternati
 		}
 		}
 
+	  /* When we use memory operand, the insn should read the
+		 value from memory and even if we just wrote a value
+		 into the memory it is costly in comparison with an
+		 insn alternative which does not use memory
+		 (e.g. register or immediate operand).  */
+	  if (no_regs_p && offmemok)
+		{
+		  if (lra_dump_file != NULL)
+		fprintf
+		  (lra_dump_file,
+		   "Using memory insn operand %d: reject+=3\n",
+		   nop);
+		  reject += 3;
+		}
+	  
 #ifdef SECONDARY_MEMORY_NEEDED
 	  /* If reload requires moving value through secondary
 		 memory, it will need one more insn at least.  */
Index: testsuite/gcc.target/s390/pr70478.c
===
--- testsuite/gcc.target/s390/pr70478.c	(nonexistent)
+++ testsuite/gcc.target/s390/pr70478.c	(working copy)
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-w -O3 -march=z9-109" } */
+/* { dg-final { scan-assembler-not "stg" } } */
+
+void foo(unsigned int *a, unsigned char *b)
+{
+  *a &= *b;
+}


Re: patch to fix PR70703

2017-04-07 Thread Vladimir Makarov



On 04/06/2017 06:24 AM, Richard Biener wrote:

On Wed, Apr 5, 2017 at 6:18 PM, Vladimir Makarov  wrote:


On 04/05/2017 12:07 PM, Vladimir Makarov wrote:



I'll correct the patch.


Here is the patch I've committed.

Note that in such contexts it's better to just use [u]int64_t.



You are right, Richard.  I've committed the following patch:

Index: ChangeLog
===
--- ChangeLog   (revision 246764)
+++ ChangeLog   (working copy)
@@ -1,5 +1,11 @@
 2017-04-07  Vladimir Makarov  

+   PR rtl-optimization/70703
+   * ira-color.c (update_conflict_hard_regno_costs): Use
+   int64_t instead of HOST_WIDE_INT.
+
+2017-04-07  Vladimir Makarov  
+
PR rtl-optimization/70478
* lra-constraints.c (process_alt_operands): Disfavor alternative
insn memory operands.
Index: ira-color.c
===
--- ira-color.c (revision 246763)
+++ ira-color.c (working copy)
@@ -1522,7 +1522,7 @@ update_conflict_hard_regno_costs (int *c
index = ira_class_hard_reg_index[aclass][hard_regno];
if (index < 0)
  continue;
-   cost = (int) (((HOST_WIDE_INT) conflict_costs [i] * 
mult) / div);

+   cost = (int) (((int64_t) conflict_costs [i] * mult) / div);
if (cost == 0)
  continue;
cont_p = true;



[PATCH] Fix failure to build libgcc for SH ports

2017-04-07 Thread Jeff Law
The SH ports can occasionally fail to build libgcc, particularly in 
parallel builds:


/var/lib/jenkins/jobs/CROSS-SETUP/workspace/gcc/libgcc/unwind-dw2.c:403:10: 
fatal error: md-unwind-support.h: No such file or directory

 #include "md-unwind-support.h"
  ^
compilation terminated.
/var/lib/jenkins/jobs/CROSS-SETUP/workspace/gcc/libgcc/config/sh/t-sh:49: 
recipe for target 'unwind-dw2-Os-4-200.o' failed



The problem is the SH port defines a target to build a special version 
of unwind-dw2, but fails to depend on $(LIBGCC_LINKS) which sets up 
md-unwind-support.h and other files.  So depending on the exact timing 
we may or may not have set up md-unwind-support.h, leading to occasional 
failures building libgcc.


This patch adds $(LIBGCC_LINKS) to the dependency list for the special 
unwind-dw2.o.  That is not sufficient to fix the problem because 
$(LIBGCC_LINKS) is defined after the inclusion of the target makefile 
fragments.


So this patch also swaps inclusion of the target makefile fragment and 
initialization of LIBGCC_LINKS.  (they were already adjacent, just in 
the wrong order).  I did verify that none of the existing target 
makefile fragments append to LIBGCC_LINKS.


I think there's other missing dependencies here.  But haven't done the 
deep analysis to be 100% sure (libgcc-tm.h in particular).




Installing on the trunk.

Jeff



diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index 427328d..1dc5469 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-04-07  Jeff Law  
+
+   * Makefile.in: Swap definition of LIBGCC_LINKS and inclusion of
+   target makefile fragment.
+   * config/sh/t-sh (unwind-dw2-Os-4-200.o): Depend on LIBGCC_LINKS.
+
 2017-04-07  Alan Modra  
 
PR target/45053
diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index f71d320..6f58fd4 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -348,8 +348,6 @@ LIBUNWIND =
 SHLIBUNWIND_LINK =
 SHLIBUNWIND_INSTALL =
 
-tmake_file = @tmake_file@
-include $(srcdir)/empty.mk $(tmake_file)
 
 # Create links to files specified in config.host.
 LIBGCC_LINKS = enable-execute-stack.c unwind.h md-unwind-support.h \
@@ -366,6 +364,9 @@ sfp-machine.h: $(srcdir)/config/$(sfp_machine_header)
 gthr-default.h: $(srcdir)/$(thread_header)
-$(LN_S) $< $@
 
+tmake_file = @tmake_file@
+include $(srcdir)/empty.mk $(tmake_file)
+
 # Collect target defines and headers from config.host.
 libgcc_tm_defines = @tm_defines@
 libgcc_tm_file = @tm_file@
diff --git a/libgcc/config/sh/t-sh b/libgcc/config/sh/t-sh
index 46d01a6..10869c4 100644
--- a/libgcc/config/sh/t-sh
+++ b/libgcc/config/sh/t-sh
@@ -45,7 +45,7 @@ sdivsi3_i4i-Os-4-200.o: 
$(srcdir)/config/sh/lib1funcs-Os-4-200.S
$(gcc_compile) -c -DL_sdivsi3_i4i $<
 udivsi3_i4i-Os-4-200.o: $(srcdir)/config/sh/lib1funcs-Os-4-200.S
$(gcc_compile) -c -DL_udivsi3_i4i $<
-unwind-dw2-Os-4-200.o: $(srcdir)/unwind-dw2.c
+unwind-dw2-Os-4-200.o: $(srcdir)/unwind-dw2.c $(LIBGCC_LINKS)
$(gcc_compile) $(LIBGCC2_CFLAGS) $(vis_hide) -fexceptions -Os -c $<
 
 OBJS_Os_4_200=sdivsi3_i4i-Os-4-200.o udivsi3_i4i-Os-4-200.o 
unwind-dw2-Os-4-200.o


Re: [PATCH] have chkp skip flexible member arrays (PR #79986)

2017-04-07 Thread Jason Merrill
On Wed, Mar 29, 2017 at 7:07 PM, Martin Sebor  wrote:
> On 03/21/2017 01:33 PM, Jason Merrill wrote:
>> On Tue, Mar 21, 2017 at 11:08 AM, Martin Sebor  wrote:
>>> On 03/20/2017 10:27 PM, Jason Merrill wrote:
 On Mon, Mar 20, 2017 at 7:58 PM, Martin Sebor  wrote:
> On 03/20/2017 05:51 PM, Jason Merrill wrote:
>> On Mon, Mar 20, 2017 at 7:04 PM, Martin Sebor 
>> wrote:
>>>
>>>
>>>
>>> Attached is a minimal patch to avoid an ICE in CHKP upon
>>> encountering one form of an initializer for a flexible array
>>> member, specifically the empty string:
>>>
>>>   int f ()
>>>   {
>>> struct B { int n; char a[]; };
>>>
>>> return ((struct B){ 1, "" }).a[0];
>>>   }
>>>
>>> Although GCC accepts (and doesn't ICE on) non-empty initializers
>>> for flexible array members, such as
>>>
>>> (struct B){ 1, "123" }
>>
>>
>> How do you mean?  When I compile this with the C front end, I get
>>
>> error: non-static initialization of a flexible array member
>
> I meant that G++ accepts it, doesn't ICE, but emits wrong code.
> (it's consistently rejected by the C front end).  Sorry for not
> being clear about it.

 Ah, OK.  It seems to me that the wrong code bug is worth addressing;
 why does rejecting the code seem risky to you?
>>>
>>> I have a few reasons: First, it's not a regression (I raised
>>> bug 69696 for it in early 2016) so strictly it's out of scope
>>> for this stage.  Second, there are a number of bugs related
>>> to the initialization of flexible array members so the fixes
>>> are probably not going to be contained to a single function
>>> or file.  Third, the flexible member array changes I made in
>>> the past were not trivial, took time to develop, and the two
>>> of us iterated over some of them for weeks.  Despite your
>>> careful review and my extensive testing some of them
>>> introduced regressions that are still being patched up.
>>> Fourth, making a change to reject code this close to a release
>>> runs the risk of breaking code that has successfully compiled
>>> in mass rebuilds and others' tests with the new compiler.
>>> While that could be viewed as a good change for invalid code
>>> that's exercised at run time, it could also break programs
>>> where the bad code is never exercised.
>>
>> Fair enough.  But I think the ICE is preferable to wrong code.
>
> I agree in general, although in this case it seems that ICE is
> more likely to penalize -fcheck-pointer-bounds users without
> benefiting anyone else.
>
> I decided to go ahead and implemented your suggestion just in
> case it was easier and safer than I thought.  The attached patch
> is the result.  I think the changes are actually not as risky as
> I had initially feared but they are still fairly extensive and
> I would hesitate to recommend they be made at this stage.

Agreed.  A safer patch for GCC 7 might be to add your find_flexarray
function and call that from finish_compound_literal.

> Regardless of what happens to this patch, for GCC 8, I'd like
> to look into making the initialization work correctly in all
> contexts (i.e., including auto variables) .  As I said in my
> response to Marek, I expect rejecting it will only lead users
> determined enough to make use of flexible array members for
> local variables to invent error-prone hacks.

I would expect such users of C++ to be vanishingly few; who would be
determined to use a C feature in a way not allowed by C?

Jason


Re: [Patch, Fortran, F03] PR 80046: Explicit interface required: pointer argument

2017-04-07 Thread Janus Weil
ping!

2017-03-29 22:25 GMT+02:00 Janus Weil :
> Hi all,
>
> here is a patch that enhances the diagnostics for procedure-pointer
> assignments, so that procedure-pointer components that need an
> explicit interface are correctly rejected.
>
> Regtests cleanly on x86_64-linux-gnu. Ok for trunk?
>
> Cheers,
> Janus
>
>
> 2017-03-29  Janus Weil  
>
> PR fortran/80046
> * expr.c (gfc_check_pointer_assign): Check if procedure pointer
> components in a pointer assignment need an explicit interface.
>
> 2017-03-29  Janus Weil  
>
> PR fortran/80046
> * gfortran.dg/proc_ptr_comp_48.f90: New test case.


[PATCH, rs6000] Update Power9 scheduling of vector and vector load insns

2017-04-07 Thread Pat Haugen
The following patch changes the method of scheduling vector and vector
load insns. Before it tried to pair up like insns and interleave the
pairs, resulting in something like L1L2V1V2. The preferred scheduling is
now to just interleave the insns, resulting in L1V1L2V2. If interleaving
fails, fall back to pairing like insns.

Bootstrap/regtest on powerpc64le-linux with no new regressions. I also
did a -mcpu=power9 build of CPU2006 with no errors. Ok for trunk and
backport to GCC 6 branch?

-Pat



2017-04-07  Pat Haugen  

* rs6000/rs6000.c (vec_load_pendulum): Rename...
(vec_pairing): ...to this.
(power9_sched_reorder2): Rewrite code for pairing vector/vecload insns.
(rs6000_sched_init): Adjust for name change.
(struct rs6000_sched_context): Likewise.
(rs6000_init_sched_context): Likewise.
(rs6000_set_sched_context): Likewise.

Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 246648)
+++ config/rs6000/rs6000.c	(working copy)
@@ -32862,7 +32862,7 @@ static int load_store_pendulum;
 static int divide_cnt;
 /* The following variable helps pair and alternate vector and vector load
insns during scheduling.  */
-static int vec_load_pendulum;
+static int vec_pairing;
 
 
 /* Power4 load update and store update instructions are cracked into a
@@ -33854,183 +33854,115 @@ power9_sched_reorder2 (rtx_insn **ready,
   /* Last insn was the 2nd divide or not a divide, reset the counter.  */
   divide_cnt = 0;
 
-  /* Power9 can execute 2 vector operations and 2 vector loads in a single
-	 cycle.  So try to pair up and alternate groups of vector and vector
-	 load instructions.
+  /* The best dispatch throughput for vector and vector load insns can be
+	 achieved by interleaving a vector and vector load such that they'll
+	 dispatch to the same superslice. If this pairing cannot be achieved
+	 then it is best to pair vector insns together and vector load insns
+	 together.
 
-	 To aid this formation, a counter is maintained to keep track of
-	 vec/vecload insns issued.  The value of vec_load_pendulum maintains
-	 the current state with the following values:
+	 To aid in this pairing, vec_pairing maintains the current state with
+	 the following values:
 
-	 0  : Initial state, no vec/vecload group has been started.
+	 0  : Initial state, no vecload/vector pairing has been started.
 
-	 -1 : 1 vector load has been issued and another has been found on
-		  the ready list and moved to the end.
-
-	 -2 : 2 vector loads have been issued and a vector operation has
-		  been found and moved to the end of the ready list.
-
-	 -3 : 2 vector loads and a vector insn have been issued and a
-		  vector operation has been found and moved to the end of the
-		  ready list.
-
-	 1  : 1 vector insn has been issued and another has been found and
-		  moved to the end of the ready list.
-
-	 2  : 2 vector insns have been issued and a vector load has been
-		  found and moved to the end of the ready list.
-
-	 3  : 2 vector insns and a vector load have been issued and another
-		  vector load has been found and moved to the end of the ready
+	 1  : A vecload or vector insn has been issued and a candidate for
+		  for pairing has been found and moved to the end of the ready
 		  list.  */
   if (type == TYPE_VECLOAD)
 	{
 	  /* Issued a vecload.  */
-	  if (vec_load_pendulum == 0)
+	  if (vec_pairing == 0)
 	{
-	  /* We issued a single vecload, look for another and move it to
-		 the end of the ready list so it will be scheduled next.
-		 Set pendulum if found.  */
+	  int vecload_pos = -1;
+	  /* We issued a single vecload, look for a vector insn to pair it
+		 with.  If one isn't found, try to pair another vecload.  */
 	  pos = lastpos;
 	  while (pos >= 0)
 		{
-		  if (recog_memoized (ready[pos]) >= 0
-		  && get_attr_type (ready[pos]) == TYPE_VECLOAD)
+		  if (recog_memoized (ready[pos]) >= 0)
 		{
-		  tmp = ready[pos];
-		  for (i = pos; i < lastpos; i++)
-			ready[i] = ready[i + 1];
-		  ready[lastpos] = tmp;
-		  vec_load_pendulum = -1;
-		  return cached_can_issue_more;
+		  if (is_power9_pairable_vec_type (get_attr_type
+		   (ready[pos])))
+			{
+			  /* Found a vector insn to pair with, move it to the
+			 end of the ready list so it is scheduled next.  */
+			  tmp = ready[pos];
+			  for (i = pos; i < lastpos; i++)
+			ready[i] = ready[i + 1];
+			  ready[lastpos] = tmp;
+			  vec_pairing = 1;
+			  return cached_can_issue_more;
+			}
+		  else if (get_attr_type (ready[pos]) == TYPE_VECLOAD
+			   && vecload_pos == -1)
+			/* Remember position of first vecload seen.  */
+			vecload_pos = pos;
 		}
 		  pos--;
 		}
-	}
-	  else if (vec_load_pendulum == -1)
-	{
-	  /* This is the second vecload we've issued, search the ready
-	 

Re: C++ PATCH for c++/80095, ICE with this pointer in NSDMI

2017-04-07 Thread Jason Merrill
OK, thanks.

On Wed, Apr 5, 2017 at 5:49 AM, Marek Polacek  wrote:
> Ping.
>
> On Wed, Mar 29, 2017 at 11:32:39PM +0200, Marek Polacek wrote:
>> On Wed, Mar 29, 2017 at 02:56:51PM -0400, Jason Merrill wrote:
>> > On Wed, Mar 29, 2017 at 12:38 PM, Marek Polacek  wrote:
>> > > Here we have a reference initialization with NSDMI and *this.  We are 
>> > > crashing
>> > > because a PLACEHOLDER_EXPR crept into the gimplifier.
>> > >
>> > > This happens since r218653 where set_up_extended_ref_temp was changed to
>> > > use split_nonconstant_init.  As a consequence, cp_gimplify_init_expr 
>> > > might
>> > > now be receiving
>> > >
>> > >   D.2051.p = (void *) &
>> > >
>> > > instead of
>> > >
>> > >   D.2051 = {.p = (void *) &}
>> > >
>> > > where the RHS was a CONSTRUCTOR.  It no longer is, so 
>> > > replace_placeholders is
>> > > not called anymore.  It occurred to me that we should use the same check 
>> > > as in
>> > > store_init_value (i.e. check that the object to be used in the 
>> > > substitution is
>> > > a class), but given what split_nonconstant_init might produce, handle
>> > > COMPONENT_REFs specially.
>> > >
>> > > Bootstrapped/regtested on x86_64-linux, ok for trunk and 6?
>> > >
>> > > 2017-03-29  Marek Polacek  
>> > >
>> > > PR c++/80095 - ICE with this pointer in NSDMI.
>> > > * cp-gimplify.c (cp_gimplify_init_expr): Call 
>> > > replace_placeholders
>> > > when TO is a class.
>> > >
>> > > * g++.dg/cpp1y/nsdmi-aggr8.C: New test.
>> > >
>> > > diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c
>> > > index 354ae1a..e530daf 100644
>> > > --- gcc/cp/cp-gimplify.c
>> > > +++ gcc/cp/cp-gimplify.c
>> > > @@ -496,7 +496,16 @@ cp_gimplify_init_expr (tree *expr_p)
>> > > TREE_TYPE (from) = void_type_node;
>> > > }
>> > >
>> > > -  if (cxx_dialect >= cxx14 && TREE_CODE (sub) == CONSTRUCTOR)
>> > > +  /* split_nonconstant_init might've produced something like
>> > > +D.2051.p = (void *) &
>> > > +in which case we want to substitute the placeholder with
>> > > +D.2051.  */
>> > > +  tree op0 = to;
>> > > +  while (TREE_CODE (op0) == COMPONENT_REF)
>> > > +   op0 = TREE_OPERAND (op0, 0);
>> > > +  tree type = TREE_TYPE (op0);
>> > > +
>> > > +  if (cxx_dialect >= cxx14 && CLASS_TYPE_P (strip_array_types 
>> > > (type)))
>> >
>> > How about doing this checking in replace_placeholders, instead?  That
>> > is, if the object isn't a (member of a) class, just return.
>>
>> Sure.  I also moved the C++14 check into replace_placeholders, although maybe
>> I shouldn't have.
>>
>> Bootstrapped/regtested on x86_64-linux, ok for trunk and 6?
>>
>> 2017-03-29  Marek Polacek  
>>
>>   PR c++/80095
>>   * call.c (build_over_call): Don't check cxx_dialect.
>>   * cp-gimplify.c (cp_gimplify_init_expr): Don't check cxx_dialect nor
>>   whether SUB is a CONSTRUCTOR.
>>   * init.c (build_new_1): Don't check cxx_dialect.
>>   * tree.c (replace_placeholders): Add a function comment.  Return if
>>   not in C++14, or if the object isn't a (member of a) class.
>>   * typeck2.c (store_init_value): Don't check cxx_dialect nor whether
>>   TYPE is CLASS_TYPE_P.
>>
>>   * g++.dg/cpp1y/nsdmi-aggr8.C: New test.
>>
>> diff --git gcc/cp/call.c gcc/cp/call.c
>> index 803fbd4..c15b8e4 100644
>> --- gcc/cp/call.c
>> +++ gcc/cp/call.c
>> @@ -8047,9 +8047,8 @@ build_over_call (struct z_candidate *cand, int flags, 
>> tsubst_flags_t complain)
>>   {
>> arg = cp_build_indirect_ref (arg, RO_NULL, complain);
>> val = build2 (MODIFY_EXPR, TREE_TYPE (to), to, arg);
>> -   if (cxx_dialect >= cxx14)
>> - /* Handle NSDMI that refer to the object being initialized.  */
>> - replace_placeholders (arg, to);
>> +   /* Handle NSDMI that refer to the object being initialized.  */
>> +   replace_placeholders (arg, to);
>>   }
>>else
>>   {
>> diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c
>> index 354ae1a..3f91c35 100644
>> --- gcc/cp/cp-gimplify.c
>> +++ gcc/cp/cp-gimplify.c
>> @@ -496,9 +496,8 @@ cp_gimplify_init_expr (tree *expr_p)
>>   TREE_TYPE (from) = void_type_node;
>>   }
>>
>> -  if (cxx_dialect >= cxx14 && TREE_CODE (sub) == CONSTRUCTOR)
>> - /* Handle aggregate NSDMI.  */
>> - replace_placeholders (sub, to);
>> +  /* Handle aggregate NSDMI.  */
>> +  replace_placeholders (sub, to);
>>
>>if (t == sub)
>>   break;
>> diff --git gcc/cp/init.c gcc/cp/init.c
>> index 7732795..6a70955 100644
>> --- gcc/cp/init.c
>> +++ gcc/cp/init.c
>> @@ -3373,8 +3373,7 @@ build_new_1 (vec **placement, tree type, 
>> tree nelts,
>>object being initialized, replace them now and don't try to
>>preevaluate.  */
>> bool had_placeholder = false;
>> -   if (cxx_dialect >= cxx14
>> -   && !processing_template_decl
>> +   if (!processing_template_decl
>> && TR

Re: [PATCH] Evaluate a SAVE_EXPR before an UBSAN check (PR sanitizer/80350).

2017-04-07 Thread Jakub Jelinek
On Fri, Apr 07, 2017 at 04:26:50PM +0200, Martin Liška wrote:
> Hello.
> 
> Similar to what was done in Marek's r202113, when op1 is a SAVE_EXPR it must
> be evaluated before condition, in order to be able to deliver the operand
> to real shifting. And not just to a BB where ubsan report function is called.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> Apart from that make check RUNTESTFLAGS="ubsan.exp" works on x86_64-linux-gnu.
> 
> Ready to be installed?
> Martin

> >From 2ff2e17d82ee85b09cb5f83afbee70f8b1a84f4f Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Fri, 7 Apr 2017 12:21:44 +0200
> Subject: [PATCH] Evaluate a SAVE_EXPR before an UBSAN check (PR
>  sanitizer/80350).
> 
> gcc/c-family/ChangeLog:
> 
> 2017-04-07  Martin Liska  
> 
>   PR sanitizer/80350
>   * c-ubsan.c (ubsan_instrument_shift): Evaluate RHS before
>   doing an UBSAN check.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-04-07  Martin Liska  
> 
>   PR sanitizer/80350
>   * c-c++-common/ubsan/pr80350.c: New test.
> ---
>  gcc/c-family/c-ubsan.c |  4 +++-
>  gcc/testsuite/c-c++-common/ubsan/pr80350.c | 17 +
>  2 files changed, 20 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/pr80350.c
> 
> diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c
> index 91bdef88320..ef45abdd19e 100644
> --- a/gcc/c-family/c-ubsan.c
> +++ b/gcc/c-family/c-ubsan.c
> @@ -171,7 +171,9 @@ ubsan_instrument_shift (location_t loc, enum tree_code 
> code,
>  
>/* In case we have a SAVE_EXPR in a conditional context, we need to
>   make sure it gets evaluated before the condition.  */
> -  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op0), t);
> +  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t),
> +fold_build2 (COMPOUND_EXPR, TREE_TYPE (op1),
> + unshare_expr (op0), unshare_expr (op1)), t);

For consistency with ubsan_instrument_division and better readability,
can't you:
  /* In case we have a SAVE_EXPR in a conditional context, we need to
 make sure it gets evaluated before the condition.  */
  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op0), t);
  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op1), t);

Ok with that change.

Jakub


Re: [PATCH] Evaluate a SAVE_EXPR before an UBSAN check (PR sanitizer/80350).

2017-04-07 Thread Marek Polacek
On Fri, Apr 07, 2017 at 07:04:08PM +0200, Jakub Jelinek wrote:
> On Fri, Apr 07, 2017 at 04:26:50PM +0200, Martin Liška wrote:
> > Hello.
> > 
> > Similar to what was done in Marek's r202113, when op1 is a SAVE_EXPR it must
> > be evaluated before condition, in order to be able to deliver the operand
> > to real shifting. And not just to a BB where ubsan report function is 
> > called.
> > 
> > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> > Apart from that make check RUNTESTFLAGS="ubsan.exp" works on 
> > x86_64-linux-gnu.
> > 
> > Ready to be installed?
> > Martin
> 
> > >From 2ff2e17d82ee85b09cb5f83afbee70f8b1a84f4f Mon Sep 17 00:00:00 2001
> > From: marxin 
> > Date: Fri, 7 Apr 2017 12:21:44 +0200
> > Subject: [PATCH] Evaluate a SAVE_EXPR before an UBSAN check (PR
> >  sanitizer/80350).
> > 
> > gcc/c-family/ChangeLog:
> > 
> > 2017-04-07  Martin Liska  
> > 
> > PR sanitizer/80350
> > * c-ubsan.c (ubsan_instrument_shift): Evaluate RHS before
> > doing an UBSAN check.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > 2017-04-07  Martin Liska  
> > 
> > PR sanitizer/80350
> > * c-c++-common/ubsan/pr80350.c: New test.
> > ---
> >  gcc/c-family/c-ubsan.c |  4 +++-
> >  gcc/testsuite/c-c++-common/ubsan/pr80350.c | 17 +
> >  2 files changed, 20 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/c-c++-common/ubsan/pr80350.c
> > 
> > diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c
> > index 91bdef88320..ef45abdd19e 100644
> > --- a/gcc/c-family/c-ubsan.c
> > +++ b/gcc/c-family/c-ubsan.c
> > @@ -171,7 +171,9 @@ ubsan_instrument_shift (location_t loc, enum tree_code 
> > code,
> >  
> >/* In case we have a SAVE_EXPR in a conditional context, we need to
> >   make sure it gets evaluated before the condition.  */
> > -  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op0), t);
> > +  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t),
> > +  fold_build2 (COMPOUND_EXPR, TREE_TYPE (op1),
> > +   unshare_expr (op0), unshare_expr (op1)), t);
> 
> For consistency with ubsan_instrument_division and better readability,
> can't you:
>   /* In case we have a SAVE_EXPR in a conditional context, we need to
>  make sure it gets evaluated before the condition.  */
>   t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op0), t);
>   t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op1), t);

Yeah, I don't have the authority to approve the patch, but I was gonna suggest
this change, too.

Marek


Re: [PATCH] S/390: Optimize atomic_compare_exchange and atomic_compare builtins.

2017-04-07 Thread Ulrich Weigand
Dominik Vogt wrote:
> On Fri, Apr 07, 2017 at 04:34:44PM +0200, Ulrich Weigand wrote:
> > > +; Peephole to combine a load-and-test from volatile memory which combine 
> > > does
> > > +; not do.
> > > +(define_peephole2
> > > +  [(set (match_operand:GPR 0 "register_operand")
> > > + (match_operand:GPR 2 "memory_operand"))
> > > +   (set (reg CC_REGNUM)
> > > + (compare (match_dup 0) (match_operand:GPR 1 "const0_operand")))]
> > > +  "s390_match_ccmode(insn, CCSmode) && TARGET_EXTIMM
> > > +   && GENERAL_REG_P (operands[0])
> > > +   && satisfies_constraint_T (operands[2])"
> > > +  [(parallel
> > > +[(set (reg:CCS CC_REGNUM)
> > > +   (compare:CCS (match_dup 2) (match_dup 1)))
> > > + (set (match_dup 0) (match_dup 2))])])
> > 
> > Still wondering why this is necessary.
> 
> It's necessary vecause Combine refuses to match anything that
> contains a volatile memory reference, using a global flag for
> Recog.

So is this specifically to match the pre-test load emitted here?

+  emit_move_insn (output, mem);
+  emit_insn (gen_rtx_SET (cc, gen_rtx_COMPARE (CCZmode, output, cmp)));

If so, since you already know that this should always map to a
LOAD AND TEST, could simply just emit the LT pattern here,
instead of relying on combine to do it ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



C++ PATCH for sanitizer/80348, ICE with -fsanitize=integer-divide-by-zero

2017-04-07 Thread Marek Polacek
This ICEs in ubsan_instrument_division on the assert that checks whether both
operands of the division have the same type.  Well, here they didn't, because
in cp_build_binary_op we first converted both operands to result_type but then
fold_non_dependent_expr changed the type of op0, so we need to catch this case
before calling ubsan_instrument_division.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-04-07  Marek Polacek  

PR sanitizer/80348
* typeck.c (cp_build_binary_op): Convert COP[01] to ORIG_TYPE.

* g++.dg/ubsan/div-by-zero-2.C: New test.

diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 79391c0..65a3435 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -5218,10 +5218,12 @@ cp_build_binary_op (location_t location,
 original result_type.  */
  tree cop0 = op0;
  tree cop1 = op1;
- if (orig_type != NULL && result_type != orig_type)
+ if (orig_type != NULL_TREE)
{
- cop0 = cp_convert (orig_type, op0, complain);
- cop1 = cp_convert (orig_type, op1, complain);
+ if (TREE_TYPE (cop0) != orig_type)
+   cop0 = cp_convert (orig_type, op0, complain);
+ if (TREE_TYPE (cop1) != orig_type)
+   cop1 = cp_convert (orig_type, op1, complain);
}
  instrument_expr = ubsan_instrument_division (location, cop0, cop1);
}
diff --git gcc/testsuite/g++.dg/ubsan/div-by-zero-2.C 
gcc/testsuite/g++.dg/ubsan/div-by-zero-2.C
index e69de29..d500ae6 100644
--- gcc/testsuite/g++.dg/ubsan/div-by-zero-2.C
+++ gcc/testsuite/g++.dg/ubsan/div-by-zero-2.C
@@ -0,0 +1,10 @@
+// PR sanitizer/80348
+// { dg-do compile }
+// { dg-options "-fsanitize=integer-divide-by-zero" }
+
+void
+foo ()
+{
+  if (0)
+unsigned ((0 != 60806) > (0 != 0)) / 0; // { dg-warning "division by zero" 
}
+}

Marek


Re: [PATCH, rs6000] Update Power9 scheduling of vector and vector load insns

2017-04-07 Thread Segher Boessenkool
On Fri, Apr 07, 2017 at 11:58:15AM -0500, Pat Haugen wrote:
> The following patch changes the method of scheduling vector and vector
> load insns. Before it tried to pair up like insns and interleave the
> pairs, resulting in something like L1L2V1V2. The preferred scheduling is
> now to just interleave the insns, resulting in L1V1L2V2. If interleaving
> fails, fall back to pairing like insns.

> +  1  : A vecload or vector insn has been issued and a candidate for
> +   for pairing has been found and moved to the end of the ready

Duplicate word ("for for").

> +   if (is_power9_pairable_vec_type (get_attr_type
> +(ready[pos])))

A good way to battle long lines is to introduce another local var.

Okay for trunk and branch (with the duplicate word fixed, and do
whatever you think best with the long line).  Thanks,


Segher


Re: C++ PATCH for sanitizer/80348, ICE with -fsanitize=integer-divide-by-zero

2017-04-07 Thread Jakub Jelinek
On Fri, Apr 07, 2017 at 07:23:17PM +0200, Marek Polacek wrote:
> This ICEs in ubsan_instrument_division on the assert that checks whether both
> operands of the division have the same type.  Well, here they didn't, because
> in cp_build_binary_op we first converted both operands to result_type but then
> fold_non_dependent_expr changed the type of op0, so we need to catch this case
> before calling ubsan_instrument_division.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2017-04-07  Marek Polacek  
> 
>   PR sanitizer/80348
>   * typeck.c (cp_build_binary_op): Convert COP[01] to ORIG_TYPE.
> 
>   * g++.dg/ubsan/div-by-zero-2.C: New test.

Ok, thanks.

Jakub


Re: [PR59319] output friends in debug info

2017-04-07 Thread Alexandre Oliva
On Mar 21, 2017, Alexandre Oliva  wrote:
> On Jan 27, 2017, Alexandre Oliva  wrote:
>> On Oct 19, 2016, Alexandre Oliva  wrote:
>>> On Sep 23, 2016, Alexandre Oliva  wrote:
 On Aug 30, 2016, Alexandre Oliva  wrote:
> Handling non-template friends is kind of easy, [...]
 Ping?
>>> Ping?  (conflicts resolved, patch refreshed and retested)
>> Ping?  (trivial conflicts resolved)
> Ping?  https://gcc.gnu.org/ml/gcc-patches/2017-01/msg02112.html
Ping?

>> Handling non-template friends is kind of easy, but it required a bit
>> of infrastructure in dwarf2out to avoid (i) forcing debug info for
>> unused types or functions: DW_TAG_friend DIEs are only emitted if
>> their DW_AT_friend DIE is emitted, and (ii) creating DIEs for such
>> types or functions just to have them discarded at the end.  To this
>> end, I introduced a list (vec, actually) of types with friends,
>> processed at the end of the translation unit, and a list of
>> DW_TAG_friend DIEs that, when we're pruning unused types, reference
>> DIEs that are still not known to be used, revisited after we finish
>> deciding all other DIEs, so that we prune DIEs that would have
>> referenced pruned types or functions.

>> Handling template friends turned out to be trickier: there's no
>> representation in DWARF for templates.  I decided to give debuggers as
>> much information as possible, enumerating all specializations of
>> friend templates and outputting DW_TAG_friend DIEs referencing them as
>> well.  I considered marking those as DW_AT_artificial, to indicate
>> they're not explicitly stated in the source code, but in the end we
>> decided that was not useful.  The greatest challenge was to enumerate
>> all specializations of a template.  It looked trivial at first, given
>> DECL_TEMPLATE_INSTANTIATIONS, but it won't list specializations of
>> class-scoped functions and of nested templates.  For other templates,
>> I ended up writing code to look for specializations in the hashtables
>> of decl or type specializations.  That's not exactly efficient, but it
>> gets the job done.


>> for gcc/ChangeLog

>> PR debug/59319
>> * dwarf2out.c (class_types_with_friends): New.
>> (gen_friend_tags_for_type, gen_friend_tags): New.
>> (gen_member_die): Record class types with friends.
>> (deferred_marks): New.
>> (prune_unused_types_defer_undecided_mark_p): New.
>> (prune_unused_types_defer_mark): New.
>> (prune_unused_types_deferred_walk): New.
>> (prune_unused_types_walk): Defer DW_TAG_friend.
>> (prune_unused_types): Check deferred marks is empty on entry,
>> empty it after processing.
>> (dwarf2out_finish): Generate friend tags.
>> (dwarf2out_early_finish): Likewise.
>> * langhooks-def.h (LANG_HOOKS_GET_FRIENDS): New.
>> (LANG_HOOKS_FOR_TYPES_INITIALIZER): Add it.
>> * langhooks.h (lang_hooks_for_types): Add get_friends.
>> * hooks.c (hook_tree_const_tree_int_null): New.
>> * hooks.h (hook_tree_const_tree_int_null): Declare.

>> for gcc/cp/ChangeLog

>> PR debug/59319
>> * cp-objcp-common.c (cp_get_friends): New.
>> * cp-objcp-common.h (cp_get_friends): Declare.
>> (LANG_HOOKS_GET_FRIENDS): Override.
>> * cp-tree.h (enumerate_friend_specializations): Declare.
>> * pt.c (optimize_friend_specialization_lookup_p): New.
>> (retrieve_friend_specialization): New.
>> (enumerate_friend_specializations): New.
>> (register_specialization): Update DECL_TEMPLATE_INSTANTIATIONS
>> for functions, even after definition, if we are emitting debug
>> info.

>> for gcc/testsuite/ChangeLog

>> PR debug/59319
>> * g++.dg/debug/dwarf2/friend-1.C: New.
>> * g++.dg/debug/dwarf2/friend-2.C: New.
>> * g++.dg/debug/dwarf2/friend-3.C: New.
>> * g++.dg/debug/dwarf2/friend-4.C: New.
>> * g++.dg/debug/dwarf2/friend-5.C: New.
>> * g++.dg/debug/dwarf2/friend-6.C: New.
>> * g++.dg/debug/dwarf2/friend-7.C: New.
>> * g++.dg/debug/dwarf2/friend-8.C: New.
>> * g++.dg/debug/dwarf2/friend-9.C: New.
>> * g++.dg/debug/dwarf2/friend-10.C: New.
>> * g++.dg/debug/dwarf2/friend-11.C: New.
>> * g++.dg/debug/dwarf2/friend-12.C: New.
>> * g++.dg/debug/dwarf2/friend-13.C: New.
>> * g++.dg/debug/dwarf2/friend-14.C: New.
>> * g++.dg/debug/dwarf2/friend-15.C: New.
>> * g++.dg/debug/dwarf2/friend-16.C: New.
>> * g++.dg/debug/dwarf2/friend-17.C: New.
>> * g++.dg/debug/dwarf2/friend-18.C: New.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [libcp1] handle anon aggregates linkage-named by typedefs

2017-04-07 Thread Alexandre Oliva
Ping?  https://gcc.gnu.org/ml/gcc-patches/2017-03/msg01143.html

On Mar 21, 2017, Alexandre Oliva  wrote:

> for  libcc1/ChangeLog
>   * libcp1plugin.cc (plugin_build_decl): Propagate typedef name to
>   anonymous aggregate target type.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[patch] aarch64 FreeBSD wchar_t

2017-04-07 Thread Andreas Tobler

Hi all,

I'm going to commit this patch to all active branches (5,6,7) in the 
next hours.


A few failures less in the testsuite.

Thanks,
Andreas

2017-04-07  Andreas Tobler  

* config/aarch64/aarch64-freebsd.h: Define WCHAR_TYPE.

===
--- aarch64-freebsd.h   (revision 246772)
+++ aarch64-freebsd.h   (working copy)
@@ -91,4 +91,7 @@
 #undef TARGET_BINDS_LOCAL_P
 #define TARGET_BINDS_LOCAL_P default_binds_local_p_2

+#undef  WCHAR_TYPE
+#define WCHAR_TYPE  "unsigned int"
+
 #endif  /* GCC_AARCH64_FREEBSD_H */


Re: C++ PATCH to fix ICE in replace_placeholders_r (PR c++/79937)

2017-04-07 Thread Jason Merrill
On Fri, Mar 24, 2017 at 12:22 PM, Marek Polacek  wrote:
> On Thu, Mar 23, 2017 at 05:09:58PM -0400, Jason Merrill wrote:
>> On Thu, Mar 23, 2017 at 4:34 PM, Marek Polacek  wrote:
>> > On Tue, Mar 14, 2017 at 02:34:30PM -0400, Jason Merrill wrote:
>> >> On Tue, Mar 14, 2017 at 2:33 PM, Jason Merrill  wrote:
>> >> > On Tue, Mar 7, 2017 at 12:10 PM, Marek Polacek  
>> >> > wrote:
>> >> >> In this testcase we have
>> >> >> C c = bar (X{1});
>> >> >> which store_init_value sees as
>> >> >> c = TARGET_EXPR > >> >> .n=(&)->i}>)>
>> >> >> i.e. we're initializing "c" with a TARGET_EXPR.  We call 
>> >> >> replace_placeholders
>> >> >> that walks the whole tree to substitute the placeholders.  Eventually 
>> >> >> we find
>> >> >> the nested  but that's for another object, 
>> >> >> so we
>> >> >> crash.  Seems that we shouldn't have stepped into the second 
>> >> >> TARGET_EXPR at
>> >> >> all; it has nothing to with "c", it's bar's argument.
>> >> >>
>> >> >> It occurred to me that we shouldn't step into CALL_EXPRs and leave the
>> >> >> placeholders in function arguments to cp_gimplify_init_expr which calls
>> >> >> replace_placeholders for constructors.  Not sure if it's enough to 
>> >> >> handle
>> >> >> CALL_EXPRs like this, anything else?
>> >> >
>> >> > Hmm, we might have a DMI containing a call with an argument referring
>> >> > to *this, i.e.
>> >> >
>> >> > struct A
>> >> > {
>> >> >   int i;
>> >> >   int j = frob (this->i);
>> >> > };
>> >> >
>> >> > The TARGET_EXPR seems like a more likely barrier, but even there we
>> >> > could have something like
>> >> >
>> >> > struct A { int i; };
>> >> > struct B
>> >> > {
>> >> >   int i;
>> >> >   A a = A{this->i};
>> >> > };
>> >> >
>> >> > I think we need replace_placeholders to keep a stack of objects, so
>> >> > that when we see a TARGET_EXPR we add it to the stack and therefore
>> >> > can properly replace a PLACEHOLDER_EXPR of its type.
>> >>
>> >> Or actually, avoid replacing such a PLACEHOLDER_EXPR, but rather leave
>> >> it for later when we lower the TARGET_EXPR.
>> >
>> > Sorry, I don't really follow.  I have a patch that puts TARGET_EXPRs on
>> > a stack, but I don't know how that helps.  E.g. with nsdmi-aggr3.C
>> > we have
>> > B b = TARGET_EXPR > > &>}>
>> > so when we get to that PLACEHOLDER_EXPR, on the stack there's
>> > TARGET_EXPR with type struct A
>> > TARGET_EXPR with type struct B
>> > so the type of the PLACEHOLDER_EXPR doesn't match the type of the current
>> > TARGET_EXPR, but we still want to replace it in this case.
>> >
>> > So -- could you expand a bit on what you had in mind, please?
>>
>> So then when we see a placeholder, we walk the stack to find the
>> object of the matching type.
>>
>> But if the object we find was collected from walking through a
>> TARGET_EXPR, we should leave the PLACEHOLDER_EXPR alone, so that it
>> can be replaced later with the actual target of the initialization.
>
> Unfortunately, I still don't understand; guess I'll have to drop this PR.
>
> With this we put TARGET_EXPRs on a stack, and then when we find a
> PLACEHOLDER_EXPR we walk the stack to find a TARGET_EXPR of the same type as
> the PLACEHOLDER_EXPR.  There are three simplified examples I've been playing
> with:
>
>   B b = T_E >}>
>
>   - here we should replace the P_E; on the stack there are two
> TARGET_EXPRs of types B and A
>
>   C c = T_E >)>
>
>   - here we shouldn't replace the P_E; on the stack there are two
> TARGET_EXPRs of types X and C
>
>   B b = T_E }}>
>
>   - here we should replace the P_E; on the stack there's one TARGET_EXPR
> of type B
>
> In each case we find a TARGET_EXPR of the type of the PLACEHOLDER_EXPR, but I
> don't see how to decide which PLACEHOLDER_EXPR we should let slide.  Sorry for
> being dense...

I was thinking that we want to replace the type of the first entry in
the stack (B, C, B respectively), and leave others alone.

Jason


[PATCH][PR target/80358][7 regression] Fix boundary check error in expand_block_compare

2017-04-07 Thread Aaron Sawdey
Turns out we get passed const -1 for the length arg from this code.
ROUND_UP adds load_mode_size to that resulting in a small positive
number, hilarity ensues. Fixed by computing a sensible limit and using
IN_RANGE instead, which won't overflow in this way.

OK for trunk if bootstrap/regtest in progress passes?

2017-04-07  Aaron Sawdey  

PR target/80358
* config/rs6000/rs6000.c (expand_block_compare): Fix boundary check.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 246771)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -19672,8 +19672,9 @@
   unsigned int load_mode_size = GET_MODE_SIZE (load_mode);
 
   /* We don't want to generate too much code.  */
-  if (ROUND_UP (bytes, load_mode_size) / load_mode_size
-  > (unsigned HOST_WIDE_INT) rs6000_block_compare_inline_limit)
+  unsigned HOST_WIDE_INT max_bytes =
+load_mode_size * (unsigned HOST_WIDE_INT) 
rs6000_block_compare_inline_limit;
+  if (!IN_RANGE (bytes, 1, max_bytes))
 return false;
 
   bool generate_6432_conversion = false;
-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: [PATCH v2,rs6000] PR80108: Fix ICE with cross compiler

2017-04-07 Thread Segher Boessenkool
Hi Kelvin,

Sorry this has fallen by the wayside.  I did mark it "to do", and then
promptly buried it.  Whoops.

On Thu, Apr 06, 2017 at 08:36:00AM -0600, Kelvin Nilsen wrote:

> +dg-test $gfortran_test_path/[lindex $args 1] ""
> $gfortran_aux_module_flags

Your mailer messed up this line (and some later).

Okay for trunk.  Thanks!


Segher


[C++ PATCH] New warning for extra semicolons after in-class function definitions

2017-04-07 Thread Volker Reichelt
Hi,

with the following patch I suggest to add a diagnostic for extra
semicolons after in-class member function definitions:

  struct A
  {
A() {};
void foo() {};
friend void bar() {};
  };

Although they are allowed in the C++ standard, people (including me)
often like to get rid of them for stylistic/consistency reasons.
In fact clang has a warning -Wextra-semi for this.

Also in GCC (almost exactly 10 years ago) there was a patch
https://gcc.gnu.org/ml/gcc-cvs/2007-03/msg00841.html
to issue a pedwarn (which had to be reverted as GCC would reject valid
code because of the pedwarn).

Instead of using pewarn the patch below adds a new warning (named like
clang's) to warn about these redundant semicolons.
Btw, clang's warning message "extra ';' after member function definition"
is slightly incorrect because it is also emitted for friend functions
which are not member-functions. That's why I suggest a different wording:

  Wextra-semi.C:3:9: warning: extra ';' after in-class function definition 
[-Wextra-semi]
 A() {};
   ^
  Wextra-semi.C:4:16: warning: extra ';' after in-class function definition 
[-Wextra-semi]
 void foo() {};
  ^
  Wextra-semi.C:5:23: warning: extra ';' after in-class function definition 
[-Wextra-semi]
 friend void bar() {};
 ^

Bootstrapped and regtested on x86_64-pc-linux-gnu.

OK for stage 1, once GCC 7 has branched?

Regards,
Volker


2017-04-07  Volker Reichelt  

* c.opt (Wextra-semi): New C++ warning flag.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 246752)
+++ gcc/c-family/c.opt  (working copy)
@@ -504,6 +504,10 @@
 C ObjC C++ ObjC++ Warning
 ; in common.opt
 
+Wextra-semi
+C++ Var(warn_extra_semi) Warning
+Warn about semicolon after in-class function definition.
+
 Wfloat-conversion
 C ObjC C++ ObjC++ Var(warn_float_conversion) Warning LangEnabledBy(C ObjC C++ 
ObjC++,Wconversion)
 Warn for implicit type conversions that cause loss of floating point precision.

2017-04-07  Volker Reichelt  

* parser.c (cp_parser_member_declaration): Add warning for
extra semicolon after in-class function definition.

Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c (revision 246752)
+++ gcc/cp/parser.c (working copy)
@@ -23386,7 +23386,11 @@
  token = cp_lexer_peek_token (parser->lexer);
  /* If the next token is a semicolon, consume it.  */
  if (token->type == CPP_SEMICOLON)
-   cp_lexer_consume_token (parser->lexer);
+   {
+ cp_lexer_consume_token (parser->lexer);
+ warning (OPT_Wextra_semi, "extra %<;%> "
+  "after in-class function definition");
+   }
  goto out;
}
  else

2017-04-07  Volker Reichelt  

* g++.dg/warn/Wextra-semi.C: New test.

Index: gcc/testsuite/g++.dg/warn/Wextra-semi.C
===
--- gcc/testsuite/g++.dg/warn/Wextra-semi.C 2017-04-07
+++ gcc/testsuite/g++.dg/warn/Wextra-semi.C 2017-04-07
@@ -0,0 +1,8 @@
+// { dg-options "-Wextra-semi" }
+
+struct A
+{
+  A() {};  // { dg-warning "after in-class function definition" }
+  void foo() {};   // { dg-warning "after in-class function definition" }
+  friend void bar() {};// { dg-warning "after in-class function 
definition" }
+};
===



[C++ PATCH] New warning for extra semicolons after in-class function definitions

2017-04-07 Thread Volker Reichelt
Hi,

with the following patch I suggest to add a diagnostic for extra
semicolons after in-class member function definitions:

  struct A
  {
A() {};
void foo() {};
friend void bar() {};
  };

Although they are allowed in the C++ standard, people (including me)
often like to get rid of them for stylistic/consistency reasons.
In fact clang has a warning -Wextra-semi for this.

Also in GCC (almost exactly 10 years ago) there was a patch
https://gcc.gnu.org/ml/gcc-cvs/2007-03/msg00841.html
to issue a pedwarn (which had to be reverted as GCC would reject valid
code because of the pedwarn).

Instead of using pewarn the patch below adds a new warning (named like
clang's) to warn about these redundant semicolons.
Btw, clang's warning message "extra ';' after member function definition"
is slightly incorrect because it is also emitted for friend functions
which are not member-functions. That's why I suggest a different wording:

  Wextra-semi.C:3:9: warning: extra ';' after in-class function definition 
[-Wextra-semi]
 A() {};
   ^
  Wextra-semi.C:4:16: warning: extra ';' after in-class function definition 
[-Wextra-semi]
 void foo() {};
  ^
  Wextra-semi.C:5:23: warning: extra ';' after in-class function definition 
[-Wextra-semi]
 friend void bar() {};
 ^

Bootstrapped and regtested on x86_64-pc-linux-gnu.

OK for stage 1, once GCC 7 has branched?

Regards,
Volker


2017-04-07  Volker Reichelt  

* c.opt (Wextra-semi): New C++ warning flag.

Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 246752)
+++ gcc/c-family/c.opt  (working copy)
@@ -504,6 +504,10 @@
 C ObjC C++ ObjC++ Warning
 ; in common.opt
 
+Wextra-semi
+C++ Var(warn_extra_semi) Warning
+Warn about semicolon after in-class function definition.
+
 Wfloat-conversion
 C ObjC C++ ObjC++ Var(warn_float_conversion) Warning LangEnabledBy(C ObjC C++ 
ObjC++,Wconversion)
 Warn for implicit type conversions that cause loss of floating point precision.

2017-04-07  Volker Reichelt  

* parser.c (cp_parser_member_declaration): Add warning for
extra semicolon after in-class function definition.

Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c (revision 246752)
+++ gcc/cp/parser.c (working copy)
@@ -23386,7 +23386,11 @@
  token = cp_lexer_peek_token (parser->lexer);
  /* If the next token is a semicolon, consume it.  */
  if (token->type == CPP_SEMICOLON)
-   cp_lexer_consume_token (parser->lexer);
+   {
+ cp_lexer_consume_token (parser->lexer);
+ warning (OPT_Wextra_semi, "extra %<;%> "
+  "after in-class function definition");
+   }
  goto out;
}
  else

2017-04-07  Volker Reichelt  

* g++.dg/warn/Wextra-semi.C: New test.

Index: gcc/testsuite/g++.dg/warn/Wextra-semi.C
===
--- gcc/testsuite/g++.dg/warn/Wextra-semi.C 2017-04-07
+++ gcc/testsuite/g++.dg/warn/Wextra-semi.C 2017-04-07
@@ -0,0 +1,8 @@
+// { dg-options "-Wextra-semi" }
+
+struct A
+{
+  A() {};  // { dg-warning "after in-class function definition" }
+  void foo() {};   // { dg-warning "after in-class function definition" }
+  friend void bar() {};// { dg-warning "after in-class function 
definition" }
+};
===



Re: [PATCH] Add a new type attribute always_alias (PR79671)

2017-04-07 Thread Jason Merrill
On Fri, Apr 7, 2017 at 11:32 AM, Bernd Edlinger
 wrote:
> On 04/07/17 17:10, Richard Biener wrote:
>> On April 7, 2017 3:37:30 PM GMT+02:00, Bernd Edlinger 
>>  wrote:
>>> On 04/07/17 08:54, Richard Biener wrote:
 On Thu, 6 Apr 2017, Bernd Edlinger wrote:
> I think get_alias_set(t) will return 0 for typeless_storage
> types, and therefore has_zero_child will be set anyway.
> I think both mean the same thing in the end, but it depends on
> what typeless_storage should actually mean, and we have
> not yet the same idea about it.

 But has_zero_child does not do what we like it to because otherwise
 in the PR using the char[] array member would have worked!

 has_zero_child doesn't do that on purpose of course, but this means
 returing alias-set zero for the typeless storage _member_ doesn't
 suffice.

>>>
>>> I see you have a certain idea how to solve the C++17 issue.
>>> And yes, I apologize, if I tried to pee on your tree :)
>>
>> We do have the need to support this part of the C++ standard.  For other 
>> user code may_alias suffices and I see no reason to haste inventing sth new 
>> without a single convincing testcase.  GCC/Language extensions should not be 
>> added without a good reason.
>>
>> I didn't propose to expose the type flag to users at all.
>>
>> Richard.
>>
>
> Well, actually you did:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79671#c100
>
> But I won't argue if you prefer to do it as you like,
> after all TBAA is your area.  So that is fine for me.
>
> Attached is the latest version of my patch, I fixed
> the issue with the type-flag conversion, and it is
> already fully functional.

I agree that we don't want a new attribute.

This should not be limited to C++17 mode; std::aligned_storage is in
C++11 and we might as well have the same behavior in C++98 mode.

Jason


Re: [PATCH][PR target/80358][7 regression] Fix boundary check error in expand_block_compare

2017-04-07 Thread Segher Boessenkool
On Fri, Apr 07, 2017 at 02:38:50PM -0500, Aaron Sawdey wrote:
> Turns out we get passed const -1 for the length arg from this code.
> ROUND_UP adds load_mode_size to that resulting in a small positive
> number, hilarity ensues.

Glad you liked it as well ;-)

> Fixed by computing a sensible limit and using
> IN_RANGE instead, which won't overflow in this way.
> 
> OK for trunk if bootstrap/regtest in progress passes?

Yes, looks good.  Thanks,


Segher


> 2017-04-07  Aaron Sawdey  
> 
>   PR target/80358
>   * config/rs6000/rs6000.c (expand_block_compare): Fix boundary check.
> 
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c  (revision 246771)
> +++ gcc/config/rs6000/rs6000.c  (working copy)
> @@ -19672,8 +19672,9 @@
>unsigned int load_mode_size = GET_MODE_SIZE (load_mode);
>  
>/* We don't want to generate too much code.  */
> -  if (ROUND_UP (bytes, load_mode_size) / load_mode_size
> -  > (unsigned HOST_WIDE_INT) rs6000_block_compare_inline_limit)
> +  unsigned HOST_WIDE_INT max_bytes =
> +load_mode_size * (unsigned HOST_WIDE_INT) 
> rs6000_block_compare_inline_limit;
> +  if (!IN_RANGE (bytes, 1, max_bytes))
>  return false;
>  
>bool generate_6432_conversion = false;


Re: [C++ PATCH] New warning for extra semicolons after in-class function definitions

2017-04-07 Thread David Malcolm
On Fri, 2017-04-07 at 21:55 +0200, Volker Reichelt wrote:
> Hi,
> 
> with the following patch I suggest to add a diagnostic for extra
> semicolons after in-class member function definitions:
> 
>   struct A
>   {
> A() {};
> void foo() {};
> friend void bar() {};
>   };
> 
> Although they are allowed in the C++ standard, people (including me)
> often like to get rid of them for stylistic/consistency reasons.
> In fact clang has a warning -Wextra-semi for this.
> 
> Also in GCC (almost exactly 10 years ago) there was a patch
> https://gcc.gnu.org/ml/gcc-cvs/2007-03/msg00841.html
> to issue a pedwarn (which had to be reverted as GCC would reject
> valid
> code because of the pedwarn).
> 
> Instead of using pewarn the patch below adds a new warning (named
> like
> clang's) to warn about these redundant semicolons.
> Btw, clang's warning message "extra ';' after member function
> definition"
> is slightly incorrect because it is also emitted for friend functions
> which are not member-functions. That's why I suggest a different
> wording:
> 
>   Wextra-semi.C:3:9: warning: extra ';' after in-class function
> definition [-Wextra-semi]
>  A() {};
>^
>   Wextra-semi.C:4:16: warning: extra ';' after in-class function
> definition [-Wextra-semi]
>  void foo() {};
>   ^
>   Wextra-semi.C:5:23: warning: extra ';' after in-class function
> definition [-Wextra-semi]
>  friend void bar() {};
>  ^
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu.
> 
> OK for stage 1, once GCC 7 has branched?
> Regards,
> Volker
> 
> 
> 2017-04-07  Volker Reichelt  
> 
> * c.opt (Wextra-semi): New C++ warning flag.
> 
> Index: gcc/c-family/c.opt
> ===
> --- gcc/c-family/c.opt  (revision 246752)
> +++ gcc/c-family/c.opt  (working copy)
> @@ -504,6 +504,10 @@
>  C ObjC C++ ObjC++ Warning
>  ; in common.opt
>  
> +Wextra-semi
> +C++ Var(warn_extra_semi) Warning
> +Warn about semicolon after in-class function definition.
> +
>  Wfloat-conversion
>  C ObjC C++ ObjC++ Var(warn_float_conversion) Warning LangEnabledBy(C
> ObjC C++ ObjC++,Wconversion)
>  Warn for implicit type conversions that cause loss of floating point
> precision.
> 
> 2017-04-07  Volker Reichelt  
> 
> * parser.c (cp_parser_member_declaration): Add warning for
> extra semicolon after in-class function definition.
> 
> Index: gcc/cp/parser.c
> ===
> --- gcc/cp/parser.c (revision 246752)
> +++ gcc/cp/parser.c (working copy)
> @@ -23386,7 +23386,11 @@
>   token = cp_lexer_peek_token (parser->lexer);
>   /* If the next token is a semicolon, consume it. 
> */
>   if (token->type == CPP_SEMICOLON)
> -   cp_lexer_consume_token (parser->lexer);
> +   {
> + cp_lexer_consume_token (parser->lexer);
> + warning (OPT_Wextra_semi, "extra %<;%> "
> +  "after in-class function definition");
> +   }

Thanks for posting this.

I'm not a C++ maintainer, but I like the idea (though the patch is
missing at least a doc/invoke.texi change).

A small improvement to this would be to emit a deletion fix-it hint
about the redundant token (so that IDEs have a change of fixing it
easily).

This could be done something like this:

location_t semicolon_loc
   = cp_lexer_consume_token (parser->lexer)->location;
gcc_rich_location richloc (semicolon_loc);
richloc.add_fixit_remove ();
warning_at_richloc (&richloc, OPT_Wextra_semi,
"extra %<;%> after in-class function
definition");


>   goto out;
> }
>   else
> 
> 2017-04-07  Volker Reichelt  
> 
> * g++.dg/warn/Wextra-semi.C: New test.
> 
> Index: gcc/testsuite/g++.dg/warn/Wextra-semi.C
> ===
> --- gcc/testsuite/g++.dg/warn/Wextra-semi.C 2017-04-07
> +++ gcc/testsuite/g++.dg/warn/Wextra-semi.C 2017-04-07
> @@ -0,0 +1,8 @@
> +// { dg-options "-Wextra-semi" }
> +
> +struct A
> +{
> +  A() {};  // { dg-warning "after in-class function
> definition" }
> +  void foo() {};   // { dg-warning "after in-class function
> definition" }
> +  friend void bar() {};// { dg-warning "after in-class
> function definition" }
> +};
> 

If you implement the fix-it idea, then you can add 
-fdiagnostics-show-caret to the dg-options, and use something like:

/* { dg-begin-multiline-output "" }
  A() {};
^
-
   { dg-end-multiline-output "" } */

to express the expected output (copied and pasted from GCC's output). 
 You should omit the commented parts of the lines if you do (otherwise
DejaGnu gets very confused due to them containing dg- directives).

Hope this is constructive
Dave

[PATCH] Fix dwarf2out ICE with self-inlining (PR debug/80321)

2017-04-07 Thread Jakub Jelinek
Hi!

The following C and Ada testcases show ICE due to endless recursion in
dwarf2out.c.  The problem is that when processing BLOCK_NONLOCALIZED_VARS,
we want to treat all the FUNCTION_DECLs in there as mere declarations,
but gen_subprogram_die does:
  int declaration = (current_function_decl != decl
 || class_or_namespace_scope_p (context_die));
and thus if there is some self-inlining and we are unlucky enough
not to reach some early-outs that just ignore the FUNCTION_DECL,
like:
  /* Detect and ignore this case, where we are trying to output
 something we have already output.  */
  if (get_AT (old_die, DW_AT_low_pc)
  || get_AT (old_die, DW_AT_ranges))
return;
we will recurse infinitely.  The following patch fixes it by
just ignoring current_function_decl seen from BLOCK_NONLOCALIZED_VARS,
that implies it is already inlined into somewhere and in the abstract
origin we emit properly a DW_AT_declaration decl if needed, that is pretty
much what gen_subprogram_die would do anyway in such cases, because there
is already old_die, we really don't want to make some child of it its parent
and otherwise no further action is performed.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Other possibilities include adding some global bool flag that
gen_subprogram_die's int declaration = ... above should ignore
decl == current_function_decl that we'd set in decls_for_scope when
seeing current_function_decl in BLOCK_NONLOCALIZED_VARS (and probably
save/clear + restore in dwarf2out_abstract_function where we change
current_function_decl).  Or we could pass through from decls_for_scope
down through process_scope_var, gen_decl_die to gen_subprogram_die
a bool flag force_declaration.

2017-04-07  Jakub Jelinek  

PR debug/80321
* dwarf2out.c (decls_for_scope): Ignore declarations of
current_function_decl in BLOCK_NONLOCALIZED_VARS.

* gcc.dg/debug/pr80321.c: New test.

2017-04-07  Eric Botcazou  

* gnat.dg/debug10.adb: New test.
* gnat.dg/debug10_pkg.ads: New helper.

--- gcc/dwarf2out.c.jj  2017-04-07 11:46:48.0 +0200
+++ gcc/dwarf2out.c 2017-04-07 20:00:43.503772542 +0200
@@ -24889,7 +24889,12 @@ decls_for_scope (tree stmt, dw_die_ref c
for (i = 0; i < BLOCK_NUM_NONLOCALIZED_VARS (stmt); i++)
  {
decl = BLOCK_NONLOCALIZED_VAR (stmt, i);
-   if (TREE_CODE (decl) == FUNCTION_DECL)
+   if (decl == current_function_decl)
+ /* Ignore declarations of the current function, while they
+are declarations, gen_subprogram_die would treat them
+as definitions again, because they are equal to
+current_function_decl and endlessly recurse.  */;
+   else if (TREE_CODE (decl) == FUNCTION_DECL)
  process_scope_var (stmt, decl, NULL_TREE, context_die);
else
  process_scope_var (stmt, NULL_TREE, decl, context_die);
--- gcc/testsuite/gcc.dg/debug/pr80321.c.jj 2017-04-07 21:39:01.930615179 
+0200
+++ gcc/testsuite/gcc.dg/debug/pr80321.c2017-04-07 21:39:49.722982635 
+0200
@@ -0,0 +1,26 @@
+/* PR debug/80321 */
+/* { dg-do compile } */
+/* { dg-options "-fkeep-inline-functions" } */
+
+void bar (void);
+
+static inline void
+test (int x)
+{
+  inline void
+  foo (int x)
+  {
+test (0);
+asm volatile ("" : : : "memory");
+  }
+  if (x != 0)
+foo (x);
+  else
+bar ();
+}
+
+void
+baz (int x)
+{
+  test (x);
+}
--- gcc/testsuite/gnat.dg/debug10.adb.jj2017-04-07 20:24:44.232473780 
+0200
+++ gcc/testsuite/gnat.dg/debug10.adb   2017-04-07 20:26:40.493980722 +0200
@@ -0,0 +1,68 @@
+-- PR debug/80321
+
+-- { dg-do compile }
+-- { dg-options "-O2 -g" }
+
+with Debug10_Pkg; use Debug10_Pkg;
+
+procedure Debug10 (T : Entity_Id) is
+
+   procedure Inner (E : Entity_Id);
+   pragma Inline (Inner);
+
+   procedure Inner (E : Entity_Id) is
+   begin
+  if E /= Empty
+ and then not Nodes (E + 3).Flag16
+  then
+ Debug10 (E);
+  end if;
+   end Inner;
+
+   function Ekind (E : Entity_Id) return Entity_Kind is
+   begin
+  return N_To_E (Nodes (E + 1).Nkind);
+   end Ekind;
+
+begin
+
+   if T = Empty then
+  return;
+   end if;
+
+   Nodes (T + 3).Flag16 := True;
+
+   if Ekind (T) in Object_Kind then
+  Inner (T);
+
+   elsif Ekind (T) in Type_Kind then
+  Inner (T);
+
+  if Ekind (T) in Record_Kind then
+
+ if Ekind (T) = E_Class_Wide_Subtype then
+Inner (T);
+ end if;
+
+  elsif Ekind (T) in Array_Kind then
+ Inner (T);
+
+  elsif Ekind (T) in Access_Kind then
+ Inner (T);
+
+  elsif Ekind (T) in Scalar_Kind then
+
+ if My_Scalar_Range (T) /= Empty
+   and then My_Test (My_Scalar_Range (T))
+ then
+if My_Is_Entity_Name (T) then
+   Inner (T);
+end if;
+
+   

Re: patch to fix PR70478

2017-04-07 Thread Jakub Jelinek
On Fri, Apr 07, 2017 at 12:04:16PM -0400, Vladimir Makarov wrote:
>   The following patch fixes
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478
> 
>   The patch was successfully bootstrapped and tested on x86-64/ppc64/arm64.
> 
>   Committed as rev. 246764.
> 

> Index: ChangeLog
> ===
> --- ChangeLog (revision 246763)
> +++ ChangeLog (working copy)
> @@ -1,3 +1,9 @@
> +2017-04-07  Vladimir Makarov  
> +
> + PR rtl-optimization/70478
> + * lra-constraints.c (process_alt_operands): Disfavor alternative
> + insn memory operands.

This has regressed:
+UNRESOLVED: gfortran.dg/pr68627.f   -O   scan-assembler-not vbroadcastsd[ 
t]+%xmm[0-9]+, %ymm[0-9]+
+FAIL: gfortran.dg/pr68627.f   -O  (internal compiler error)
+FAIL: gfortran.dg/pr68627.f   -O  (test for excess errors)
on x86_64-linux, starting with r246764 there is ICE:
Error: unable to find a register to spill
pr68627.f:16:0: Error: this is the insn:
(insn 202 1300 1177 12 (set (reg:V2DF 1306 [785])
(vec_concat:V2DF (reg:DF 1307 [orig:259 _282 ] [259])
(reg:DF 1421 [orig:263 _313 ] [263]))) "pr68627.f":11 2727 
{vec_concatv2df}
 (expr_list:REG_DEAD (reg:DF 1421 [orig:263 _313 ] [263])
(expr_list:REG_DEAD (reg:DF 1307 [orig:259 _282 ] [259])
(nil
pr68627.f:16:0: internal compiler error: in assign_by_spills, at 
lra-assigns.c:1476

Jakub


Re: [PATCH v2,rs6000] PR80108: Fix ICE with cross compiler

2017-04-07 Thread Bernhard Reutner-Fischer
On 7 April 2017 21:54:24 CEST, Segher Boessenkool  
wrote:
>Hi Kelvin,
>
>Sorry this has fallen by the wayside.  I did mark it "to do", and then
>promptly buried it.  Whoops.
>
>On Thu, Apr 06, 2017 at 08:36:00AM -0600, Kelvin Nilsen wrote:
>
>> +dg-test $gfortran_test_path/[lindex $args 1] ""
>> $gfortran_aux_module_flags
>
>Your mailer messed up this line (and some later).
>
>Okay for trunk.  Thanks!

I suspect /Burnas/s/a/u/
Why do you need a separate fortran.exp?
Can't this be a PPC test in the Fortran testsuit?

Thanks,


Re: [PATCH v2,rs6000] PR80108: Fix ICE with cross compiler

2017-04-07 Thread Segher Boessenkool
Hi Bernhard,

On Sat, Apr 08, 2017 at 02:54:31AM +0200, Bernhard Reutner-Fischer wrote:
> I suspect /Burnas/s/a/u/

Yeah; it seems to be wrong in the original testcase already
(gcc/testsuite/gfortran.dg/streamio_11.f90).

> Why do you need a separate fortran.exp?
> Can't this be a PPC test in the Fortran testsuit?

It could be.  But we have gone that way often enough now that perhaps
it makes more sense to have the tests in gcc.target/ instead.


Segher