[PATCH] Fix PR66916

2015-07-23 Thread Richard Biener

The following adds a single-use restriction to the widening/sign-change
comparison pattern which fixes PR66916.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  Verified the
arm testcase produces expected assembly with a cc1 cross.

Richard.

2015-07-22  Richard Biener  

PR middle-end/66916
* match.pd: Guard widen and sign-change comparison simplification
with single_use.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 226042)
+++ gcc/match.pd(working copy)
@@ -1679,7 +1719,8 @@ (define_operator_list CBRT BUILT_IN_CBRT
  type on targets that require function pointer canonicalization.  */
&& !(targetm.have_canonicalize_funcptr_for_compare ()
&& TREE_CODE (TREE_TYPE (@00)) == POINTER_TYPE
-   && TREE_CODE (TREE_TYPE (TREE_TYPE (@00))) == FUNCTION_TYPE))
+   && TREE_CODE (TREE_TYPE (TREE_TYPE (@00))) == FUNCTION_TYPE)
+   && single_use (@0))
(if (TYPE_PRECISION (TREE_TYPE (@00)) == TYPE_PRECISION (TREE_TYPE (@0))
&& (TREE_CODE (@10) == INTEGER_CST
|| (@1 != @10 && types_match (TREE_TYPE (@10), TREE_TYPE (@00



[Bug fortran/52846] [F2008] Support submodules - part 3/3

2015-07-23 Thread Paul Richard Thomas
Dear All,

This is the third and final patch to implement submodules in gfortran.
It is the part that deals with private module entities. Unfortunately,
it is the most invasive and I would either like to have strong support
for it to be committed or a bright idea as to how to do it otherwise.

Since all the private entities in a module have to be transmitted to
their descendant submodules, whilst keeping them hidden from normal
use statements, I have chosen to write the module file as usual and
add a second part that contains the private entities. This latter is
only read when processing submodule statements.
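
To make the two-part layout concrete, here is a minimal C++ sketch of the idea, not the gfortran code itself. Symbol, ModuleFile, dump_module and use_module are hypothetical names invented for illustration; only the notion of an access check whose result can be inverted is taken from the description of the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a module entity; not gfortran's data structure.
struct Symbol {
  std::string name;
  bool is_private;
};

// Access check whose result can be inverted: with invert == false it
// accepts public symbols, with invert == true it accepts private ones.
bool check_access(const Symbol &sym, bool invert) {
  bool is_public = !sym.is_private;
  return invert ? !is_public : is_public;
}

// Hypothetical model of a module file: public part first, private part after.
struct ModuleFile {
  std::vector<std::string> public_part;
  std::vector<std::string> private_part;
};

// Write the public part as usual, then the private part with the check inverted.
ModuleFile dump_module(const std::vector<Symbol> &symbols) {
  ModuleFile file;
  for (const Symbol &s : symbols)
    if (check_access(s, false))
      file.public_part.push_back(s.name);
  for (const Symbol &s : symbols)
    if (check_access(s, true))
      file.private_part.push_back(s.name);
  return file;
}

// A normal use statement sees only the public part; a submodule statement
// also reads the private part.
std::vector<std::string> use_module(const ModuleFile &file, bool submodule_stmt) {
  std::vector<std::string> visible = file.public_part;
  if (submodule_stmt)
    visible.insert(visible.end(), file.private_part.begin(),
                   file.private_part.end());
  return visible;
}
```

In this model a conventional use statement never touches the second part, so private entities stay hidden from it.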

I looked into encrypting the second part but could not find a way to
obtain the compression ratios that gzipping the module file affords,
largely from the repetition of attribute keywords. It was tempting to
reform completely the format of module files such that the symbol tree
is represented in binary format rather than in text. However, being
able to gunzip the files is very helpful from the diagnostic point of
view. Perhaps this is a suitable future upgrade for 6.0.0? That said,
I do not regard it as being high priority nor necessarily useful.

The other significant change is in respect of making module variable,
string length and procedure pointer declarations unconditionally
TREE_PUBLIC, whilst recycling the conditions to set DECL_VISIBILITY to
VISIBILITY_HIDDEN. This was a suggestion from Richard Biener, which
seems to do what is needed in libraries. This affects two existing
testcases: public_private_module_[2,6].f90, where xfails have been
added, where assembler symbols should be optimized away. These tests
can be removed if the above changes prove to be robust and acceptable
but I was reluctant to do this right away.

The rest of the patch is concerned with signaling to module.c that a
submodule statement is being processed.

It does cross my mind that all of this part of the submodule
implementation could be subject to the condition that a compiler
option is set. I am struck by the notion that making private module
entities available to submodules is an unnecessary complication and
that it amounts to an error in the standard. This is why I am
suggesting the possibility of a specific compiler option.

The new testcase submodule_10.f08 is a near verbatim contribution from
Salvatore Filippone, for which thanks are due.

The remaining tasks are to try to fix PR66762, where submodule_6.f08
fails with -flto, and to update the documentation.

Bootstraps and regtests on FC21/x86_64 - OK for trunk?

Cheers

Paul

2015-07-23  Paul Thomas  

PR fortran/52846
* match.h: Add bool argument to gfc_use_modules so that it can
signal to module.c that a submodule statement is being
processed.
* module.c (read_module): Add new module_locus, 'end_module'.
Set it at the end of the public part of the module file. Then go
there once the public part has been processed, ready to read
the private part of the module file.
(check_access): Change original to 'check_access1' and call it
from 'check_access'. This latter inverts the result, according
to whether or not static 'invert_access' is true.
(gfc_dump_module): Write the public part of the module file as
before and then follow it with the private part, obtained by
setting 'invert_access' true. Once done, this is reset.
(gfc_use_module): Read the public part of the module file. If
this is a submodule and static 'submodule_stmt' is true, then
read the private part. This permits the private part of module
files to be respected with conventional use statements.
(gfc_use_modules): 'submodule_stmt' set true if the ancestor
module file is being used in processing submodule statement.
* parse.c (use_modules): Introduce 'using_ancestor_modules' as
a boolean argument. All calls set this argument false, except:
(parse_module): Call use_modules with 'using_ancestor_modules'
set true to signal the processing of a submodule statement.
* trans-decl.c (gfc_finish_var_decl, gfc_build_qualified_array,
get_proc_pointer_decl): Set TREE_PUBLIC unconditionally and use
the conditions to set DECL_VISIBILITY as hidden and to set as
true DECL_VISIBILITY_SPECIFIED.

2015-07-23  Paul Thomas  

PR fortran/52846

* gfortran.dg/public_private_module_2.f90: Add two XFAILS.
* gfortran.dg/public_private_module_6.f90: Add an XFAIL.
* gfortran.dg/submodule_10.f08: New test.
Index: gcc/fortran/match.h
===
*** gcc/fortran/match.h (revision 226054)
--- gcc/fortran/match.h (working copy)
*** match gfc_match_expr (gfc_expr **);
*** 293,299 
  /* module.c.  */
  match gfc_match_use (void);
  match gfc_match_submodule (void);
! void gfc_use_modules (void);
  
  #endif  /* GFC_MATCH_H  */
  
--- 293,299 
  /* module.c.  */
  match gfc_match_use (void);
  match gfc_match_submodule (void);
! void gfc_

Re: [Bug fortran/52846] [F2008] Support submodules - part 3/3

2015-07-23 Thread Damian Rouson


> On Jul 23, 2015, at 12:46 AM, Paul Richard Thomas 
>  wrote:
> 
> Since all the private entities in a module have to be transmitted to
> their descendant submodules, whilst keeping them hidden from normal
> use statements, I have chosen to write the module file as usual and
> add a second part that contains the private entities. This latter is
> only read when processing submodule statements.

Hi Paul,

Could you comment on whether this approach alleviates compilation cascades, as
seems to have been envisioned when submodules were added to the standard?  My
guess is that a developer could adopt a policy of putting only public
information in a module and reserving all private information for submodules,
which would mitigate unnecessary compilation cascades and would be consistent
with putting the interface in the module and the implementation in a submodule.

> It does cross my mind that all of this part of the submodule
> implementation could be subject to the condition that a compiler
> option is set. I am struck by the notion that making private module
> entities available to submodules is an unnecessary complication and
> that it amounts to an error in the standard. This is why I am
> suggesting the possibility of a specific compiler option.

I strongly advocate against having to pass flags to force standard-compliant
behavior (I happened to have just posted to c.l.f on a frustrating way in which
two compilers currently require flags to comply with the standard), although it
sounds like it might not matter in this case if one adopts the aforementioned
policy of putting only public information in modules.

Damian

Re: [Bug fortran/52846] [F2008] Support submodules - part 3/3

2015-07-23 Thread Paul Richard Thomas
Dear Damian,

I do not think that there is any effect on compilation cascades. As
long as the private part of the module file remains unchanged, it will
not be recompiled if a descendant submodule is modified. Naturally,
the size of the module file is increased but, if one is careful, this
is not a big deal. A gotcha, which I will have to emphasize in the
documentation, occurs if another module file is used and its symbols
are not exposed by public statements. If there are large numbers of
symbols this can have a big effect on the size of the module file. I
noticed this, when examining one of gfortran's testcases where the
ISO_C_BINDING intrinsic module is used. Generous sprinklings of USE
ONLYs are required to keep the module file sizes under control.

I am not over enthusiastic about using compilation flags to uphold
standards either.

Cheers

Paul

On 23 July 2015 at 10:22, Damian Rouson  wrote:
>
>
>> On Jul 23, 2015, at 12:46 AM, Paul Richard Thomas 
>>  wrote:
>>
>> Since all the private entities in a module have to be transmitted to
>> their descendant submodules, whilst keeping them hidden from normal
>> use statements, I have chosen to write the module file as usual and
>> add a second part that contains the private entities. This latter is
>> only read when processing submodule statements.
>
> Hi Paul,
>
> Could you comment on whether this approach alleviates compilation cascades as
> seems to have been envisioned when submodules were added to the standard?  My
> guess is that a developer could adopt a policy of putting only public 
> information in a
> module and reserving all private information for submodules, which would 
> mitigate
> against unnecessary compilation cascades and would be consistent with putting
> the interface in the module and the implementation in a submodule..
>
>> It does cross my mind that all of this part of the submodule
>> implementation could be subject to the condition that a compiler
>> option is set. I am struck by the notion that making private module
>> entities available to submodules is an unnecessary complication and
>> that it amounts to an error in the standard. This is why I am
>> suggesting the possibility of a specific compiler option.
>
> I strongly advocate against having to pass flags to force standard-compliant 
> behavior
> (I happened to have just posted to c.l.f on a frustrating way in which two 
> compilers
> currently require flags to comply with the standard), although it sounds like 
> it might
> not matter in this case if one adopts the aforementioned policy
> of putting only public information in modules.
>
> Damian



-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's
too dark to read.

Groucho Marx


Re: libstdc++: more __intN tweaks

2015-07-23 Thread Jonathan Wakely

On 22/07/15 22:26 -0400, DJ Delorie wrote:


Another place where a list of "all" types is explicitly listed and
the __intN types need to be included, plus protection elsewhere against
errors [-Wnarrowing] on targets that have a small size_t.  Ok?

* include/bits/functional_hash.h: Add specializations for __intN
types.

* include/ext/pb_ds/detail/thin_heap_/thin_heap_.hpp (__gnu_pbds):
Guard against values that might exceed size_t's precision.


Yes, OK - thanks.



Re: [AArch64/wwwdoc] Document -fpic support for small memory model

2015-07-23 Thread Jiong Wang

James Greenhalgh writes:

> On Fri, Jun 26, 2015 at 02:45:39PM +0100, Jiong Wang wrote:
>> 
>> Marcus Shawcroft writes:
>> 
>> 2015-06-26  Jiong Wang  
>> 
>> wwwdocs/
>>   * htdocs/gcc-6/changes.html (AArch64): Document -fpic for small model.
>> 
>
>> Index: gcc-6/changes.html
>> ===
>> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
>> retrieving revision 1.12
>> diff -u -r1.12 changes.html
>> --- gcc-6/changes.html   16 Jun 2015 08:48:02 -  1.12
>> +++ gcc-6/changes.html   26 Jun 2015 13:30:05 -
>> @@ -90,6 +90,15 @@
>> If GCC is unable to detect the host CPU these options have no effect.
>>   
>> 
>> +   
>
> This should be a new <li> (list item) in the above <ul> (unordered list),
> rather than a new .

thanks, fixed.

>
>> + 
>> +   -fpic is now supported on AArch64 for small memory
>> +   model. 
>
> In invoke.texi we describe -mcmodel as the "small code model" rather
> than as a "memory model". How about rewording this as so:
>
>   -fpic is now supported by the AArch64 target when generating
>   code for the small code model (-mcmodel=small).

fixed.

>
>> Compared with -fPIC, -fpic
>> +   will guide GCC to generate more efficient position independent
>> +   instruction sequences when accessing global objects and
>> +   28KiB/15KiB global offset table size supported under ILP64/32.
>
> I'm not sure this part is needed, the difference between -fpic and -fPIC
> is already covered by invoke.texi. If you do want to include this text,
> I might try rewriting it as:
>
>   -fpic generates position-independent code which accesses all
>   constant addresses through a global offset table (GOT). For AArch64, the
>   size of the GOT is limited to 28KiB under the LP64 SysV ABI, and 15KiB
>   under the ILP32 SysV ABI.

As this page documents changes, I combined your two rewordings as:

  -fpic is now supported by the AArch64 target when generating
  code for the small code model (-mcmodel=small).  The size of
  the GOT is limited to 28KiB under the LP64 SysV ABI, and 15KiB under the
  ILP32 SysV ABI.
  
>
> As I was looking in invoke.texi, do we want to document the limits on our
> GOT size there as other targets have?

Maybe, I haven't touched invoke.texi in this patch.

>
> "These maximums are 8k on the SPARC and 32k on the m68k and RS/6000.
>  The x86 has no such limit."

patch updated. Ok for trunk?

2015-07-23  Jiong Wang  
 
wwwdocs/
  * htdocs/gcc-6/changes.html (AArch64): Document -fpic for small model.

-- 
Regards,
Jiong

Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.12
diff -u -r1.12 changes.html
--- htdocs/gcc-6/changes.html	16 Jun 2015 08:48:02 -	1.12
+++ htdocs/gcc-6/changes.html	23 Jul 2015 08:45:11 -
@@ -89,6 +89,12 @@
rewrite these options to the optimal setting for that system.
If GCC is unable to detect the host CPU these options have no effect.
  
+ 
+   -fpic is now supported by the AArch64 target when generating
+   code for the small code model (-mcmodel=small).  The size of
+   the GOT is limited to 28KiB under the LP64 SysV ABI, and 15KiB under the
+   ILP32 SysV ABI.
+ 

 
 


[gomp4] libgomp: Some torture testing for C and C++ OpenACC test cases (was: [gomp] Move openacc vector& worker single handling to RTL)

2015-07-23 Thread Thomas Schwinge
Hi!

On Wed, 22 Jul 2015 12:47:32 -0400, Nathan Sidwell  wrote:
> On 07/20/15 11:08, Nathan Sidwell wrote:
> > On 07/20/15 09:01, Nathan Sidwell wrote:
> >> On 07/18/15 11:37, Thomas Schwinge wrote:
> >>> For OpenACC nvptx offloading, there must still be something wrong; here's
> >>> a count of the (non-deterministic!) regressions of ten runs of the
> >>> libgomp testsuite.

> Thomas helped me reproduce them -- they are very intermittent.  Anyway, fixed 
> with the attached patch I've committed to gomp branch.

\o/

> This appears to fix all the -O0 regressions you observed Thomas.

Thanks, confirmed!


To get better test coverage for device-specific code that is only ever
used in offloading configurations, it's a good idea to do a (limited) set
of torture testing also for some libgomp C and C++ test cases (it's done
for all testing in Fortran): those that are dealing with the specifics of
gang/worker/vector single/redundant/partitioned modes.  They're selected
based on their file names -- not a perfect property to detect such test
cases, but should be sufficient.  To avoid testing time exploding too
much, limit any torture testing to -O0 and -O2 only, under the assumption
that between -O0 and -O[something] there is the biggest difference in the
overall structure of the generated code.
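
As a rough illustration of the selection described above (the real logic is the Tcl in the .exp files below), here is a C++ sketch of splitting a test list by file name. TestSplit, is_torture_candidate and split_tests are hypothetical names, and the substring match is a simplification of the *gang*.c, *worker*.c and *vec*.c globs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container for the two resulting lists.
struct TestSplit {
  std::vector<std::string> tests;   // run once with the default options
  std::vector<std::string> ttests;  // run with each torture option (-O0, -O2)
};

// True if the file name contains one of the markers used for selection.
bool is_torture_candidate(const std::string &file) {
  static const char *const markers[] = {"gang", "worker", "vec"};
  for (const char *marker : markers)
    if (file.find(marker) != std::string::npos)
      return true;
  return false;
}

// Sort, drop duplicates, then move the matching files into ttests so that
// no file ends up in both lists (tests := tests - ttests).
TestSplit split_tests(std::vector<std::string> all) {
  std::sort(all.begin(), all.end());
  all.erase(std::unique(all.begin(), all.end()), all.end());
  TestSplit out;
  for (const std::string &file : all)
    (is_torture_candidate(file) ? out.ttests : out.tests).push_back(file);
  return out;
}
```

Selecting by name can pick up unrelated files or miss relevant ones, which is the imperfection already acknowledged above.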

Committed to gomp-4_0-branch in r226091:

commit b1bd5f92c3f536ebab9b36510636c7ab845123f8
Author: tschwinge 
Date:   Thu Jul 23 08:50:15 2015 +

libgomp: Some torture testing for C and C++ OpenACC test cases

libgomp/
* testsuite/libgomp.oacc-c++/c++.exp: Run ttests with
gcc-dg-runtest.
* testsuite/libgomp.oacc-c/c.exp: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@226091 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp |  6 ++
 libgomp/testsuite/libgomp.oacc-c++/c++.exp | 26 ++
 libgomp/testsuite/libgomp.oacc-c/c.exp | 25 +
 3 files changed, 57 insertions(+)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 33e7b3b..b5ace3f 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2015-07-23  Thomas Schwinge  
+
+   * testsuite/libgomp.oacc-c++/c++.exp: Run ttests with
+   gcc-dg-runtest.
+   * testsuite/libgomp.oacc-c/c.exp: Likewise.
+
 2015-07-22  Thomas Schwinge  
 
* testsuite/libgomp.oacc-c-c++-common/lib-1.c: Remove explicit
diff --git libgomp/testsuite/libgomp.oacc-c++/c++.exp 
libgomp/testsuite/libgomp.oacc-c++/c++.exp
index 7309f78..3dbc917 100644
--- libgomp/testsuite/libgomp.oacc-c++/c++.exp
+++ libgomp/testsuite/libgomp.oacc-c++/c++.exp
@@ -1,5 +1,12 @@
 # This whole file adapted from libgomp.c++/c++.exp.
 
+# To avoid testing time exploding too much, limit any torture testing to -O0
+# and -O2 only, under the assumption that between -O0 and -O[something] there
+# is the biggest difference in the overall structure of the generated code.
+set TORTURE_OPTIONS [list \
+{ -O0 } \
+{ -O2 } ]
+
 load_lib libgomp-dg.exp
 load_gcc_lib gcc-dg.exp
 
@@ -61,6 +68,22 @@ if { $lang_test_file_found } {
 set tests [lsort [concat \
  [find $srcdir/$subdir *.C] \
  [find $srcdir/$subdir/../libgomp.oacc-c-c++-common *.c]]]
+# To get better test coverage for device-specific code that is only ever
+# used in offloading configurations, we'd like more thorough (torture)
+# testing for test cases that are dealing with the specifics of
+# gang/worker/vector single/redundant/partitioned modes.  They're selected
+# based on their file names -- not a perfect property to detect such test
+# cases, but should be sufficient.
+set ttests [lsort -unique [concat \
+  [find $srcdir/$subdir/../libgomp.oacc-c-c++-common *gang*.c] \
+  [find $srcdir/$subdir/../libgomp.oacc-c-c++-common *worker*.c] \
+  [find $srcdir/$subdir/../libgomp.oacc-c-c++-common *vec*.c]]]
+# tests := tests - ttests.
+foreach t $ttests {
+   set i [lsearch -exact $tests $t]
+   set tests [lreplace $tests $i $i]
+}
+
 
 if { $blddir != "" } {
 set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}"
@@ -116,6 +139,7 @@ if { $lang_test_file_found } {
set tagopt "$tagopt -DACC_MEM_SHARED=$acc_mem_shared"
 
dg-runtest $tests "$tagopt" "$libstdcxx_includes $DEFAULT_CFLAGS"
+   gcc-dg-runtest $ttests "$tagopt" "$libstdcxx_includes"
 }
 }
 
@@ -124,5 +148,7 @@ if { [info exists HAVE_SET_GXX_UNDER_TEST] } {
 unset GXX_UNDER_TEST
 }
 
+unset TORTURE_OPTIONS
+
 # All done.
 dg-finish
diff --git libgomp/testsuite/libgomp.oacc-c/c.exp 
libgomp/testsuite/libgomp.oacc-c/c.exp
index 60be15d..988dfc6 100644
--- libgomp/testsuite/libgomp.oacc-c/c.exp
+++ libgomp/testsuite/libgomp.oacc-c/c.exp
@@ -11

Re: [AArch64/wwwdoc] Document -fpic support for small memory model

2015-07-23 Thread James Greenhalgh
On Thu, Jul 23, 2015 at 09:38:26AM +0100, Jiong Wang wrote:
> 
> James Greenhalgh writes:
> 
> > On Fri, Jun 26, 2015 at 02:45:39PM +0100, Jiong Wang wrote:
> >> 
> >> Marcus Shawcroft writes:
> >> 
> >> 2015-06-26  Jiong Wang  
> >> 
> >> wwwdocs/
> >>   * htdocs/gcc-6/changes.html (AArch64): Document -fpic for small model.
> >> 
> >
> >> Index: gcc-6/changes.html
> >> ===
> >> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
> >> retrieving revision 1.12
> >> diff -u -r1.12 changes.html
> >> --- gcc-6/changes.html 16 Jun 2015 08:48:02 -  1.12
> >> +++ gcc-6/changes.html 26 Jun 2015 13:30:05 -
> >> @@ -90,6 +90,15 @@
> >> If GCC is unable to detect the host CPU these options have no 
> >> effect.
> >>   
> >> 
> >> +   
> >
> > This should be a new  (list item) in the above  (unordered list),
> > rather than a new .
> 
> thanks, fixed.
> 
> >
> >> + 
> >> +   -fpic is now supported on AArch64 for small memory
> >> +   model. 
> >
> > In invoke.texi we describe -mcmodel as the "small code model" rather
> > than as a "memory model". How about rewording this as so:
> >
> >   -fpic is now supported by the AArch64 target when generating
> >   code for the small code model (-mcmodel=small).
> 
> fixed.
> 
> >
> >> Compared with -fPIC, -fpic
> >> +   will guide GCC to generate more efficient position independent
> >> +   instruction sequences when accessing global objects and
> >> +   28KiB/15KiB global offset table size supported under ILP64/32.
> >
> > I'm not sure this part is needed, the difference between -fpic and -fPIC
> > is already covered by invoke.texi. If you do want to include this text,
> > I might try rewriting it as:
> >
> >   -fpic generates position-independent code which accesses all
> >   constant addresses through a global offset table (GOT). For AArch64, the
> >   size of the GOT is limited to 28KiB under the LP64 SysV ABI, and 15KiB
> >   under the ILP32 SysV ABI.
> 
> As this page documents changes, I combined two of your rewording together as:
> 
>   -fpic is now supported by the AArch64 target when generating
>   code for the small code model (-mcmodel=small).  The size of
>   the GOT is limited to 28KiB under the LP64 SysV ABI, and 15KiB under the
>   ILP32 SysV ABI.

As you haven't introduced it elsewhere on this page,

s/GOT/global offset table (GOT)/

To help those who might not be aware of the meaning of the acronym.

>   
> >
> > As I was looking in invoke.texi, do we want to document the limits on our
> > GOT size there as other targets have?
> 
> > Maybe, I haven't touched invoke.texi in this patch.
> 
> >
> > "These maximums are 8k on the SPARC and 32k on the m68k and RS/6000.
> >  The x86 has no such limit."
> 
> patch updated. Ok for trunk?

OK with the above change.

Thanks,
James

> 
> 2015-07-23  Jiong Wang  
>  
> wwwdocs/
>   * htdocs/gcc-6/changes.html (AArch64): Document -fpic for small model.
> 



RE: [PATCH, MIPS] I6400 scheduling

2015-07-23 Thread Robert Suchanek
Hi,

> PTF_AVOID_BRANCHLIKELY replaced with 0 in all 3 cases.
> AFAICS, there is no need to update the option handling code. The branch
> likely will not be enabled as it is additionally guarded by
> ISA_HAS_BRANCHLIKELY.
> 
> >
> > OK with those changes.
> 
> I'll commit the updated patch once the build completes.

Committed as r226090.

Regards,
Robert


Re: [PATCH][match.pd] PR middle-end/66915 Restrict A - B -> A + (-B) to non-fixed-point types

2015-07-23 Thread Kyrill Tkachov


On 21/07/15 11:11, Richard Biener wrote:

On Tue, 21 Jul 2015, Kyrill Tkachov wrote:


On 21/07/15 08:24, Richard Biener wrote:

On Mon, 20 Jul 2015, Kyrill Tkachov wrote:


Hi all,

This patch fixes the PR in question which is a miscompilation of
gcc.dg/fixed-point/unary.c on arm.
It just restricts the A - B -> A + (-B) transformation when the type is
fixed-point.

This fixes the testcase for me.
Is this the right approach?

Bootstrap and test on arm and x86 running.

Ok if testing is clean?

Ok, but I think the fold-const.c code has the same issue, no:

/* A - B -> A + (-B) if B is easily negatable.  */
if (negate_expr_p (arg1)
    && !TYPE_OVERFLOW_SANITIZED (type)
    && ((FLOAT_TYPE_P (type)
         /* Avoid this transformation if B is a positive REAL_CST.  */
         && (TREE_CODE (arg1) != REAL_CST
             || REAL_VALUE_NEGATIVE (TREE_REAL_CST (arg1))))
        || INTEGRAL_TYPE_P (type)))
  return fold_build2_loc (loc, PLUS_EXPR, type,
                          fold_convert_loc (loc, type, arg0),
                          fold_convert_loc (loc, type,
                                            negate_expr (arg1)));

ah, no.  The above only applies to float-type and integral-types.

Thus yes, your patch is ok.  Can you double-check the other pattern,

/* -(A + B) -> (-B) - A.  */
(simplify
   (negate (plus:c @0 negate_expr_p@1))
   (if (!HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
&& !HONOR_SIGNED_ZEROS (element_mode (type)))
(minus (negate @1) @0)))

?

Thanks, committed with r226028.
I can add (FLOAT_TYPE_P (type) || INTEGRAL_TYPE_P (type)) to the condition.
That would more closely mirror the original logic, right?
That passes x86_64 bootstrap and aarch64 testing looks ok.

Yeah, that works for me, too.


How about this patch then?
Bootstrapped and tested on x86_64 and aarch64.

Thanks,
Kyrill

2015-07-23  Kyrylo Tkachov  

* match.pd (-(A + B) -> (-B) - A): Restrict to floating point
and integral types.
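
For readers wondering why the pattern is already guarded by HONOR_SIGNED_ZEROS, here is a small C++ sketch showing that -(A + B) and (-B) - A can differ in the sign of zero. It does not cover the fixed-point case from the PR, since C++ has no fixed-point types. negate_sum and neg_b_minus_a are names made up for this example, and it assumes default IEEE semantics (no -ffast-math or -fno-signed-zeros).

```cpp
#include <cassert>
#include <cmath>

// Left-hand side of the pattern: -(A + B).
template <typename T>
T negate_sum(T a, T b) {
  return -(a + b);
}

// Right-hand side of the pattern: (-B) - A.
template <typename T>
T neg_b_minus_a(T a, T b) {
  return (-b) - a;
}

// With A = +0.0 and B = -0.0:
//   -(A + B)  = -(+0.0)        = -0.0
//   (-B) - A  = (+0.0) - (+0.0) = +0.0
// The two results compare equal but carry different signs.
bool signs_differ_for_zeros() {
  double lhs = negate_sum(0.0, -0.0);
  double rhs = neg_b_minus_a(0.0, -0.0);
  return std::signbit(lhs) != std::signbit(rhs);
}
```

For ordinary integer operands that do not overflow, both forms give the same value.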




Thanks,
Richard.



commit d514c81a7965fd24b9d8c294b12179b2369c8aa4
Author: Kyrylo Tkachov 
Date:   Tue Jul 21 10:18:31 2015 +0100

[match.pd] Restrict -(A + B) -> (-B) - A to integral or float types

diff --git a/gcc/match.pd b/gcc/match.pd
index 3d7b32e..29367f2 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -515,7 +515,8 @@ along with GCC; see the file COPYING3.  If not see
 /* -(A + B) -> (-B) - A.  */
 (simplify
  (negate (plus:c @0 negate_expr_p@1))
- (if (!HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
+ (if ((FLOAT_TYPE_P (type) || INTEGRAL_TYPE_P (type))
+  && !HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
   && !HONOR_SIGNED_ZEROS (element_mode (type)))
   (minus (negate @1) @0)))
 


Re: [PATCH] Fix PR66952

2015-07-23 Thread Andreas Schwab
Richard Biener  writes:

> Index: gcc/testsuite/gcc.dg/torture/pr66952.c
> ===
> --- gcc/testsuite/gcc.dg/torture/pr66952.c(revision 0)
> +++ gcc/testsuite/gcc.dg/torture/pr66952.c(working copy)
> @@ -0,0 +1,28 @@
> +/* { dg-do run } */
> +
> +int a = 128, b;
> +
> +static int
> +fn1 (char p1, int p2)
> +{
> +  return p1 < 0 || p1 > 1 >> p2 ? 0 : p1 << 1;

This is broken, p1 can never be < 0.

FAIL: gcc.dg/torture/pr66952.c   -O0  execution test

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH][combine][committed] Use std::swap in try_combine

2015-07-23 Thread Kyrill Tkachov

Hi all,

This obvious patch replaces yet another instance of manual swapping by 
std::swap.

Bootstrapped and tested on aarch64 and x86_64.
Applied to trunk with r226094.

Thanks,
Kyrill

2015-07-23  Kyrylo Tkachov  

* combine.c (try_combine): Use std::swap instead of manually
swapping.
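
For reference, the three conditional swaps form a small sorting network. The sketch below uses plain ints instead of insns and LUIDs and leaves out the null checks on i0 and i1 that the real code has; order_three is a name chosen for this example.

```cpp
#include <cassert>
#include <utility>

// Put three values in ascending order using the same three compare-and-swap
// steps, in the same order, as the patched code: (0,2), (0,1), (1,2).
void order_three(int &i0, int &i1, int &i2) {
  if (i0 > i2)
    std::swap(i0, i2);  // now i0 <= i2
  if (i0 > i1)
    std::swap(i0, i1);  // now i0 is the smallest of the three
  if (i1 > i2)
    std::swap(i1, i2);  // order the remaining two
}
```

Unlike the comma-expression form, std::swap needs no separate temporary at the call site.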

commit d0d5686b87bf842b46ba460a943ef7f826656968
Author: Kyrylo Tkachov 
Date:   Tue Jul 21 15:20:18 2015 +0100

[combine][obvious] Use std::swap in try_combine

diff --git a/gcc/combine.c b/gcc/combine.c
index 2f806ab..e47cbc4 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2730,11 +2730,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   /* If multiple insns feed into one of I2 or I3, they can be in any
  order.  To simplify the code below, reorder them in sequence.  */
   if (i0 && DF_INSN_LUID (i0) > DF_INSN_LUID (i2))
-temp_insn = i2, i2 = i0, i0 = temp_insn;
+std::swap (i0, i2);
   if (i0 && DF_INSN_LUID (i0) > DF_INSN_LUID (i1))
-temp_insn = i1, i1 = i0, i0 = temp_insn;
+std::swap (i0, i1);
   if (i1 && DF_INSN_LUID (i1) > DF_INSN_LUID (i2))
-temp_insn = i1, i1 = i2, i2 = temp_insn;
+std::swap (i1, i2);
 
   added_links_insn = 0;
 


Re: [PATCH][AArch64][9/14] Implement TARGET_CAN_INLINE_P

2015-07-23 Thread Kyrill Tkachov


On 21/07/15 17:07, James Greenhalgh wrote:

On Thu, Jul 16, 2015 at 04:21:02PM +0100, Kyrill Tkachov wrote:

Hi all,

This patch implements the target-specific inlining rules.
The basic philosophy is that we want to definitely reject inlining if the
callee's architecture is not a subset, feature-wise, of the caller's.

Beyond that, we want to allow inlining if the callee is always_inline.
If it's not, we reject inlining if the TargetSave options don't match up
in a way that's described in the comments in the patch.

Generally, we try to allow as much inlining as possible for the benefit of LTO.
However, if the architectural features of the callee are not a subset of the
features of the caller, then we must reject inlining. For example, inlining a
function with 'simd' into a function without 'simd' is not allowed.

Also, inlining a non-strict-align function into a strict-align function is not
allowed. These two restrictions apply even when the callee is tagged with
always_inline because they can affect the correctness of the program.

Beyond that, we reject inlining only if the user has explicitly specified
attributes/options for both the caller and the callee and they don't match up.

An exception to that are the tuning CPUs. We want to allow inlining even when
the tuning CPUs don't match.
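
The subset rule comes down to one bitmask test. The following C++ sketch uses made-up feature bits (FEATURE_FP, FEATURE_SIMD, FEATURE_CRC) rather than the real AArch64 flag values; callee_isa_is_subset is likewise a name chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits, for illustration only.
constexpr std::uint64_t FEATURE_FP = 1u << 0;
constexpr std::uint64_t FEATURE_SIMD = 1u << 1;
constexpr std::uint64_t FEATURE_CRC = 1u << 2;

// True when every feature the callee relies on is also enabled in the
// caller, i.e. the callee's flags are a subset of the caller's.
bool callee_isa_is_subset(std::uint64_t caller_flags,
                          std::uint64_t callee_flags) {
  return (caller_flags & callee_flags) == callee_flags;
}
```

With these bits, a callee needing FEATURE_SIMD cannot be inlined into a caller that only has FEATURE_FP, while the reverse direction is fine.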

Bootstrapped and tested on aarch64.

Ok for trunk?

Comments below.


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0a6ed70..34cd986 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8486,6 +8486,113 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, 
tree args, int)
return ret;
  }
  
+/* Helper for aarch64_can_inline_p.  In the case where CALLER and CALLEE are

+   tri-bool options (yes, no, don't care) and the default value is
+   DEF, determine whether to reject inlining.  */
+
+static bool
+aarch64_reject_inlining (int caller, int callee, int dont_care, int def)
+{
+  /* If both caller and callee care about the value then reject inlining
+ if they don't match up.  */
+  if (caller != dont_care && callee != dont_care && caller != callee)
+return true;
+
+  /* If caller doesn't care then make sure that the default agrees
+ with the callee.  */
+  if (caller == dont_care && callee != dont_care && callee != def)
+return true;

I don't like boolean functions which return the negative of how they
will be used, so how about flipping the sense of this function to be
positive and cleaning up the logical cases to be more explicit:

   static bool
   aarch64_tribool_values_ok_for_inlining_p (int caller, int callee,
int dont_care, int def)
   {
 /* If the callee doesn't care, always allow inlining.  */
 if (callee == dont_care)
   return true;

 /* If the caller doesn't care, always allow inlining.  */
 if (caller == dont_care)
   return true;

 /* Otherwise, allow inlining if either the callee and caller values
    agree, or if the callee is using the default value.  */
 return (callee == caller || callee == def);
   }


+/* Implement TARGET_CAN_INLINE_P.  Decide whether it is valid
+   to inline CALLE into CALLER based on target-specific info.

s/CALLE/CALLEE/


+   Make sure that the caller and callee have compatible architectural
+   features.  Then go through the other possible target attributes
+   and.  Try not to reject always_inline callees unless they are

Typo: and... what?


+   incompatible architecturally.  */
+
+static bool
+aarch64_can_inline_p (tree caller, tree callee)
+{
+  bool ret = false;
+  tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
+  tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
+
+  /* If callee has no option attributes, then it is ok to inline.  */
+  if (!callee_tree)
+ret = true;

Just return true, drop the else and clean up the control flow, particularly
as the rest of the code below is going to fall through to each condition
in turn.


+  else
+{
+  struct cl_target_option *caller_opts
+   = TREE_TARGET_OPTION (caller_tree ? caller_tree
+  : target_option_default_node);
+  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+
+
+  /* Callee's ISA flags should be a subset of the caller's.  */
+  if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags)
+ == callee_opts->x_aarch64_isa_flags)
+   ret = true;

This is the only place that ret can become true. Why not switch the sense
of the comparison (!= callee_opts...) and simply return false here.

Then each of the cases below can just return false as they go and the
fallthrough case can return true.

That seems much more readable to me!
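
A sketch of the control flow being suggested, written as stand-alone C++ rather than against the GCC types. Options, its fields, the Tribool values and the choice of TB_NO as the default are all assumptions made for this example; only the early-return structure and the tri-bool rule come from the review comments above.

```cpp
#include <cassert>

// Illustrative tri-state values; not the encoding used in the patch.
enum Tribool { TB_NO = 0, TB_YES = 1, TB_DONT_CARE = 2 };

// Hypothetical stand-in for the per-function target options.
struct Options {
  unsigned isa_flags;
  int tribool_option;
};

// Positive-sense helper: true when the two values permit inlining.
bool tribool_ok_for_inlining(int caller, int callee, int dont_care, int def) {
  if (callee == dont_care)
    return true;
  if (caller == dont_care)
    return true;
  return callee == caller || callee == def;
}

// Each failing condition returns false at once; falling off the end of the
// checks means inlining is allowed.
bool can_inline(const Options &caller, const Options *callee) {
  // A callee without option attributes can always be inlined.
  if (!callee)
    return true;

  // The callee's ISA flags must be a subset of the caller's.
  if ((caller.isa_flags & callee->isa_flags) != callee->isa_flags)
    return false;

  // Assumed default of TB_NO for this example option.
  if (!tribool_ok_for_inlining(caller.tribool_option, callee->tribool_option,
                               TB_DONT_CARE, TB_NO))
    return false;

  return true;
}
```

No ret variable is needed, and each rule can be read on its own.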


Thanks, I've implemented the suggestions.
Re-bootstrapped and tested on aarch64.
How's this?

Thanks,
Kyrill

2015-07-23  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_tribools_ok_for_inl

Re: [PATCH] Document ftrapv/fwrapv interaction

2015-07-23 Thread Richard Biener
On Wed, Jul 22, 2015 at 5:11 PM, Tom de Vries  wrote:
> [ Re: [RFC, PR66873] Use graphite for parloops ]
> On 22/07/15 13:01, Richard Biener wrote:
>>
>> why only scalar floats?  Please use FLOAT_TYPE_P.
>>
>> +  if (INTEGRAL_TYPE_P (type))
>> +return (!TYPE_OVERFLOW_TRAPS (type)
>> +   && TYPE_OVERFLOW_WRAPS (type));
>>
>> it cannot both wrap and trap thus TYPE_OVERFLOW_WRAPS is enough.
>
>
> Hmm, indeed, when specifying both, one is quietly ignored. The documentation
> also doesn't mention this.
>
> Attached untested patch mentions this ftrapv/fwrapv interaction in the docs.
>
> OK for trunk, if bootstrap succeeds?

Ok.

Richard.

> Thanks,
> - Tom
>
>


Re: [PATCH] Document ftrapv/fwrapv interaction

2015-07-23 Thread Richard Biener
On Thu, Jul 23, 2015 at 12:19 PM, Richard Biener
 wrote:
> On Wed, Jul 22, 2015 at 5:11 PM, Tom de Vries  wrote:
>> [ Re: [RFC, PR66873] Use graphite for parloops ]
>> On 22/07/15 13:01, Richard Biener wrote:
>>>
>>> why only scalar floats?  Please use FLOAT_TYPE_P.
>>>
>>> +  if (INTEGRAL_TYPE_P (type))
>>> +return (!TYPE_OVERFLOW_TRAPS (type)
>>> +   && TYPE_OVERFLOW_WRAPS (type));
>>>
>>> it cannot both wrap and trap thus TYPE_OVERFLOW_WRAPS is enough.
>>
>>
>> Hmm, indeed, when specifying both, one is quietly ignored. The documentation
>> also doesn't mention this.
>>
>> Attached untested patch mentions this ftrapv/fwrapv interaction in the docs.
>>
>> OK for trunk, if bootstrap succeeds?
>
> Ok.

Btw, for consistency we probably should add

-fsigned-overflow=traps|wraps|undefined

and make -ftrapv and -fwrapv alias to the respective behavior.

Oh, and -fstrict-overflow is another beast with rather unspecified
behavior... while its positive form could be aliased to
-fsigned-overflow=undefined its negative form is _not_ equal
to -fwrapv - it's a third state that says overflow is neither known
to wrap nor undefined (thus it allows even fewer optimizations).
Note that the behavior of -fno-strict-overflow isn't documented
(only its positive form is).  This means that at -O[10] where
-fno-strict-overflow is in effect we are in "undefined" territory.

Maybe it's time to fix that ...

Richard.
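
To make the wrap/trap distinction concrete, here is a small user-level sketch, not compiler code. The helper names `add_wraps` and `add_overflows` are made up for illustration. `__builtin_add_overflow` is a GCC/Clang builtin, and the conversion of an out-of-range unsigned value back to int is implementation-defined in C (GCC documents it as reduction modulo 2^N).

```c
#include <limits.h>
#include <stdbool.h>

/* Wrapping addition: do the arithmetic in unsigned, where overflow is
   defined to wrap, then convert back.  The conversion back to int is
   implementation-defined; GCC defines it as modulo 2^N.  */
int
add_wraps (int a, int b)
{
  return (int) ((unsigned int) a + (unsigned int) b);
}

/* Checked addition: returns true if the mathematical result does not
   fit in int.  A trapping implementation would abort at that point
   instead of returning.  */
bool
add_overflows (int a, int b, int *result)
{
  return __builtin_add_overflow (a, b, result);
}
```

A plain `a + b` on ints is the third case: the result on overflow is undefined, which is what the optimizers are allowed to exploit.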

> Richard.
>
>> Thanks,
>> - Tom
>>
>>


Re: [PATCH] Fix PR66952

2015-07-23 Thread Kyrill Tkachov


On 23/07/15 10:02, Andreas Schwab wrote:

Richard Biener  writes:


Index: gcc/testsuite/gcc.dg/torture/pr66952.c
===
--- gcc/testsuite/gcc.dg/torture/pr66952.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr66952.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+int a = 128, b;
+
+static int
+fn1 (char p1, int p2)
+{
+  return p1 < 0 || p1 > 1 >> p2 ? 0 : p1 << 1;

This is broken, p1 can never be < 0.


Just mark the chars as signed chars?
The testcase passes for me with that change.

Kyrill



FAIL: gcc.dg/torture/pr66952.c   -O0  execution test

Andreas.





Re: [PATCH] Don't allow unsafe reductions in graphite

2015-07-23 Thread Richard Biener
On Wed, Jul 22, 2015 at 6:00 PM, Tom de Vries  wrote:
> [ was: Re: [RFC, PR66873] Use graphite for parloops ]
>
> On 22/07/15 13:02, Richard Biener wrote:
>>
>> On Wed, Jul 22, 2015 at 1:01 PM, Richard Biener
>>   wrote:
>>>
>>> >On Tue, Jul 21, 2015 at 8:42 PM, Sebastian Pop  wrote:

 >>Tom de Vries wrote:
>
> >>>Fix reduction safety checks
> >>>
> >>>   * graphite-sese-to-poly.c (is_reduction_operation_p): Limit
> >>>   flag_associative_math to SCALAR_FLOAT_TYPE_P.  Honour
> >>>   TYPE_OVERFLOW_TRAPS and TYPE_OVERFLOW_WRAPS for
> >>> INTEGRAL_TYPE_P.
> >>>   Only allow wrapping fixed-point otherwise.
> >>>   (build_poly_scop): Always call
> >>>   rewrite_commutative_reductions_out_of_ssa.

 >>
 >>The changes to graphite look good to me.
>>>
>>> >
>>> >+  if (SCALAR_FLOAT_TYPE_P (type))
>>> >+return flag_associative_math;
>>> >+
>>> >
>>> >why only scalar floats?
>
>
> Copied from the conditions in vect_is_simple_reduction_1.
>
>>> >Please use FLOAT_TYPE_P.
>
> Done.
>
>>> >
>>> >+  if (INTEGRAL_TYPE_P (type))
>>> >+return (!TYPE_OVERFLOW_TRAPS (type)
>>> >+   && TYPE_OVERFLOW_WRAPS (type));
>>> >
>>> >it cannot both wrap and trap thus TYPE_OVERFLOW_WRAPS is enough.
>>> >
>
>
> Done.
>
>>> >I'm sure you'll disable quite some parallelization this way... (the
>>> >routine is modeled after
>>> >the vectorizers IIRC, so it would be affected as well).  Yeah - I see
>>> >you modify autopar
>>> >testcases.
>
>
> I now split up the patch, this bit only relates to graphite, so no autopar
> testcases are affected.
>
>>> >Please instead XFAIL the existing ones and add variants
>>> >with unsigned
>>> >reductions.  Adding -fwrapv isn't a good solution either.
>
>
> Done.
>
>>> >
>>> >Can you think of a testcase that breaks btw?
>>> >
>
>
> If you mean a testcase that fails to execute properly without the fix, and
> executes correctly with the fix, then no.  The problem this patch is trying
> to fix, is that we assume wrapping overflow without fwrapv. In order to run
> into a runtime failure, we need a target that does not do wrapping overflow
> without fwrapv.
>
>>> >The "proper" solution (see other passes) is to rewrite the reduction
>>> >to a wrapping
>>> >one (cast to unsigned for the reduction op).
>>> >
>
>
> Right.
>
>>> >+  return (FIXED_POINT_TYPE_P (type)
>>> >+ && FIXED_POINT_TYPE_OVERFLOW_WRAPS_P (type));
>>> >
>>> >why?
>
>
> Again, copied from the conditions in vect_is_simple_reduction_1.
>
>>> >  Simply return false here instead?
>
> Done.
>
>
> [ Btw, looking at associative_tree_code, I realized that the
>   overflow checking is only necessary for PLUS_EXPR and MULT_EXPR:
> ...
>   switch (code)
> {
> case BIT_IOR_EXPR:
> case BIT_AND_EXPR:
> case BIT_XOR_EXPR:
> case PLUS_EXPR:
> case MULT_EXPR:
> case MIN_EXPR:
> case MAX_EXPR:
>   return true;
> ...
>
> The other operators cannot overflow to begin with. My guess is that it's
> better to leave this for a trunk-only follow-up patch.
> ]
>
> Currently bootstrapping and reg-testing on x86_64.
>
> OK for trunk?
>
> OK 5 and 4.9 release branches?

Ok if Sebastian is fine with it.

Richard.
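
As a source-level illustration of the "cast to unsigned for the reduction op" idea mentioned in the quoted review, the following sketch accumulates a signed sum in unsigned arithmetic so that reassociating the additions cannot introduce undefined overflow. The function name `sum_wrapping` is hypothetical, and this is not what graphite or the vectorizer emit; it only shows the principle. The final conversion back to int is implementation-defined in C (modulo 2^N with GCC).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Sum COUNT ints using unsigned accumulation.  Unsigned addition wraps,
   so the additions may be reordered or split across threads without
   changing the final value.  */
int
sum_wrapping (const int *values, size_t count)
{
  assert (values != NULL || count == 0);

  unsigned int acc = 0;
  for (size_t i = 0; i < count; i++)
    acc += (unsigned int) values[i];

  /* Implementation-defined when acc > INT_MAX; GCC wraps modulo 2^N.  */
  return (int) acc;
}
```

Only PLUS_EXPR and MULT_EXPR style reductions need this treatment; bitwise and min/max reductions cannot overflow in the first place.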

> Thanks,
> - Tom
>


Re: [PATH PR66926,PR66951} simple fix for ICE.

2015-07-23 Thread Richard Biener
On Wed, Jul 22, 2015 at 6:09 PM, Yuri Rumyantsev  wrote:
> Hi All,
>
> Here is simple fix which fixes PR66926 and PR66951 - fix condition for
> renaming virtual operands to determine that statement is outside of
> loop.
>
> Bootstrap and regression testing did not show any new failures.
>
> Is it OK for trunk?

Ok.

Richard.

> gcc/ChangeLog
> 2015-07-22  Yuri Rumyantsev  
>
> PR tree-optimization/66926,66951
> * tree-vect-loop-manip.c (slpeel_tree_peel_loop_to_edge): Delete
> INNER_LOOP and fix up condition for renaming virtual operands.
>
>
> gcc/testsuite/ChangeLog
> * gcc.dg/vect/pr66951.c: New test.


Re: [PATCH] Check TYPE_OVERFLOW_WRAPS for parloops reductions

2015-07-23 Thread Richard Biener
On Wed, Jul 22, 2015 at 6:13 PM, Tom de Vries  wrote:
> [ was: Re: [RFC, PR66873] Use graphite for parloops ]
>
> On 22/07/15 13:02, Richard Biener wrote:
>>
>> On Wed, Jul 22, 2015 at 1:01 PM, Richard Biener
>>  wrote:
>>>
>>> On Tue, Jul 21, 2015 at 8:42 PM, Sebastian Pop  wrote:

 Tom de Vries wrote:
>
> Fix reduction safety checks
>
>
>>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>>> index 9145dbf..e014be2 100644
>>> --- a/gcc/tree-vect-loop.c
>>> +++ b/gcc/tree-vect-loop.c
>>> @@ -2613,16 +2613,30 @@ vect_is_simple_reduction_1 (loop_vec_info
>>> loop_info, gimple phi,
>>>  "reduction: unsafe fp math optimization: ");
>>> return NULL;
>>>   }
>>> -  else if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_TRAPS (type)
>>> -  && check_reduction)
>>> +  else if (INTEGRAL_TYPE_P (type) && check_reduction)
>>>   {
>>> ...
>>>
>>> You didn't need to adjust any testcases?
>>>  That's probably because the
>>> checking above is
>>> not always executed (see PR66623 for a related testcase).  The code
>>> needs refactoring.
>>> And we need a way-out, that is, we do _not_ want to not vectorize
>>> signed reductions.
>>> So you need to fix code generation instead.
>>
>>
>> Btw, for the vectorizer the current "trick" is that nobody takes advantage
>> of overflow undefinedness for vector types.
>>
>
> AFAIU, you're saying here that there's no current bug related to assuming
> wrapping overflow in the vectorizer?

Well - TYPE_OVERFLOW_UNDEFINED will happily return true for
vector integer types but nothing I know will exploit that (bogus) knowledge.

And I'd rather change the reporting of TYPE_OVERFLOW_UNDEFINED here
as the C standard doesn't have vector types and the middle-end cannot
distinguish
user-written code (via intrinsics now using the generic vector GCC
language extension)
from compiler-generated code.

Similar for _Complex integer types (also a GCC extension?).

> I've updated the patch accordingly, so we only bother about
> TYPE_OVERFLOW_WRAPS for parloops reductions.
>
> Currently bootstrapping and reg-testing on x86_64.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> - Tom
>


[PATCHES, PING] Enhance standard DWARF for Ada

2015-07-23 Thread Pierre-Marie de Rodat

On 07/16/2015 10:34 AM, Pierre-Marie de Rodat wrote:

This patch series aims at enhancing GCC to emit standard DWARF in place
of the current GNAT encodings (non-standard DWARF) for a set of "basic"
types: dynamic arrays, variable-length records, variant parts, etc.


Ping for the patch series: 
. Thanks!


--
Pierre-Marie de Rodat


Re: [ARM] Correct spelling of references to ARMv6KZ

2015-07-23 Thread Matthew Wahab

On 24/06/15 10:25, Matthew Wahab wrote:

Ping. Attached updated patch which also actually removes "armv6zk" from
doc/invoke.texi.

Also, retested:
- arm-none-linux-gnueabihf native bootstrap and cross-compiled make check.
- arm-none-eabi: cross-compiled make check.


gcc/
2015-07-23  Matthew Wahab  

* config/arm/arm-arches.def: Add "armv6kz". Replace 6ZK with 6KZ
and FL_FOR_ARCH6ZK with FL_FOR_ARCH6KZ.
* config/arm/arm-c.c (arm_cpu_builtins): Emit "__ARM_ARCH_6ZK__"
for armv6kz targets.
* config/arm/arm-cores.def: Replace 6ZK with 6KZ.
* config/arm/arm-protos.h (FL_ARCH6KZ): New.
(FL_FOR_ARCH6ZK): Remove.
(FL_FOR_ARCH6KZ): New.
(arm_arch6zk): New declaration.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.c (arm_arch6kz): New.
(arm_option_override): Set arm_arch6kz.
* config/arm/arm.h (BASE_ARCH_6ZK): Rename to BASE_ARCH_6KZ.
* config/arm/driver-arm.c: Add "armv6kz".
* doc/invoke.texi: Replace "armv6zk" with "armv6kz".



Hello,

GCC supports ARM architecture ARMv6KZ but refers to it as ARMv6ZK. This is made
visible by the command line option -march=armv6zk and by the predefined macro
__ARM_ARCH_6ZK__.

This patch corrects the spelling internally and adds -march=armv6kz. To preserve
existing behaviour, -march=armv6zk is kept as an alias of -march=armv6kz and
both __ARM_ARCH_6KZ__ and __ARM_ARCH_6ZK__ macros are defined for the
architecture.
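
For user code that has to build with compilers from before and after this change, a check along the following lines would accept either spelling of the macro. This is only a sketch: `TARGET_IS_ARMV6KZ` and `target_is_armv6kz` are hypothetical names, not something the patch defines.

```c
/* Evaluates to 1 when compiling for ARMv6KZ under either spelling of
   the predefined macro, and to 0 otherwise.  */
#if defined (__ARM_ARCH_6KZ__) || defined (__ARM_ARCH_6ZK__)
# define TARGET_IS_ARMV6KZ 1
#else
# define TARGET_IS_ARMV6KZ 0
#endif

/* Run-time view of the compile-time result, for illustration.  */
int
target_is_armv6kz (void)
{
  return TARGET_IS_ARMV6KZ;
}
```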

Use of -march=armv6kz will need to wait for binutils to be updated; a patch has
been submitted (https://sourceware.org/ml/binutils/2015-06/msg00236.html). Use
of the existing spelling, -march=armv6zk, still works with current binutils.

Tested arm-none-linux-gnueabihf with check-gcc.

Ok for trunk?
Matthew

gcc/
2015-06-24  Matthew Wahab

* config/arm/arm-arches.def: Add "armv6kz". Replace 6ZK with 6KZ
and FL_FOR_ARCH6ZK with FL_FOR_ARCH6KZ.
* config/arm/arm-c.c (arm_cpu_builtins): Emit "__ARM_ARCH_6ZK__"
for armv6kz targets.
* config/arm/arm-cores.def: Replace 6ZK with 6KZ.
* config/arm/arm-protos.h (FL_ARCH6KZ): New.
(FL_FOR_ARCH6ZK): Remove.
(FL_FOR_ARCH6KZ): New.
(arm_arch6zk): New declaration.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.c (arm_arch6kz): New.
(arm_option_override): Set arm_arch6kz.
* config/arm/arm.h (BASE_ARCH_6ZK): Rename to BASE_ARCH_6KZ.
* config/arm/driver-arm.c: Add "armv6kz".
  * doc/invoke.texi: Replace "armv6zk" with "armv6kz" and
"armv6zkt2" with "armv6kzt2".



diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index 840c1ff..3dafaa5 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -44,7 +44,8 @@ ARM_ARCH("armv6",   arm1136js,  6,   FL_CO_PROC | FL_FOR_ARCH6)
 ARM_ARCH("armv6j",  arm1136js,  6J,  FL_CO_PROC | FL_FOR_ARCH6J)
 ARM_ARCH("armv6k",  mpcore,	6K,  FL_CO_PROC | FL_FOR_ARCH6K)
 ARM_ARCH("armv6z",  arm1176jzs, 6Z,  FL_CO_PROC | FL_FOR_ARCH6Z)
-ARM_ARCH("armv6zk", arm1176jzs, 6ZK, FL_CO_PROC | FL_FOR_ARCH6ZK)
+ARM_ARCH("armv6kz", arm1176jzs, 6KZ, FL_CO_PROC | FL_FOR_ARCH6KZ)
+ARM_ARCH("armv6zk", arm1176jzs, 6KZ, FL_CO_PROC | FL_FOR_ARCH6KZ)
 ARM_ARCH("armv6t2", arm1156t2s, 6T2, FL_CO_PROC | FL_FOR_ARCH6T2)
 ARM_ARCH("armv6-m", cortexm1,	6M,			  FL_FOR_ARCH6M)
 ARM_ARCH("armv6s-m", cortexm1,	6M,			  FL_FOR_ARCH6M)
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 297995b..9bf3973 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -167,6 +167,11 @@ arm_cpu_builtins (struct cpp_reader* pfile, int flags)
 }
   if (arm_arch_iwmmxt2)
 builtin_define ("__IWMMXT2__");
+  /* ARMv6KZ was originally identified as the misspelled __ARM_ARCH_6ZK__.  To
+ preserve the existing behaviour, the misspelled feature macro must still be
+ defined.  */
+  if (arm_arch6kz)
+builtin_define ("__ARM_ARCH_6ZK__");
   if (TARGET_AAPCS_BASED)
 {
   if (arm_pcs_default == ARM_PCS_AAPCS_VFP)
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 103c314..9d47fcf 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -125,8 +125,8 @@ ARM_CORE("arm1026ej-s",	arm1026ejs, arm1026ejs,	5TEJ, FL_LDSCHED, 9e)
 /* V6 Architecture Processors */
 ARM_CORE("arm1136j-s",		arm1136js, arm1136js,		6J,  FL_LDSCHED, 9e)
 ARM_CORE("arm1136jf-s",		arm1136jfs, arm1136jfs,		6J,  FL_LDSCHED | FL_VFPV2, 9e)
-ARM_CORE("arm1176jz-s",		arm1176jzs, arm1176jzs,		6ZK, FL_LDSCHED, 9e)
-ARM_CORE("arm1176jzf-s",	arm1176jzfs, arm1176jzfs,	6ZK, FL_LDSCHED | FL_VFPV2, 9e)
+ARM_CORE("arm1176jz-s",		arm1176jzs, arm1176jzs,		6KZ, FL_LDSCHED, 9e)
+ARM_CORE("arm1176jzf-s",	arm1176jzfs, arm1176jzfs,	6KZ, FL_LDSCHED | FL_VFPV2, 9e)
 ARM_CORE("mpcorenovfp",		mpcorenovf

Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread Richard Biener
On Wed, Jul 22, 2015 at 7:33 PM, Alexandre Oliva  wrote:
> On Jul 21, 2015, Richard Biener  wrote:
>
>> On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva  wrote:
>>> + if (cfun->gimple_df)
>
>> If the cfun->gimple_df check is to decide whether this is a call or a 
>> function
>> then no, this can't work reliably.  What is this test for else?
>
> It turns out it's not call or function, as I thought at first, but
> gimplifying or expanding the function.  split_complex_args is not used
> for calls.  So the above might actually work (minus the misleading
> comments I wrote), and I think it's cleaner than adding a bool
> expanding_p arg to split_complex_args and
> assign_parms_augmented_arg_list, called from gimplify_parameters (during
> gimplification of a function) and assign_parms (during its expansion).
> Do you agree, or would you prefer the explicit argument?

Hmm, ok.  Does using

   if (currently_expanding_to_rtl)

work?  I think it's slightly more descriptive.

Ok with that change.

Thanks,
Richard.

> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH] Remove unused get_current_pass_name

2015-07-23 Thread Richard Biener
On Wed, Jul 22, 2015 at 8:37 PM, Bernd Edlinger
 wrote:
> Hi,
>
>
> I noticed recently that tree-pass.h contains a declaration of 
> get_current_pass_name,
> but this function is not defined, and where ever we need the current pass 
> name,
> we simply use current_pass->name.  So I would like to remove that declaration.
>
>
> Boot-strapped and regression-tested on x86-64-linux-gnu.
> OK for trunk?

Ok.

Richard.

>
> Thanks
> Bernd.
>


Re: [ARM] Correct spelling of references to ARMv6KZ

2015-07-23 Thread Kyrill Tkachov

Hi Matthew,

On 23/07/15 11:54, Matthew Wahab wrote:

On 24/06/15 10:25, Matthew Wahab wrote:

Ping. Attached updated patch which also actually removes "armv6zk" from
doc/invoke.texi.

Also, retested:
- arm-none-linux-gnueabihf native bootstrap and cross-compiled make check.
- arm-none-eabi: cross-compiled make check.


gcc/
2015-07-23  Matthew Wahab  

* config/arm/arm-arches.def: Add "armv6kz". Replace 6ZK with 6KZ
and FL_FOR_ARCH6ZK with FL_FOR_ARCH6KZ.
* config/arm/arm-c.c (arm_cpu_builtins): Emit "__ARM_ARCH_6ZK__"
for armv6kz targets.
* config/arm/arm-cores.def: Replace 6ZK with 6KZ.
* config/arm/arm-protos.h (FL_ARCH6KZ): New.
(FL_FOR_ARCH6ZK): Remove.
(FL_FOR_ARCH6KZ): New.
(arm_arch6zk): New declaration.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.c (arm_arch6kz): New.
(arm_option_override): Set arm_arch6kz.
* config/arm/arm.h (BASE_ARCH_6ZK): Rename to BASE_ARCH_6KZ.
* config/arm/driver-arm.c: Add "armv6kz".
  * doc/invoke.texi: Replace "armv6zk" with "armv6kz".



Hello,

GCC supports ARM architecture ARMv6KZ but refers to it as ARMv6ZK. This is made
visible by the command line option -march=armv6zk and by the predefined macro
__ARM_ARCH_6ZK__.

This patch corrects the spelling internally and adds -march=armv6kz. To preserve
existing behaviour, -march=armv6zk is kept as an alias of -march=armv6kz and
both __ARM_ARCH_6KZ__ and __ARM_ARCH_6ZK__ macros are defined for the
architecture.

Use of -march=armv6kz will need to wait for binutils to be updated; a patch has
been submitted (https://sourceware.org/ml/binutils/2015-06/msg00236.html). Use
of the existing spelling, -march=armv6zk, still works with current binutils.

Tested arm-none-linux-gnueabihf with check-gcc.

Ok for trunk?
Matthew

gcc/
2015-06-24  Matthew Wahab

* config/arm/arm-arches.def: Add "armv6kz". Replace 6ZK with 6KZ
and FL_FOR_ARCH6ZK with FL_FOR_ARCH6KZ.
* config/arm/arm-c.c (arm_cpu_builtins): Emit "__ARM_ARCH_6ZK__"
for armv6kz targets.
* config/arm/arm-cores.def: Replace 6ZK with 6KZ.
* config/arm/arm-protos.h (FL_ARCH6KZ): New.
(FL_FOR_ARCH6ZK): Remove.
(FL_FOR_ARCH6KZ): New.
(arm_arch6zk): New declaration.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.c (arm_arch6kz): New.
(arm_option_override): Set arm_arch6kz.
* config/arm/arm.h (BASE_ARCH_6ZK): Rename to BASE_ARCH_6KZ.
* config/arm/driver-arm.c: Add "armv6kz".
   * doc/invoke.texi: Replace "armv6zk" with "armv6kz" and
"armv6zkt2" with "armv6kzt2".



diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index c715bb7..7873606 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -35,6 +35,7 @@ static struct vendor_cpu arm_cpu_table[] = {
 {"0xb02", "armv6k", "mpcore"},
 {"0xb36", "armv6j", "arm1136j-s"},
 {"0xb56", "armv6t2", "arm1156t2-s"},
+{"0xb76", "armv6kz", "arm1176jz-s"},
 {"0xb76", "armv6zk", "arm1176jz-s"},
 {"0xc05", "armv7-a", "cortex-a5"},
 {"0xc07", "armv7ve", "cortex-a7"},

This table is scanned from beginning to end, checking for the first field.
You introduce a duplicate here, so the second form will never be reached.
I'd suggest removing the wrong spelling from here, but the re-written march
string will be passed to the assembler, so if the assembler is old and doesn't
support the correct spelling we'll get errors. So it seems like in order to
preserve backwards compatibility we don't want to put the correctly spelled
entry here :(
But definitely add a comment here mentioning the deliberate oversight.

Kyrill
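
The shadowing effect can be reproduced with a small stand-alone lookup. This is a sketch only: `cpu_entry`, `cpu_table_sketch` and `lookup_arch` are hypothetical names, and the loop is assumed to behave like the driver's first-match scan described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the three string fields in each table row.  */
struct cpu_entry
{
  const char *part_no;
  const char *arch_name;
  const char *cpu_name;
};

static const struct cpu_entry cpu_table_sketch[] = {
  {"0xb56", "armv6t2", "arm1156t2-s"},
  {"0xb76", "armv6kz", "arm1176jz-s"},
  {"0xb76", "armv6zk", "arm1176jz-s"},	/* Shadowed: never returned.  */
  {NULL, NULL, NULL}
};

/* Return the architecture name of the first row whose part number
   matches PART_NO, or NULL if there is none.  */
const char *
lookup_arch (const char *part_no)
{
  for (const struct cpu_entry *e = cpu_table_sketch; e->part_no != NULL; e++)
    if (strcmp (e->part_no, part_no) == 0)
      return e->arch_name;
  return NULL;
}
```

Whichever spelling comes first for "0xb76" is the one handed on to the assembler, which is why the order (or the absence) of the new entry matters for old binutils.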



[PATCH] [PATCH][ARM] Fix pr63210.c testcase.

2015-07-23 Thread Alex Velenko
Hi,

This patch prevents testcase pr63210.c from running with -march=armv4t.
Object size check should be skipped with explicit -march=armv4t, because
expected size is only correct using pop pc instruction which is unsafe for
armv4t. For arm_arch_v5t_ok cases, an explicit -march=armv5t flag is set.

Is patch ok for trunk and fsf-5?

gcc/testsuite

2015-07-23  Alex Velenko  

* gcc.target/arm/pr63210.c (dg-skip-if): Skip armv4t.
(dg-additional-options): Add -march=armv5t if arm_arch_v5t_ok.
---
 gcc/testsuite/gcc.target/arm/pr63210.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.target/arm/pr63210.c b/gcc/testsuite/gcc.target/arm/pr63210.c
index c3ae928..9b63a67 100644
--- a/gcc/testsuite/gcc.target/arm/pr63210.c
+++ b/gcc/testsuite/gcc.target/arm/pr63210.c
@@ -1,6 +1,8 @@
 /* { dg-do assemble } */
 /* { dg-options "-mthumb -Os " }  */
 /* { dg-require-effective-target arm_thumb1_ok } */
+/* { dg-skip-if "do not test on armv4t" { *-*-* } { "-march=armv4t" } } */
+/* { dg-additional-options "-march=armv5t" {target arm_arch_v5t_ok} } */
 
 int foo1 (int c);
 int foo2 (int c);
-- 
1.8.1.2



Re: [PATCH][match.pd] PR middle-end/66915 Restrict A - B -> A + (-B) to non-fixed-point types

2015-07-23 Thread Richard Biener
On Thu, 23 Jul 2015, Kyrill Tkachov wrote:

> 
> On 21/07/15 11:11, Richard Biener wrote:
> > On Tue, 21 Jul 2015, Kyrill Tkachov wrote:
> > 
> > > On 21/07/15 08:24, Richard Biener wrote:
> > > > On Mon, 20 Jul 2015, Kyrill Tkachov wrote:
> > > > 
> > > > > Hi all,
> > > > > 
> > > > > This patch fixes the PR in question which is a miscompilation of
> > > > > gcc.dg/fixed-point/unary.c on arm.
> > > > > It just restricts the A - B -> A + (-B) transformation when the type
> > > > > is
> > > > > fixed-point.
> > > > > 
> > > > > This fixes the testcase for me.
> > > > > Is this the right approach?
> > > > > 
> > > > > Bootstrap and test on arm and x86 running.
> > > > > 
> > > > > Ok if testing is clean?
> > > > Ok, but I think the fold-const.c code has the same issue, no:
> > > > 
> > > > /* A - B -> A + (-B) if B is easily negatable.  */
> > > > if (negate_expr_p (arg1)
> > > > && !TYPE_OVERFLOW_SANITIZED (type)
> > > > && ((FLOAT_TYPE_P (type)
> > > >  /* Avoid this transformation if B is a positive
> > > > REAL_CST.
> > > > */
> > > >  && (TREE_CODE (arg1) != REAL_CST
> > > >  ||  REAL_VALUE_NEGATIVE (TREE_REAL_CST (arg1
> > > > || INTEGRAL_TYPE_P (type)))
> > > >   return fold_build2_loc (loc, PLUS_EXPR, type,
> > > >   fold_convert_loc (loc, type, arg0),
> > > >   fold_convert_loc (loc, type,
> > > > negate_expr (arg1)));
> > > > 
> > > > ah, no.  The above only applies to float-type and integral-types.
> > > > 
> > > > Thus yes, your patch is ok.  Can you double-check the other pattern,
> > > > 
> > > > /* -(A + B) -> (-B) - A.  */
> > > > (simplify
> > > >(negate (plus:c @0 negate_expr_p@1))
> > > >(if (!HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
> > > > && !HONOR_SIGNED_ZEROS (element_mode (type)))
> > > > (minus (negate @1) @0)))
> > > > 
> > > > ?
> > > Thanks, committed with r226028.
> > > I can add (FLOAT_TYPE_P (type) || INTEGRAL_TYPE_P (type)) to the
> > > condition.
> > > That would more closely mirror the original logic, right?
> > > That passes x86_64 bootstrap and aarch64 testing looks ok.
> > Yeah, that works for me, too.
> 
> How about this patch then?
> Bootstrapped and tested on x86_64 and aarch64.

Hmm.  The code already pretty much matches the one in fold-const.c.

So what's the actual issue with fixed-point types and
-(A + B) -> -B - A iff negate_expr_p says that B can be
safely negated?

That is, can you add a testcase that fails without the patch?

Thanks
Richard.


Re: [PATCH] Fix PR66952

2015-07-23 Thread Richard Biener
On Thu, 23 Jul 2015, Kyrill Tkachov wrote:

> 
> On 23/07/15 10:02, Andreas Schwab wrote:
> > Richard Biener  writes:
> > 
> > > Index: gcc/testsuite/gcc.dg/torture/pr66952.c
> > > ===
> > > --- gcc/testsuite/gcc.dg/torture/pr66952.c(revision 0)
> > > +++ gcc/testsuite/gcc.dg/torture/pr66952.c(working copy)
> > > @@ -0,0 +1,28 @@
> > > +/* { dg-do run } */
> > > +
> > > +int a = 128, b;
> > > +
> > > +static int
> > > +fn1 (char p1, int p2)
> > > +{
> > > +  return p1 < 0 || p1 > 1 >> p2 ? 0 : p1 << 1;
> > This is broken, p1 can never be < 0.
> 
> Just mark the chars as signed chars?
> The testcase passes for me with that change.

Ah - I always forget to double-check such testcases with both
-fsigned-char and -funsigned-char...

Will fix.

Richard.
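
For reference, the portability issue is that the signedness of plain char is implementation-defined, so a `p1 < 0` test on a plain char parameter is dead code on targets where char is unsigned. A small sketch (the helper names are made up for illustration):

```c
#include <limits.h>
#include <stdbool.h>

/* True on targets (or with -fsigned-char) where plain char is signed.  */
bool
plain_char_is_signed (void)
{
  return CHAR_MIN < 0;
}

/* With an explicit signed char parameter the comparison is meaningful
   on every target, regardless of -fsigned-char / -funsigned-char.  */
bool
is_negative (signed char c)
{
  return c < 0;
}
```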


Re: [PATCH][match.pd] PR middle-end/66915 Restrict A - B -> A + (-B) to non-fixed-point types

2015-07-23 Thread Kyrill Tkachov


On 23/07/15 12:16, Richard Biener wrote:

On Thu, 23 Jul 2015, Kyrill Tkachov wrote:


On 21/07/15 11:11, Richard Biener wrote:

On Tue, 21 Jul 2015, Kyrill Tkachov wrote:


On 21/07/15 08:24, Richard Biener wrote:

On Mon, 20 Jul 2015, Kyrill Tkachov wrote:


Hi all,

This patch fixes the PR in question which is a miscompilation of
gcc.dg/fixed-point/unary.c on arm.
It just restricts the A - B -> A + (-B) transformation when the type
is
fixed-point.

This fixes the testcase for me.
Is this the right approach?

Bootstrap and test on arm and x86 running.

Ok if testing is clean?

Ok, but I think the fold-const.c code has the same issue, no:

 /* A - B -> A + (-B) if B is easily negatable.  */
 if (negate_expr_p (arg1)
 && !TYPE_OVERFLOW_SANITIZED (type)
 && ((FLOAT_TYPE_P (type)
  /* Avoid this transformation if B is a positive
REAL_CST.
*/
  && (TREE_CODE (arg1) != REAL_CST
  ||  REAL_VALUE_NEGATIVE (TREE_REAL_CST (arg1
 || INTEGRAL_TYPE_P (type)))
   return fold_build2_loc (loc, PLUS_EXPR, type,
   fold_convert_loc (loc, type, arg0),
   fold_convert_loc (loc, type,
 negate_expr (arg1)));

ah, no.  The above only applies to float-type and integral-types.

Thus yes, your patch is ok.  Can you double-check the other pattern,

/* -(A + B) -> (-B) - A.  */
(simplify
(negate (plus:c @0 negate_expr_p@1))
(if (!HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
 && !HONOR_SIGNED_ZEROS (element_mode (type)))
 (minus (negate @1) @0)))

?

Thanks, committed with r226028.
I can add (FLOAT_TYPE_P (type) || INTEGRAL_TYPE_P (type)) to the
condition.
That would more closely mirror the original logic, right?
That passes x86_64 bootstrap and aarch64 testing looks ok.

Yeah, that works for me, too.

How about this patch then?
Bootstrapped and tested on x86_64 and aarch64.

Hmm.  The code already pretty much matches the one in fold-const.c.

So what's the actual issue with fixed-point types and
-(A + B) -> -B - A iff negate_expr_p says that B can be
safely negated?

That is, can you add a testcase that fails without the patch?


I don't have such a testcase.
If negate_expr_p does what we want here, then I suppose it's redundant
and I withdraw the patch.
I'm not very familiar with the fold-const.c code...

Kyrill



Thanks
Richard.





s390-linux fails to build

2015-07-23 Thread Nick Clifton


Hi Helmut, Hi Ulrich, Hi Andreas,

  A toolchain configured as --target=s390-linux currently fails to build
  gcc because of an undefined function:

undefined reference to `s390_host_detect_local_cpu(int, char const**)'
Makefile:1858: recipe for target 'xgcc' failed

  The patch below fixes the problem for me by adding a stub function in
  s390-common.c, but I am not sure if it is the correct solution.
  Please can you advise ?

Cheers
  Nick

Index: gcc/common/config/s390/s390-common.c
===
--- gcc/common/config/s390/s390-common.c(revision 226094)
+++ gcc/common/config/s390/s390-common.c(working copy)
@@ -119,6 +119,14 @@
 }
 }

+const char * s390_host_detect_local_cpu (int, const char **) __attribute__((weak));
+const char *
+s390_host_detect_local_cpu (int argc ATTRIBUTE_UNUSED,
+   const char **argv ATTRIBUTE_UNUSED)
+{
+  return NULL;
+}
+
 #undef TARGET_DEFAULT_TARGET_FLAGS
 #define TARGET_DEFAULT_TARGET_FLAGS (TARGET_DEFAULT)



Re: [Bug fortran/52846] [F2008] Support submodules - part 3/3

2015-07-23 Thread Salvatore Filippone
I agree with Paul that this is orthogonal to the compilation cascade
phenomenon.

In my opinion, putting PRIVATE entities in a module does not make much
sense (yes, I know my example does it, but it is a quick adaptation of
an existing code, not a clean design). If the main MODULE only contains
the public interfaces, then all those PRIVATE entities really belong to
the submodule together with the implementation(s) that use them. Even
though within the submodule they can  not formally have the PRIVATE
attribute, they would still be invisible outside the submodule,
therefore the end result would be the same.

I have not yet enough experience to say whether I am totally comfortable
with submodules as they are; however it seems to me that most of the
doubts voiced so far  depend more on programmer's discipline than on
language facilities.


Salvatore


Il giorno gio, 23/07/2015 alle 10.37 +0200, Paul Richard Thomas ha
scritto:
> Dear Damian,
>
> I do not think that there is any effect on compilation cascades. As
> long as the private part of the module file remains unchanged, it will
> not be recompiled if a descendant submodule is modified. Naturally,
> the size of the module file is increased but, if one is careful, this
> is not a big deal. A gotcha, which I will have to emphasize in the
> documentation occurs if another module file is used and its symbols
> are not exposed by public statements. If there are large numbers of
> symbols this can have a big effect on the size of the module file. I
> noticed this, when examining one of gfortran's testcases where the
> ISO_C_BINDING intrinsic module is used. Generous sprinklings of USE
> ONLYs are required to keep the module file sizes under control.
>
> I am not over enthusiastic about using compilation flags to uphold
> standards either.
>
> Cheers
>
> Paul


Re: s390-linux fails to build

2015-07-23 Thread Jakub Jelinek
On Thu, Jul 23, 2015 at 01:03:19PM +0100, Nick Clifton wrote:
> Hi Helmut, Hi Ulrich, Hi Andreas,
> 
>   A toolchain configured as --target=s390-linux currently fails to build
>   gcc because of an undefined function:
> 
> undefined reference to `s390_host_detect_local_cpu(int, char const**)'
> Makefile:1858: recipe for target 'xgcc' failed
> 
>   The patch below fixes the problem for me by adding a stub function in
>   s390-common.c, but I am not sure if it is the correct solution.
>   Please can you advise ?

Isn't it better to just follow what other arches do?
E.g. on i?86/x86_64, the EXTRA_SPEC_FUNCTIONS definition is guarded
with
#if defined(__i386__) || defined(__x86_64__)
and similarly on mips:
#if defined(__mips__)
and thus I'd expect s390{,x} should guard it with
#if defined(__s390__) || defined(__s390x__)
or so.

The config.host change also looks wrong, e.g. i?86 or mips have:
  i[34567]86-*-* \
  | x86_64-*-* )
case ${target} in
  i[34567]86-*-* \
  | x86_64-*-* )
host_extra_gcc_objs="driver-i386.o"
host_xmake_file="${host_xmake_file} i386/x-i386"
;;
esac
;;
  mips*-*-linux*)
case ${target} in
  mips*-*-linux*)
host_extra_gcc_objs="driver-native.o"
host_xmake_file="${host_xmake_file} mips/x-native"
  ;;
esac
;;
while s390 has:
  s390-*-* | s390x-*-*)
host_extra_gcc_objs="driver-native.o"
host_xmake_file="${host_xmake_file} s390/x-native"
;;
I bet that is going to break also cross-compilers from s390* to other targets.

Jakub
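
A stand-alone sketch of the guarding pattern described above: the detection function and its table entry only exist when the compiler driver is itself built on s390, so no undefined reference can arise elsewhere. All names here (`spec_entry`, `EXTRA_ENTRIES_SKETCH`, `detect_local_cpu_sketch`, `HOST_IS_S390`) are hypothetical and only loosely modelled on the EXTRA_SPEC_FUNCTIONS mechanism; the real macro layout may differ.

```c
#include <stddef.h>

typedef const char *(*spec_function) (int argc, const char **argv);

struct spec_entry
{
  const char *name;
  spec_function func;
};

#if defined (__s390__) || defined (__s390x__)
# define HOST_IS_S390 1
/* Placeholder for the real host probe; only compiled on s390 hosts.  */
static const char *
detect_local_cpu_sketch (int argc, const char **argv)
{
  (void) argc;
  (void) argv;
  return NULL;
}
# define EXTRA_ENTRIES_SKETCH { "local_cpu_detect", detect_local_cpu_sketch },
#else
# define HOST_IS_S390 0
# define EXTRA_ENTRIES_SKETCH
#endif

/* The table references the probe only when it was compiled in.  */
static const struct spec_entry spec_table_sketch[] = {
  EXTRA_ENTRIES_SKETCH
  { NULL, NULL }
};

/* Number of registered entries, not counting the terminator.  */
size_t
spec_function_count (void)
{
  size_t n = 0;
  while (spec_table_sketch[n].name != NULL)
    n++;
  return n;
}
```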


Re: [PATCH] Fix ubsan tree sharing (PR sanitizer/66908)

2015-07-23 Thread Marek Polacek
On Wed, Jul 22, 2015 at 07:43:23PM +0200, Jakub Jelinek wrote:
> On Wed, Jul 22, 2015 at 07:26:22PM +0200, Marek Polacek wrote:
> > In this testcase we were generating an uninitialized variable when doing
> > -fsanitize=shift,bounds sanitization.  The shift instrumentation is done
> > first; after that, the IR looks like
> > 
> >   res[i] = (m > 31) ? __ubsan (... tab[i] ...) ? 0, ... tab[i] ...;
> > 
> > where tab[i] are identical.  That means that when we instrument the first
> > tab[i] (we shouldn't do this I suppose), the second tab[i] is changed as
> > well as they're shared.  But that doesn't play well with SAVE_EXPRs, because
> > SAVE_EXPR  would only be initialized on one path.  Fixed by unsharing
> > the operands when constructing the ubsan check.  The .gimple diff is in
> > essence just
> > 
> > +  i.2 = i;
> > +  UBSAN_BOUNDS (0B, i.2, 21);
> > -  UBSAN_BOUNDS (0B, i.1, 21);
> > 
> > (Merely not instrumenting __ubsan_* wouldn't help exactly because of the
> > sharing.)
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> That is strange.  I'd have expected you'd want to unshare if you want to use
> the same operand multiple times in the same function, instead of unsharing
> it just in case it is shared with something different.
> 
> So isn't the bug instead that the UBSAN_BOUNDS generating code doesn't
> unshare?  Of course, these two functions use op0 and/or op1 sometimes
> multiple times too and thus they might want to unshare too, but I'd have
> expected in a different spot.

I don't think so: I think I need to unshare op0 and op1 because the shift
instrumentation always duplicates them for the __ubsan_* call.
Doing the unsharing in bounds instrumentation is too late (and none of my
attempts worked).
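
The sharing problem can be sketched with a toy expression tree.  This
is only an illustration under simplifying assumptions: struct node,
unshare and instrument are hypothetical stand-ins, with unshare playing
the role of a deep copy such as unshare_expr, and instrument standing
for a later pass that rewrites an operand in place.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node; a hypothetical stand-in for a GCC tree.  */
struct node
{
  int value;
  struct node *left;
  struct node *right;
};

static struct node *
make_node (int value, struct node *left, struct node *right)
{
  struct node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->value = value;
  n->left = left;
  n->right = right;
  return n;
}

/* Deep copy, so that the copy and the original share no nodes.  */
static struct node *
unshare (const struct node *n)
{
  if (!n)
    return NULL;
  return make_node (n->value, unshare (n->left), unshare (n->right));
}

/* Stand-in for an instrumentation pass that rewrites an operand in
   place.  */
static void
instrument (struct node *n)
{
  assert (n != NULL);
  n->value = -1;
}
```

If the same operand node is hung under both the check and the actual
operation, instrumenting it through one parent is visible through the
other; handing the check an unshared copy keeps the two independent.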

I can move the unsharing a little bit below, just before creating the
BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS call, but otherwise I have no
better ideas.

So for now I don't have a better patch, sorry.

Marek


[PATCH] Remove fold_cond_expr_cond

2015-07-23 Thread Richard Biener

So I stumbled across fold_cond_expr_cond and traced it back to
pre-GIMPLE and pre-fold_XXX times.  It "folds" all conditions in
a function (but only if they fold to always true/false).  It
does this after CFG construction but before cfg-cleanup.  It
does that to allow cfg-cleanup to be less expensive - but that's
all no longer true as cfg-cleanup basically does the very same
"folding" as fold_cond_expr_cond.

Then we have the other caller from tree_function_versioning
which folds all copied stmts anyway.

And fold_stmt does something similar (but doesn't throw away non-true/false
results).  It just restricts itself a bit too much, so I removed
the redundant check.
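
As a rough sketch of the kind of folding discussed above, assuming a
much simplified model (the types and the function fold_condition are
hypothetical, not GCC code): a condition folds only when both operands
are known constants, and a caller in the style of the removed function
would act only on the always-true and always-false results.

```c
#include <stdbool.h>

enum cmp_code { CMP_EQ, CMP_NE, CMP_LT };

/* Hypothetical operand: either a known constant or an unknown value.  */
struct operand
{
  bool is_const;
  long value;
};

enum fold_result { FOLD_FALSE, FOLD_TRUE, FOLD_UNKNOWN };

/* Fold LHS <code> RHS when both sides are constants; otherwise report
   that nothing is known about the condition.  */
static enum fold_result
fold_condition (enum cmp_code code, struct operand lhs, struct operand rhs)
{
  bool result;

  if (!lhs.is_const || !rhs.is_const)
    return FOLD_UNKNOWN;

  switch (code)
    {
    case CMP_EQ:
      result = lhs.value == rhs.value;
      break;
    case CMP_NE:
      result = lhs.value != rhs.value;
      break;
    case CMP_LT:
      result = lhs.value < rhs.value;
      break;
    default:
      return FOLD_UNKNOWN;
    }
  return result ? FOLD_TRUE : FOLD_FALSE;
}
```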

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2015-07-23  Richard Biener  

* gimple-fold.c (fold_gimple_cond): Do not require folding
results to pass valid_gimple_rhs_p.
* tree-cfg.h (fold_cond_expr_cond): Remove.
* tree-cfg.c (fold_cond_expr_cond): Likewise.
(make_edges): Do not call it.
* tree-inline.c (tree_function_versioning): Likewise.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 226091)
+++ gcc/gimple-fold.c   (working copy)
@@ -546,7 +546,7 @@ fold_gimple_cond (gcond *stmt)
   if (result)
 {
   STRIP_USELESS_TYPE_CONVERSION (result);
-  if (is_gimple_condexpr (result) && valid_gimple_rhs_p (result))
+  if (is_gimple_condexpr (result))
 {
   gimple_cond_set_condition_from_tree (stmt, result);
   return true;
Index: gcc/tree-cfg.h
===
--- gcc/tree-cfg.h  (revision 226091)
+++ gcc/tree-cfg.h  (working copy)
@@ -31,7 +31,6 @@ extern void gt_pch_nx (edge_def *e, gt_p
 
 extern void init_empty_tree_cfg_for_function (struct function *);
 extern void init_empty_tree_cfg (void);
-extern void fold_cond_expr_cond (void);
 extern void start_recording_case_labels (void);
 extern void end_recording_case_labels (void);
 extern basic_block label_to_block_fn (struct function *, tree);
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 226091)
+++ gcc/tree-cfg.c  (working copy)
@@ -606,48 +606,6 @@ create_bb (void *h, void *e, basic_block
 Edge creation
 ---*/
 
-/* Fold COND_EXPR_COND of each COND_EXPR.  */
-
-void
-fold_cond_expr_cond (void)
-{
-  basic_block bb;
-
-  FOR_EACH_BB_FN (bb, cfun)
-{
-  gimple stmt = last_stmt (bb);
-
-  if (stmt && gimple_code (stmt) == GIMPLE_COND)
-   {
- gcond *cond_stmt = as_a <gcond *> (stmt);
- location_t loc = gimple_location (stmt);
- tree cond;
- bool zerop, onep;
-
- fold_defer_overflow_warnings ();
- cond = fold_binary_loc (loc, gimple_cond_code (cond_stmt),
- boolean_type_node,
- gimple_cond_lhs (cond_stmt),
- gimple_cond_rhs (cond_stmt));
- if (cond)
-   {
- zerop = integer_zerop (cond);
- onep = integer_onep (cond);
-   }
- else
-   zerop = onep = false;
-
- fold_undefer_overflow_warnings (zerop || onep,
- stmt,
- WARN_STRICT_OVERFLOW_CONDITIONAL);
- if (zerop)
-   gimple_cond_make_false (cond_stmt);
- else if (onep)
-   gimple_cond_make_true (cond_stmt);
-   }
-}
-}
-
 /* If basic block BB has an abnormal edge to a basic block
containing IFN_ABNORMAL_DISPATCHER internal call, return
that the dispatcher's basic block, otherwise return NULL.  */
@@ -1000,9 +958,6 @@ make_edges (void)
   XDELETE (bb_to_omp_idx);
 
   free_omp_regions ();
-
-  /* Fold COND_EXPR_COND of each COND_EXPR.  */
-  fold_cond_expr_cond ();
 }
 
 /* Add SEQ after GSI.  Start new bb after GSI, and created further bbs as
Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c   (revision 226091)
+++ gcc/tree-inline.c   (working copy)
@@ -5847,7 +5847,6 @@ tree_function_versioning (tree old_decl,
 
   fold_marked_statements (0, id.statements_to_fold);
   delete id.statements_to_fold;
-  fold_cond_expr_cond ();
   delete_unreachable_blocks_update_callgraph (&id);
   if (id.dst_node->definition)
 cgraph_edge::rebuild_references ();


Re: s390-linux fails to build

2015-07-23 Thread Ulrich Weigand
Jakub Jelinek wrote:
> On Thu, Jul 23, 2015 at 01:03:19PM +0100, Nick Clifton wrote:
> > Hi Helmut, Hi Ulrich, Hi Andreas,
> > 
> >   A toolchain configured as --target=s390-linux currently fails to build
> >   gcc because of an undefined function:
> > 
> > undefined reference to `s390_host_detect_local_cpu(int, char const**)'
> > Makefile:1858: recipe for target 'xgcc' failed
> > 
> >   The patch below fixes the problem for me by adding a stub function in
> >   s390-common.c, but I am not sure if it is the correct solution.
> >   Please can you advise ?
> 
> Isn't it better to just follow what other arches do?
> E.g. on i?86/x86_64, the EXTRA_SPEC_FUNCTIONS definition is guarded
> with
> #if defined(__i386__) || defined(__x86_64__)
> and similarly on mips:
> #if defined(__mips__)
> and thus I'd expect s390{,x} should guard it with
> #if defined(__s390__) || defined(__s390x__)
> or so.

This is supposed to be fixed by this pending patch:
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01546.html

(which hasn't been applied yet, but probably should be soon ...)

> The config.host change also looks wrong, e.g. i?86 or mips have:
>   i[34567]86-*-* \
>   | x86_64-*-* )
> case ${target} in
>   i[34567]86-*-* \
>   | x86_64-*-* )
> host_extra_gcc_objs="driver-i386.o"
> host_xmake_file="${host_xmake_file} i386/x-i386"
> ;;
> esac
> ;;
>   mips*-*-linux*)
> case ${target} in
>   mips*-*-linux*)
> host_extra_gcc_objs="driver-native.o"
> host_xmake_file="${host_xmake_file} mips/x-native"
>   ;;
> esac
> ;;
> while s390 has:
>   s390-*-* | s390x-*-*)
> host_extra_gcc_objs="driver-native.o"
> host_xmake_file="${host_xmake_file} s390/x-native"
> ;;
> I bet that is also going to break cross-compilers from s390* to other targets.

I think this should be fine on s390.  The problem with i386 is that
the driver-native.c file uses data types only defined by the i386
target files (e.g. enum processor_type).  But on s390, the file does
not use any target-specific types and should be fully portable.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH 3/4] S390 -march=native related fixes

2015-07-23 Thread Ulrich Weigand
Dominik Vogt wrote:
> 
> gcc/ChangeLog
>  
>   * config/s390/driver-native.c (s390_host_detect_local_cpu): Handle
>   processor capabilities with -march=native.
>   * config/s390/s390.h (MARCH_MTUNE_NATIVE_SPECS): Likewise.
>   (DRIVER_SELF_SPECS): Likewise.  Join specs for 31 and 64 bit.
>   (S390_TARGET_BITS_STRING): Macro to simplify specs.

This is OK.

Thanks,
Ulrich




Re: s390-linux fails to build

2015-07-23 Thread Jakub Jelinek
On Thu, Jul 23, 2015 at 02:46:43PM +0200, Ulrich Weigand wrote:
> This is supposed to be fixed by this pending patch:
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01546.html

LGTM.

> > The config.host change also looks wrong, e.g. i?86 or mips have:
> >   i[34567]86-*-* \
> >   | x86_64-*-* )
> > case ${target} in
> >   i[34567]86-*-* \
> >   | x86_64-*-* )
> > host_extra_gcc_objs="driver-i386.o"
> > host_xmake_file="${host_xmake_file} i386/x-i386"
> > ;;
> > esac
> > ;;
> >   mips*-*-linux*)
> > case ${target} in
> >   mips*-*-linux*)
> > host_extra_gcc_objs="driver-native.o"
> > host_xmake_file="${host_xmake_file} mips/x-native"
> >   ;;
> > esac
> > ;;
> > while s390 has:
> >   s390-*-* | s390x-*-*)
> > host_extra_gcc_objs="driver-native.o"
> > host_xmake_file="${host_xmake_file} s390/x-native"
> > ;;
> > I bet that is also going to break cross-compilers from s390* to other targets.
> 
> I think this should be fine on s390.  The problem with i386 is that
> the driver-native.c file uses data types only defined by the i386
> target files (e.g. enum processor_type).  But on s390, the file does
> > not use any target-specific types and should be fully portable.

That hunk means that driver-native.o is added to EXTRA_GCC_OBJS
even say for s390x-*-* -> x86_64-*-* compiler.  While it might compile
there, nothing will use it, so what is it good for?
i?86/x86_64 backend will certainly not reference s390_host_detect_local_cpu
anywhere.

Jakub


Re: s390-linux fails to build

2015-07-23 Thread Ulrich Weigand
Jakub Jelinek wrote:
> On Thu, Jul 23, 2015 at 02:46:43PM +0200, Ulrich Weigand wrote:
> > > I bet that is also going to break cross-compilers from s390* to other targets.
> > 
> > I think this should be fine on s390.  The problem with i386 is that
> > the driver-native.c file uses data types only defined by the i386
> > target files (e.g. enum processor_type).  But on s390, the file does
> > > not use any target-specific types and should be fully portable.
> 
> That hunk means that driver-native.o is added to EXTRA_GCC_OBJS
> even say for s390x-*-* -> x86_64-*-* compiler.  While it might compile
> there, nothing will use it, so what is it good for?
> i?86/x86_64 backend will certainly not reference s390_host_detect_local_cpu
> anywhere.

Oh, I agree this will not be *used*.  I just wanted to point out that it
will not *break* cross-compilers as is.

Bye,
Ulrich




Re: s390-linux fails to build

2015-07-23 Thread Jakub Jelinek
On Thu, Jul 23, 2015 at 03:03:29PM +0200, Ulrich Weigand wrote:
> Jakub Jelinek wrote:
> > On Thu, Jul 23, 2015 at 02:46:43PM +0200, Ulrich Weigand wrote:
> > > > I bet that is also going to break cross-compilers from s390* to other targets.
> > > 
> > > I think this should be fine on s390.  The problem with i386 is that
> > > the driver-native.c file uses data types only defined by the i386
> > > target files (e.g. enum processor_type).  But on s390, the file does
> > > > not use any target-specific types and should be fully portable.
> > 
> > That hunk means that driver-native.o is added to EXTRA_GCC_OBJS
> > even say for s390x-*-* -> x86_64-*-* compiler.  While it might compile
> > there, nothing will use it, so what is it good for?
> > i?86/x86_64 backend will certainly not reference s390_host_detect_local_cpu
> > anywhere.
> 
> Oh, I agree this will not be *used*.  I just wanted to point out that it
> will not *break* cross-compilers as is.

I think it would be better for consistency and sanity to do what other
targets do, even if it won't break the cross-compilation.
Do you agree?

Jakub


Re: [PR25529] Convert (unsigned t * 2)/2 into unsigned (t & 0x7FFFFFFF)

2015-07-23 Thread Richard Biener
On Thu, Jul 23, 2015 at 5:47 AM, Hurugalawadi, Naveen
 wrote:
>>> so using wi::mask is preferred here.
>
> Thanks for your review and comments.
>
> Please find attached the modified patch as per your comments.
>
> Please let me know if this version is okay?

Ok with adding

/* { dg-require-effective-target int32 } */

to the testcase.

Please omit the

+/* Simplify (t * 2)/2 ->  t.  */
+(simplify
+ (exact_div (mult @0 INTEGER_CST@1) @1)
+ (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+  @0))

pattern.  As a followup extend it - it should also work for non-INTEGER_CST
divisors and it should work for any kind of division, not just exact_div.  The
key here is TYPE_OVERFLOW_UNDEFINED.  I believe you should
find the equivalent operation in extract_trunc_div_1.
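
The two cases can be checked with a small self-contained sketch,
assuming 32-bit int as the requested dg-require-effective-target int32
implies.  The function names are made up for illustration.  For
unsigned operands the multiply wraps, so only the low 31 bits of t
survive; for signed operands the sketch computes in a wider type to
stay free of undefined behaviour and shows that, whenever the product
does not overflow, dividing by the same non-zero constant gives t back.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned case from the subject line: the multiply wraps modulo 2^32,
   so the top bit of T is lost before the division.  */
static uint32_t
mul2_div2_unsigned (uint32_t t)
{
  uint32_t doubled = t * 2u;
  return doubled / 2u;
}

/* The replacement expression: T with bit 31 cleared.  */
static uint32_t
mask_low31 (uint32_t t)
{
  return t & UINT32_C (0x7FFFFFFF);
}

/* Signed case, evaluated in 64 bits so that the sketch itself never
   overflows.  For every T and non-zero C the result is T, which is
   what a compiler may assume when signed overflow is undefined.  */
static int32_t
mul_div_signed (int32_t t, int32_t c)
{
  assert (c != 0);
  int64_t wide = (int64_t) t * c;
  return (int32_t) (wide / c);
}
```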

Richard.

> Thanks,
> Naveen
>
> 2015-07-22  Naveen H.S  
>
> gcc/testsuite/ChangeLog:
>  PR middle-end/25529
>  * gcc.dg/pr25529.c: New test.
>
> gcc/ChangeLog:
>  PR middle-end/25529
>  * match.pd (exact_div (mult @0 INTEGER_CST@1) @1) : New simplifier.
>  (trunc_div (mult @0 integer_pow2p@1) @1) : New simplifier.


Re: [PR25530] Convert (unsigned t / 2) * 2 into (unsigned t & ~1)

2015-07-23 Thread Richard Biener
On Thu, Jul 23, 2015 at 5:49 AM, Hurugalawadi, Naveen
 wrote:
>>> Your previous patch correctly restricted this to unsigned types.
>
> Thanks for your review and comments.
>
> Please find attached the modified patch as per your comments.
>
> Please let me know if this version is okay?

Ok.

Thanks,
Richard.

> Thanks,
> Naveen
>
> 2015-07-22  Naveen H.S  
>
> gcc/testsuite/ChangeLog:
>  PR middle-end/25530
>  * gcc.dg/pr25530.c: New test.
>
>  gcc/ChangeLog:
>  PR middle-end/25530
>  * match.pd (mult (trunc_div @0 integer_pow2p@1) @1) : New simplifier.
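
The transformation named in the subject line, and the reason the review
kept it restricted to unsigned types, can be illustrated with a short
sketch.  The function names are hypothetical.  For unsigned t the
division discards the low bit, so (t / 2) * 2 equals t & ~1; for signed
t the division truncates toward zero, so negative odd values give a
different result than the mask would.

```c
#include <stdint.h>

/* Original expression on an unsigned operand.  */
static uint32_t
div2_mul2 (uint32_t t)
{
  return (t / 2u) * 2u;
}

/* Replacement expression: clear the low bit.  */
static uint32_t
clear_low_bit (uint32_t t)
{
  return t & ~UINT32_C (1);
}

/* The same expression on a signed operand.  Division truncates toward
   zero, so for T == -3 this yields -2 rather than -4.  */
static int32_t
div2_mul2_signed (int32_t t)
{
  return (t / 2) * 2;
}
```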


Re: s390-linux fails to build

2015-07-23 Thread Dominik Vogt
On Thu, Jul 23, 2015 at 03:09:46PM +0200, Jakub Jelinek wrote:
> On Thu, Jul 23, 2015 at 03:03:29PM +0200, Ulrich Weigand wrote:
> > Oh, I agree this will not be *used*.  I just wanted to point out that it
> > will not *break* cross-compilers as is.
> 
> I think it would be better for consistency and sanity to do what other
> targets do, even if it won't break the cross-compilation.
> Do you agree?

It's certainly better to not compile a file with code that was
written to run on a different platform.  I'll make a patch in a
couple of days.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] Fix ubsan tree sharing (PR sanitizer/66908)

2015-07-23 Thread Jakub Jelinek
On Thu, Jul 23, 2015 at 02:20:51PM +0200, Marek Polacek wrote:
> > So isn't the bug instead that the UBSAN_BOUNDS generating code doesn't
> > unshare?  Of course, these two functions use op0 and/or op1 sometimes
> > multiple times too and thus they might want to unshare too, but I'd have
> > expected in a different spot.
> 
> I don't think so: I think I need to unshare op0 and op1 because the shift
> instrumentation always duplicates them for the __ubsan_* call.
> Doing the unsharing in bounds instrumentation is too late (and none of my
> attempts worked).
> 
> I can move the unsharing a little bit below, just before creating the
> BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS call, but otherwise I have no
> better ideas.
> 
> So for now I don't have a better patch, sorry.

Actually looking at the callers, your patch is indeed right.  They pass
this op0/op1 to these functions and then use the values as well elsewhere
(in the actual operation).

Whether it is also needed more times in the body of the functions
(if you use op0 and/or op1 in the functions more than once in the generated
expressions) remains to be seen.  Of course during gimplification we unshare
everything shared in the function body, but that might be too late.
E.g. in ubsan_instrument_division at
  tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
build_int_cst (type, -1));
  x = fold_build2 (EQ_EXPR, boolean_type_node, op0,
   TYPE_MIN_VALUE (type));
you already know op1 has been used once in t, while op0 has not been used
yet, so unshare_expr (op1) might be desirable.  And you know that unless
if (integer_zerop (t)) op0 will be used afterwards, so you might as well
unshare_expr (op0) too.  Then you use
  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);
unconditionally.  And finally there is:
  tt = build_call_expr_loc (loc, tt, 3, data, ubsan_encode_value (op0),
ubsan_encode_value (op1));
which you might want to unshare_expr first (both).

So, please go ahead with your patch and as follow-up, see if further
unshare_exprs don't need to be added.

Jakub


Re: [PING][PATCH, 1/2] Merge rewrite_virtuals_into_loop_closed_ssa from gomp4 branch

2015-07-23 Thread Richard Biener
On Mon, 20 Jul 2015, Tom de Vries wrote:

> On 09/07/15 13:04, Richard Biener wrote:
> > On Thu, 9 Jul 2015, Tom de Vries wrote:
> > 
> > > On 07/07/15 17:58, Tom de Vries wrote:
> > > > > If you can
> > > > > handle one exit edge I also can't see the difficulty in handling
> > > > > all exit edges.
> > > > > 
> > > > 
> > > > Agreed, that doesn't look too complicated. I could call
> > > > rewrite_virtuals_into_loop_closed_ssa for all loops in
> > > > rewrite_virtuals_into_loop_closed_ssa, to get non-single_dom_exit loops
> > > > exercising the code, and fix what breaks.
> > > 
> > > Hmm, I just realised, it's more complicated than I thought.
> > > 
> > > In loops with single_dom_exit, the exit dominates the uses outside the
> > > loop,
> > > so I can replace the uses of the def with the uses of the exit phi result.
> > > 
> > > If !single_dom_exit, the exit(s) may not dominate all uses, and I need to
> > > insert non-loop-exit phi nodes to deal with that.
> > 
> > Yes.  This is why I originally suggested to amend the regular
> > loop-close-SSA rewriting code.
> > 
> 
> This patch renames rewrite_into_loop_closed_ssa to
> rewrite_into_loop_closed_ssa_1, and adds arguments:
> - a loop argument, to limit the defs for which the uses are
>   rewritten
> - a use_flags argument, to determine the type of uses rewritten:
>   SSA_OP_USE/SSA_OP_VIRTUAL_USES/SSA_OP_ALL_USES
> 
> The original rewrite_into_loop_closed_ssa is reimplemented using
> rewrite_into_loop_closed_ssa_1.
> 
> And the !single_dom_exit case of rewrite_into_loop_closed_ssa is implemented
> using rewrite_into_loop_closed_ssa_1. [ The patch was tested as attached,
> always using rewrite_into_loop_closed_ssa_1, otherwise it would not be
> triggered. ]
> 
> Bootstrapped and reg-tested on x86_64.
> 
> Is this sort of what you had in mind?

Yes.  New functions need a comment and instead of iterating over
all function BBs and checking bb->loop_father please use
get_loop_body ().

Of course in the final version #if 0 stuff shouldn't remain.  What's
the cost difference of removing the single_dom_exit special-case?

Thanks,
Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)


Re: [PATCH] Enable reductions without fassociative-math in graphite

2015-07-23 Thread Richard Biener
On Wed, 22 Jul 2015, Tom de Vries wrote:

> Hi,
> 
> this patch allows non-float reductions to be detected by graphite, independent
> of whether fassociative-math (which only has effect for float operations) is
> set.
> 
> Currently bootstrapping and reg-testing on x86_64.
> 
> OK for trunk?

Ok

> Thanks,
> - Tom
> 
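
A short sketch of why the flag matters only for floating point,
assuming default (non -ffast-math) compilation; the function names are
made up for illustration.  Reassociating a float sum can change the
result, while unsigned integer addition wraps and stays associative, so
an integer reduction can be reordered regardless of the flag.

```c
/* Left-to-right float sum.  */
static float
sum_left (float a, float b, float c)
{
  return (a + b) + c;
}

/* The same sum with the additions reassociated.  */
static float
sum_right (float a, float b, float c)
{
  return a + (b + c);
}

/* Unsigned addition wraps modulo 2^N, so both groupings agree.  */
static unsigned
usum_left (unsigned a, unsigned b, unsigned c)
{
  return (a + b) + c;
}

static unsigned
usum_right (unsigned a, unsigned b, unsigned c)
{
  return a + (b + c);
}
```

With a = 1e20f, b = -1e20f and c = 1.0f the first grouping gives 1.0f
while the second gives 0.0f, because c is absorbed when added to b.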

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)


Re: [PATCH] Fix ubsan tree sharing (PR sanitizer/66908)

2015-07-23 Thread Marek Polacek
On Thu, Jul 23, 2015 at 02:39:05PM +0200, Jakub Jelinek wrote:
> On Thu, Jul 23, 2015 at 02:20:51PM +0200, Marek Polacek wrote:
> > > So isn't the bug instead that the UBSAN_BOUNDS generating code doesn't
> > > unshare?  Of course, these two functions use op0 and/or op1 sometimes
> > > multiple times too and thus they might want to unshare too, but I'd have
> > > expected in a different spot.
> > 
> > I don't think so: I think I need to unshare op0 and op1 because the shift
> > instrumentation always duplicates them for the __ubsan_* call.
> > Doing the unsharing in bounds instrumentation is too late (and none of my
> > attempts worked).
> > 
> > I can move the unsharing a little bit below, just before creating the
> > BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS call, but otherwise I have no
> > better ideas.
> > 
> > So for now I don't have a better patch, sorry.
> 
> Actually looking at the callers, your patch is indeed right.  They pass
> this op0/op1 to these functions and then use the values as well elsewhere
> (in the actual operation).
> 
> Whether it is also needed more times in the body of the functions
> (if you use op0 and/or op1 in the functions more than once in the generated
> expressions) remains to be seen.  Of course during gimplification we unshare
> everything shared in the function body, but that might be too late.
> E.g. in ubsan_instrument_division at
>   tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
> build_int_cst (type, -1));
>   x = fold_build2 (EQ_EXPR, boolean_type_node, op0,
>TYPE_MIN_VALUE (type));
> you already know op1 has been used once in t, while op0 has not been used
> yet, so unshare_expr (op1) might be desirable.  And you know that unless
> if (integer_zerop (t)) op0 will be used afterwards, so you might as well
> unshare_expr (op0) too.  Then you use
>   t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);
> unconditionally.  And finally there is:
>   tt = build_call_expr_loc (loc, tt, 3, data, ubsan_encode_value (op0),
> ubsan_encode_value (op1));
> which you might want to unshare_expr first (both).
> 
> So, please go ahead with your patch and as follow-up, see if further
> unshare_exprs don't need to be added.

Thanks.  Ok, I will; and I hope that the follow up will fix the remaining
issue Maxim is seeing on ARM.

Marek


Re: [PATCH PR66388]Compute use with cand of smaller precision by further exercising scev overflow info.

2015-07-23 Thread Richard Biener
On Fri, Jul 17, 2015 at 8:27 AM, Bin Cheng  wrote:
> Hi,
> This patch is to fix PR66388.  It's an old issue but recently became worse
> after my scev overflow change.  IVOPT now can only compute iv use with
> candidate which has at least same type precision.  See below code:
>
>   if (TYPE_PRECISION (utype) > TYPE_PRECISION (ctype))
> {
>   /* We do not have a precision to express the values of use.  */
>   return infinite_cost;
> }
>
> This is not always true.  It's possible to compute with a candidate of
> smaller precision if it has enough stepping periods to express the iv use.
> Just as code in iv_elimination.  Well, since now we have iv no_overflow
> information, we can use that to prove it's safe.  Actually I am thinking
> about improving iv elimination with overflow information too.  So this patch
> relaxes the constraint to allow computation of uses with smaller precision
> candidates.
>
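
A rough sketch of the idea in the preceding paragraph, not of the
IVOPTs implementation: when a narrower induction variable is known not
to wrap, a wider use can be computed from it by extending the value at
the use.  The function names and the stride of 4 are assumptions made
for illustration.

```c
#include <stdint.h>

/* Address-style use driven by a 64-bit induction variable.  */
static uint64_t
use_from_wide_iv (uint64_t base, uint32_t n)
{
  uint64_t last = base;
  for (uint64_t i = 0; i < n; i++)
    last = base + 4 * i;
  return last;
}

/* The same use driven by a 32-bit candidate.  Because i < n and n fits
   in 32 bits, i never wraps, so extending it at the use gives the same
   value as the wide variable.  */
static uint64_t
use_from_narrow_iv (uint64_t base, uint32_t n)
{
  uint64_t last = base;
  for (uint32_t i = 0; i < n; i++)
    last = base + 4 * (uint64_t) i;
  return last;
}
```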
> Benchmark data shows several cases in spec2k6 are obviously improved on
> aarch64:
> 400.perlbench    2.32%
> 445.gobmk        0.86%
> 456.hmmer       11.72%
> 464.h264ref      1.93%
> 473.astar        0.75%
> 433.milc        -1.49%
> 436.cactusADM    6.61%
> 444.namd        -0.76%
>
> I looked into assembly code of 456.hmmer&436.cactusADM, and can confirm hot
> loops are reduced.  Also perf data could confirm the improvement in
> 456.hmmer.
> I looked into 433.milc and found most hot functions are not affected by this
> patch.  But I do observe two kinds of regressions described as below:
> A)  For some loops, auto-increment addressing mode is generated before this
> patch, but "base + index< this too much because auto-increment support in IVO hasn't been enabled on
> AArch64 yet. On the contrary, we should worry that auto-increment support is
> too aggressive in IVO, resulting in auto-increment addressing mode generated
> where it shouldn't. I suspect the regression we monitored before is caused
> by such kind of reason.
> B) This patch enables computation of 64 bits address iv use with 32 bits biv
> candidate.  So there will be a sign extension before the candidate can be
> used in memory reference as an index. I already increased the cost by 2 for
> such biv candidates but there still be some peculiar cases... Decreasing
> cost in determine_iv_cost for biv candidates makes this worse.  It does that
> to make debugging simpler, nothing to do with performance.
>
> Bootstrap and test on x86_64.  It fixes failure of pr49781-1.c.
> Unfortunately, it introduces new failure of
> g++.dg/debug/dwarf2/deallocator.C.  I looked into the test and found with
> this patch, the loop is transformed into a shape that can be later
> eliminated(because it can be proved never loop back?).  We can further
> discuss if it's this patch's problem or the case should be tuned.
> Also bootstrap and test on aarch64.
>
> So what's your opinion?

Looks sensible, but the deallocator.C fail looks odd.  I presume that
i + j is simplified in a way that either the first or the second iteration
must exit the loop via the return and thus the scan for deallocator.C:34
fails?  How does this happen - I can only see this happen if we unroll
the loop and then run into VRP.  So does IVOPTs now affect non-loop
code as well?  Ah, at the moment we use an IV that runs backward.

Still curious if this isn't a wrong-code issue...

Richard.

> Thanks,
> bin
>
>
> 2015-07-16  Bin Cheng  
>
> PR tree-optimization/66388
> * tree-ssa-loop-ivopts.c (dump_iv): Dump no_overflow info.
> (add_candidate_1): New parameter.  Use unsigned type when iv
> overflows.  Pass no_overflow to alloc_iv.
> (add_autoinc_candidates, add_candidate): New parameter.
> Pass no_overflow to add_candidate_1.
> (add_candidate): Ditto.
> (add_iv_candidate_for_biv, add_iv_candidate_for_use): Pass iv's
> no_overflow info to add_candidate and add_candidate_1.
> (get_computation_aff, get_computation_cost_at): Handle candidate
> with smaller precision than iv use.
>
> gcc/testsuite/ChangeLog
> 2015-07-16  Bin Cheng  
>
> PR tree-optimization/66388
> * gcc.dg/tree-ssa/pr66388.c: New test.
>


[PATCH] rs6000: Add dot forms of and3_2insn

2015-07-23 Thread Segher Boessenkool
This does one of the TODOs I added: it adds dot forms of the ANDs done
with two machine insns.  It uses a new helper function (rs6000_emit_dot_insn)
that probably can be used more often; it is quite general in any case.

Bootstrapped and tested on powerpc64-linux, using {-m32,-m32/-mpowerpc64,
-m64,-m64/-mlra}; no regressions.  Code size on both 32-bit and 64-bit
improves.
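
For readers less familiar with record ("dot") forms, here is a small
model in plain C of what the two-insn AND with a dot form computes.  It
is a sketch under simplifying assumptions, not the rs6000 code: the
type dot_result, the enum cr0 and the function and_2insn_dot are
hypothetical, and only the single-run-of-ones case (shift left, mask,
shift right) is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the CR0 outcome of a record-form insn: the
   signed comparison of the result with zero.  */
enum cr0 { CR0_LT, CR0_GT, CR0_EQ };

struct dot_result
{
  uint64_t value;
  enum cr0 cr0;
};

static enum cr0
compare_with_zero (uint64_t v)
{
  int64_t s = (int64_t) v;
  return s < 0 ? CR0_LT : s > 0 ? CR0_GT : CR0_EQ;
}

/* AND with MASK done in two steps.  SHIFT must not push any set bit of
   MASK out of the register; then the two steps equal X & MASK.  */
static struct dot_result
and_2insn_dot (uint64_t x, uint64_t mask, unsigned shift)
{
  struct dot_result r;

  assert (shift < 64 && (mask << shift) >> shift == mask);

  /* First insn: shift left and mask.  */
  uint64_t tmp = (x << shift) & (mask << shift);

  /* Second insn, record form: shift back and compare with zero.  */
  r.value = tmp >> shift;
  r.cr0 = compare_with_zero (r.value);
  return r;
}
```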

Is this okay for trunk?


Segher


2015-07-23  Segher Boessenkool  

PR target/66217
* config/rs6000/rs6000-protos.h (rs6000_emit_2insn_and): Change
prototype.
* config/rs6000/rs6000.c (rs6000_emit_dot_insn): New function.
(rs6000_emit_2insn_and): Handle dot forms.
* config/rs6000/rs6000.md (and3): Adjust.
(*and3_2insn): Remove TODO.  Adjust.  Add "type" attr.
(*and3_2insn_dot, *and3_2insn_dot2): New.

---
 gcc/config/rs6000/rs6000-protos.h |  2 +-
 gcc/config/rs6000/rs6000.c| 55 +++---
 gcc/config/rs6000/rs6000.md   | 56 ---
 3 files changed, 98 insertions(+), 15 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 30a7128..f5d3476 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -77,7 +77,7 @@ extern const char *rs6000_insn_for_and_mask (machine_mode, rtx *, bool);
 extern const char *rs6000_insn_for_shift_mask (machine_mode, rtx *, bool);
 extern const char *rs6000_insn_for_insert_mask (machine_mode, rtx *, bool);
 extern bool rs6000_is_valid_2insn_and (rtx, machine_mode);
-extern void rs6000_emit_2insn_and (machine_mode, rtx *, bool, bool);
+extern void rs6000_emit_2insn_and (machine_mode, rtx *, bool, int);
 extern int registers_ok_for_quad_peep (rtx, rtx);
 extern int mems_ok_for_quad_peep (rtx, rtx);
 extern bool gpr_or_gpr_p (rtx, rtx);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1fb1f32..fe8ce71 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16736,20 +16736,54 @@ rs6000_is_valid_2insn_and (rtx c, machine_mode mode)
   return rs6000_is_valid_and_mask (GEN_INT (val + bit3 - bit2), mode);
 }
 
+/* Emit a potentially record-form instruction, setting DST from SRC.
+   If DOT is 0, that is all; otherwise, set CCREG to the result of the
+   signed comparison of DST with zero.  If DOT is 1, the generated RTL
+   doesn't care about the DST result; if DOT is 2, it does.  If CCREG
+   is CR0 do a single dot insn (as a PARALLEL); otherwise, do a SET and
+   a separate COMPARE.  */
+
+static void
+rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg)
+{
+  if (dot == 0)
+{
+  emit_move_insn (dst, src);
+  return;
+}
+
+  if (cc_reg_not_cr0_operand (ccreg, CCmode))
+{
+  emit_move_insn (dst, src);
+  emit_move_insn (ccreg, gen_rtx_COMPARE (CCmode, dst, const0_rtx));
+  return;
+}
+
+  rtx ccset = gen_rtx_SET (ccreg, gen_rtx_COMPARE (CCmode, src, const0_rtx));
+  if (dot == 1)
+{
+  rtx clobber = gen_rtx_CLOBBER (VOIDmode, dst);
+  emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, ccset, clobber)));
+}
+  else
+{
+  rtx set = gen_rtx_SET (dst, src);
+  emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, ccset, set)));
+}
+}
+
 /* Emit the two insns to do an AND in mode MODE, with operands OPERANDS.
If EXPAND is true, split rotate-and-mask instructions we generate to
their constituent parts as well (this is used during expand); if DOT
-   is true, make the last insn a record-form instruction.  */
+   is 1, make the last insn a record-form instruction clobbering the
+   destination GPR and setting the CC reg (from operands[3]); if 2, set
+   that GPR as well as the CC reg.  */
 
 void
-rs6000_emit_2insn_and (machine_mode mode, rtx *operands, bool expand, bool dot)
+rs6000_emit_2insn_and (machine_mode mode, rtx *operands, bool expand, int dot)
 {
   gcc_assert (!(expand && dot));
 
-  /* We do not actually handle record form yet.  */
-  if (dot)
-gcc_unreachable ();
-
   unsigned HOST_WIDE_INT val = INTVAL (operands[2]);
 
   /* If it is one stretch of ones, it is DImode; shift left, mask, then
@@ -16774,7 +16808,8 @@ rs6000_emit_2insn_and (machine_mode mode, rtx *operands, bool expand, bool dot)
  rtx tmp = gen_rtx_ASHIFT (mode, operands[1], GEN_INT (shift));
  tmp = gen_rtx_AND (mode, tmp, GEN_INT (val << shift));
  emit_move_insn (operands[0], tmp);
- emit_insn (gen_lshrdi3 (operands[0], operands[0], GEN_INT (shift)));
+ tmp = gen_rtx_LSHIFTRT (mode, operands[0], GEN_INT (shift));
+ rs6000_emit_dot_insn (operands[0], tmp, dot, dot ? operands[3] : 0);
}
   return;
 }
@@ -16800,7 +16835,7 @@ rs6000_emit_2insn_and (machine_mode mode, rtx *operands, bool expand, bool dot)
   rtx tmp = gen_rtx_AND (mode, operands[1], GEN_INT (mask1));
   emit_move_insn (reg, tmp);
   tmp = gen_rtx_AND (mode

Re: [PATCH] rs6000: Add dot forms of and3_2insn

2015-07-23 Thread David Edelsohn
On Thu, Jul 23, 2015 at 10:21 AM, Segher Boessenkool
 wrote:
> This does one of the TODOs I added: it adds dot forms of the ANDs done
> with two machine insns.  It uses a new helper function (rs6000_emit_dot_insn)
> that probably can be used more often; it is quite general in any case.
>
> Bootstrapped and tested on powerpc64-linux, using {-m32,-m32/-mpowerpc64,
> -m64,-m64/-mlra}; no regressions.  Code size on both 32-bit and 64-bit
> improves.
>
> Is this okay for trunk?
>
>
> Segher
>
>
> 2015-07-23  Segher Boessenkool  
>
> PR target/66217
> * config/rs6000/rs6000-protos.h (rs6000_emit_2insn_and): Change
> prototype.
> * config/rs6000/rs6000.c (rs6000_emit_dot_insn): New function.
> (rs6000_emit_2insn_and): Handle dot forms.
> * config/rs6000/rs6000.md (and3): Adjust.
> (*and3_2insn): Remove TODO.  Adjust.  Add "type" attr.
> (*and3_2insn_dot, *and3_2insn_dot2): New.

Okay.

Thanks, David


[PATCH][17/n] Remove GENERIC stmt combining from SCCVN

2015-07-23 Thread Richard Biener

This moves address-of-decl simplifications.  The second pattern is from

  if (integer_zerop (arg1)
  && tree_expr_nonzero_p (arg0))
{
  tree res = constant_boolean_node (code==NE_EXPR, type);
  return omit_one_operand_loc (loc, type, res, arg0);
}

which I didn't want to move in its full extent.
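
At the source level, the two simplifications correspond to facts that
hold for ordinary, non-weak objects defined in the same translation
unit.  A minimal sketch, with made-up names: the addresses of two
distinct objects never compare equal, and the address of an object is
never null, so both comparisons have a known result.

```c
#include <stdbool.h>
#include <stddef.h>

static int first_object;
static int second_object;

static bool
same_address (const void *p, const void *q)
{
  return p == q;
}

/* &A == &B for two distinct, non-weak objects: always false.  */
static bool
distinct_objects_compare_equal (void)
{
  return same_address (&first_object, &second_object);
}

/* &A == 0 for an ordinary object: always false.  */
static bool
object_address_is_null (void)
{
  return same_address (&first_object, NULL);
}
```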

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2015-07-23  Richard Biener  

* generic-match-head.c: Include cgraph.h.
* gimple-match-head.c: Likewise.
* tree-ssa-sccvn.c (free_scc_vn): Guard against newly created
SSA names.
* fold-const.c (fold_binary_loc): Move &A ==/!= &B simplification...
* match.pd: ...to a pattern here.  Add &A ==/!= 0 simplification
pattern.

Index: gcc/generic-match-head.c
===
--- gcc/generic-match-head.c(revision 226110)
+++ gcc/generic-match-head.c(working copy)
@@ -46,8 +46,10 @@ along with GCC; see the file COPYING3.
 #include "builtins.h"
 #include "dumpfile.h"
 #include "target.h"
+#include "cgraph.h"
 #include "generic-match.h"
 
+
 /* Routine to determine if the types T1 and T2 are effectively
the same for GENERIC.  If T1 or T2 is not a type, the test
applies to their TREE_TYPE.  */
Index: gcc/gimple-match-head.c
===
--- gcc/gimple-match-head.c (revision 226110)
+++ gcc/gimple-match-head.c (working copy)
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.
 #include "builtins.h"
 #include "dumpfile.h"
 #include "target.h"
+#include "cgraph.h"
 #include "gimple-match.h"
 
 
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 226110)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -4223,6 +4223,8 @@ free_scc_vn (void)
 {
   tree name = ssa_name (i);
   if (name
+ && SSA_NAME_VERSION (name) < vn_ssa_aux_table.length ()
+ && vn_ssa_aux_table[SSA_NAME_VERSION (name)]
  && VN_INFO (name)->needs_insertion)
release_ssa_name (name);
 }
Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 226110)
+++ gcc/fold-const.c(working copy)
@@ -11082,29 +11082,6 @@ fold_binary_loc (location_t loc,
  && code == NE_EXPR)
 return non_lvalue_loc (loc, fold_convert_loc (loc, type, arg0));
 
-  /* If this is an equality comparison of the address of two non-weak,
-unaliased symbols neither of which are extern (since we do not
-have access to attributes for externs), then we know the result.  */
-  if (TREE_CODE (arg0) == ADDR_EXPR
- && DECL_P (TREE_OPERAND (arg0, 0))
- && TREE_CODE (arg1) == ADDR_EXPR
- && DECL_P (TREE_OPERAND (arg1, 0)))
-   {
- int equal;
-
- if (decl_in_symtab_p (TREE_OPERAND (arg0, 0))
- && decl_in_symtab_p (TREE_OPERAND (arg1, 0)))
-   equal = symtab_node::get_create (TREE_OPERAND (arg0, 0))
-   ->equal_address_to (symtab_node::get_create
- (TREE_OPERAND (arg1, 0)));
- else
-   equal = TREE_OPERAND (arg0, 0) == TREE_OPERAND (arg1, 0);
- if (equal != 2)
-   return constant_boolean_node (equal
- ? code == EQ_EXPR : code != EQ_EXPR,
- type);
-   }
-
   /* Similarly for a BIT_XOR_EXPR;  X ^ C1 == C2 is X == (C1 ^ C2).  */
   if (TREE_CODE (arg0) == BIT_XOR_EXPR
  && TREE_CODE (arg1) == INTEGER_CST
Index: gcc/match.pd
===
--- gcc/match.pd(revision 226110)
+++ gcc/match.pd(working copy)
@@ -1754,7 +1754,28 @@ (define_operator_list CBRT BUILT_IN_CBRT
  (simplify
   (cmp (convert?@3 (bit_xor @0 INTEGER_CST@1)) INTEGER_CST@2)
   (if (tree_nop_conversion_p (TREE_TYPE (@3), TREE_TYPE (@0)))
-   (cmp @0 (bit_xor @1 (convert @2))
+   (cmp @0 (bit_xor @1 (convert @2)
+   
+ /* If this is an equality comparison of the address of two non-weak,
+unaliased symbols neither of which are extern (since we do not
+have access to attributes for externs), then we know the result.  */
+ (simplify
+  (cmp (convert? addr@0) (convert? addr@1))
+  (if (decl_in_symtab_p (TREE_OPERAND (@0, 0))
+   && decl_in_symtab_p (TREE_OPERAND (@1, 0)))
+   (with
+{
+  int equal = symtab_node::get_create (TREE_OPERAND (@0, 0))
+   ->equal_address_to (symtab_node::get_create (TREE_OPERAND (@1, 0)));
+}
+(if (equal != 2)
+ { constant_boolean_node (equal ? cmp == EQ_EXPR : cmp != EQ_EXPR, type); 
}
+
+ (simplify
+  (cmp (convert? addr@0) integer_zerop)
+  (if (tree_single_nonzero_warnv_p (@0, NULL))
+   { constant_boolean_node (cmp == NE_E

Re: [PATCH] fix in-tree-binutils builds

2015-07-23 Thread H.J. Lu
On Fri, Jul 17, 2015 at 7:43 AM, H.J. Lu  wrote:
> On Wed, Jul 15, 2015 at 9:47 AM, Mike Stump  wrote:
>> On Jul 15, 2015, at 9:07 AM, H.J. Lu  wrote:
>>> On Wed, Jul 15, 2015 at 1:03 AM, Jan Beulich  wrote:

 - $gcc_cv_as_gas_srcdir/configure.in \
 + $gcc_cv_as_gas_srcdir/configure.[ai][cn] \
  $gcc_cv_as_gas_srcdir/Makefile.in ; do
   gcc_cv_gas_version=`sed -n -e 's/^[[ 
 ]]*VERSION=[[^0-9A-Za-z_]]*\([[0-9]]*\.[[0-9]]*.*\)/VERSION=\1/p' < $f`
>>>
>>> How portable is [ai][cn]?
>>
>> Should be portable enough.
>
> Are there any objections to this patch?

Can we check in this patch?  I tested on Linux/x86.
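
On the portability question quoted above: as a rough illustration only (not part of the patch), POSIX fnmatch() can be used to see which names a bracket pattern like configure.[ai][cn] accepts. The helper name matches_configure_glob is made up for this sketch, and it assumes the shell in use follows the usual POSIX pattern-matching rules for bracket expressions.

```c
#include <assert.h>
#include <fnmatch.h>

/* Hypothetical helper: returns 1 if NAME matches the bracket pattern
   used in the patch, 0 otherwise.  fnmatch returns 0 on a match.  */
static int
matches_configure_glob (const char *name)
{
  return fnmatch ("configure.[ai][cn]", name, 0) == 0;
}
```

Under those rules the pattern accepts configure.ac and configure.in, and also the unintended configure.an and configure.ic, which is harmless as long as no such files exist in the gas source directory.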

Thanks.

-- 
H.J.


Re: [PATCH] Unswitching outer loops.

2015-07-23 Thread Yuri Rumyantsev
Hi Richard,

I checked that both test-cases from 23855 are successfully unswitched
by the proposed patch. I understand that it does not catch deeper loop
nests such as
   for (i=0; i<10; i++)
 for (j=0;j:
> On Fri, Jul 10, 2015 at 12:02 PM, Yuri Rumyantsev  wrote:
>> Hi All,
>>
>> Here is presented simple transformation which tries to hoist out of
>> outer-loop a check on zero trip count for inner-loop. This is very
>> restricted transformation since it accepts outer-loops with very
>> simple cfg, as for example:
>> acc = 0;
>>for (i = 1; i <= m; i++) {
>>   for (j = 0; j < n; j++)
>>  if (l[j] == i) { v[j] = acc; acc++; };
>>   acc <<= 1;
>>}
>>
>> Note that degenerative outer loop (without inner loop) will be
>> completely deleted as dead code.
>> The main goal of this transformation was to convert outer-loop to form
>> accepted by outer-loop vectorization (such test-case is also included
>> to patch).
>>
>> Bootstrap and regression testing did not show any new failures.
>>
>> Is it OK for trunk?
>
> I think this is
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
>
> as well.  It has a patch adding a invariant loop guard hoisting
> phase to loop-header copying.  Yeah, it needs updating to
> trunk again I suppose.  It's always non-stage1 when I come
> back to that patch.
>
> Your patch seems to be very specific and only handles outer
> loops of innermost loops.
>
> Richard.
>
>> ChangeLog:
>> 2015-07-10  Yuri Rumyantsev  
>>
>> * tree-ssa-loop-unswitch.c: Include "tree-cfgcleanup.h" and
>> "gimple-iterator.h", add prototype for tree_unswitch_outer_loop.
>> (tree_ssa_unswitch_loops): Add invoke of tree_unswitch_outer_loop.
>> (tree_unswitch_outer_loop): New function.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.dg/tree-ssa/unswitch-outer-loop-1.c: New test.
>> * gcc.dg/vect/vect-outer-simd-3.c: New test.
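
For readers following the thread, the transformation described in the quoted proposal can be sketched by hand in C. This is only an illustration of the idea, not the code the patch generates: the function names, the fixed buffer size of 8, and the sample data are assumptions, and acc is made unsigned so that the repeated left shift is well defined.

```c
#include <assert.h>
#include <string.h>

/* Sample input for the comparison helper below (hypothetical data).  */
static const int sample_l[8] = { 1, 2, 1, 3, 2, 1, 4, 4 };

/* The loop nest from the proposal, unchanged apart from the types.  */
static void
nest_original (const int *l, unsigned *v, int m, int n)
{
  unsigned acc = 0;
  for (int i = 1; i <= m; i++)
    {
      for (int j = 0; j < n; j++)
        if (l[j] == i) { v[j] = acc; acc++; }
      acc <<= 1;
    }
}

/* Hand-unswitched form: the inner loop's zero-trip test (n > 0) is
   hoisted out of the outer loop.  When n <= 0 the outer loop would only
   update acc, which is never read afterwards, so nothing remains.  */
static void
nest_unswitched (const int *l, unsigned *v, int m, int n)
{
  unsigned acc = 0;
  if (n > 0)
    for (int i = 1; i <= m; i++)
      {
        int j = 0;
        do
          {
            if (l[j] == i) { v[j] = acc; acc++; }
            j++;
          }
        while (j < n);
        acc <<= 1;
      }
}

/* Returns 1 if both forms leave identical contents in v.  N must not
   exceed 8, the size of the scratch buffers.  */
static int
nests_agree (const int *l, int m, int n)
{
  unsigned a[8], b[8];
  memset (a, 0xff, sizeof a);
  memset (b, 0xff, sizeof b);
  nest_original (l, a, m, n);
  nest_unswitched (l, b, m, n);
  return memcmp (a, b, sizeof a) == 0;
}
```

With the guard hoisted, the outer loop contains an inner loop that always runs at least once, which is the shape the proposal says outer-loop vectorization accepts.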


[patch] PR66714 -- Re: Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-23 Thread Cesar Philippidis
On 07/13/2015 06:43 AM, Michael Matz wrote:

> This also hints at other problems (which might not actually occur in the 
> case at hand, but still): the contents of DECL_VALUE_EXPR is the "real" 
> thing containing the value of a decl (i.e. a decl having a value-expr 
> doesn't itself occur in the code anymore), be it a decl itself, or some 
> expression (which might also refer to decls).  Now, in PR 66714 you 
> analyzed that one of those D* was removed from the function, which should 
> have happened only because no code referred to anymore, i.e. D* was also 
> rewritten to some other D'* (if it weren't rewritten and D* was referred 
> to in code, you would have created a miscompilation).  At that point also 
> the DECL_VALUE_EXPRs need to be rewritten to refer to D'*, not to D* 
> anymore.

The attached patch does just that; it teaches
replace_block_vars_by_duplicates to replace the decls inside the
value-exprs with a duplicate too. It's kind of messy though. At the
moment I'm only considering VAR_DECL, PARM_DECL, RESULT_DECL, ADDR_EXPR,
ARRAY_REF, COMPONENT_REF, CONVERT_EXPR, NOP_EXPR, INDIRECT_REF and
MEM_REF. I suspect that I may be missing some, but these are the only
ones that triggered gcc_unreachable during testing.

As Tom mentioned in PR66714, this bug is present on trunk, specifically
in code using omp targets. Is this patch OK for trunk? I bootstrapped
and tested on x86_64-linux-gnu.

Cesar
2015-07-22  Cesar Philippidis  
	Tom de Vries  

	gcc/
	* tree-cfg.c (replace_by_duplicate_decl_value_expr): New function.
	(replace_block_vars_by_duplicates): Ensure that value expr decls
	have been copied using replace_by_duplicate_decl_value_expr.

	libgomp/
	* testsuite/libgomp.c/pr66714.c: New file.
	

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index fde7fbc..15cb122 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -6439,6 +6439,99 @@ replace_by_duplicate_decl (tree *tp, hash_map *vars_map,
   *tp = new_t;
 }
 
+/* Replaces the value expression *TP with a duplicate (belonging to function
+   TO_CONTEXT).  The duplicates are recorded in VARS_MAP.  */
+
+static void
+replace_by_duplicate_decl_value_expr (tree *tp,
+  hash_map *vars_map,
+  tree to_context)
+{
+  tree x = *tp;
+
+  switch (TREE_CODE (*tp))
+{
+case VAR_DECL:
+case PARM_DECL:
+case RESULT_DECL:
+  replace_by_duplicate_decl (tp, vars_map, to_context);
+  break;
+case ADDR_EXPR:
+  {
+	tree expr = TREE_OPERAND (x, 0);
+
+	replace_by_duplicate_decl_value_expr (&expr, vars_map, to_context);
+	*tp = build1 (ADDR_EXPR, TREE_TYPE (x), expr);
+  }
+  break;
+case ARRAY_REF:
+  {
+	tree array = TREE_OPERAND (x, 0);
+	tree index = TREE_OPERAND (x, 1);
+	tree arg2 = TREE_OPERAND (x, 2);
+	tree arg3 = TREE_OPERAND (x, 3);
+
+	replace_by_duplicate_decl (&array, vars_map, to_context);
+	replace_by_duplicate_decl (&index, vars_map, to_context);
+
+	*tp = build4 (ARRAY_REF, TREE_TYPE (x), array, index,
+		  arg2, arg3);
+  }
+  break;
+case COMPONENT_REF:
+  {
+	tree component = TREE_OPERAND (x, 0);
+	tree field = TREE_OPERAND (x, 1);
+	tree ref;
+
+	/* Components may be MEM_REFs.  */
+	replace_by_duplicate_decl_value_expr (&component, vars_map,
+	  to_context);
+	ref = build3 (COMPONENT_REF, TREE_TYPE (field), component,
+		  field, NULL);
+
+	if (TREE_THIS_VOLATILE (x))
+	  TREE_THIS_VOLATILE (ref) |= 1;
+	if (TREE_READONLY (x))
+	  TREE_READONLY (ref) |= 1;
+
+	*tp = ref;
+  }
+  break;
+case CONVERT_EXPR:
+case NOP_EXPR:
+case INDIRECT_REF:
+  {
+	tree expr = TREE_OPERAND (x, 0);
+	tree decl;
+
+	if (CONVERT_EXPR_CODE_P (TREE_CODE (expr)))
+	  decl = TREE_OPERAND (expr, 0);
+	else
+	  decl = expr;
+
+	replace_by_duplicate_decl (&decl, vars_map, to_context);
+
+	if (CONVERT_EXPR_CODE_P (TREE_CODE (expr)))
+	  expr = build1 (TREE_CODE (expr), TREE_TYPE (expr), decl);
+	else
+	  expr = decl;
+
+	*tp = build_simple_mem_ref (expr);
+  }
+  break;
+case MEM_REF:
+  {
+	tree mem = TREE_OPERAND (x, 0);
+
+	replace_by_duplicate_decl_value_expr (&mem, vars_map, to_context);
+	*tp = build_simple_mem_ref (mem);
+  }
+  break;
+default:
+  gcc_unreachable ();
+}
+}
 
 /* Creates an ssa name in TO_CONTEXT equivalent to NAME.
VARS_MAP maps old ssa names and var_decls to the new ones.  */
@@ -6916,7 +7009,11 @@ replace_block_vars_by_duplicates (tree block, hash_map *vars_map,
 	{
 	  if (TREE_CODE (*tp) == VAR_DECL && DECL_HAS_VALUE_EXPR_P (*tp))
 	{
-	  SET_DECL_VALUE_EXPR (t, DECL_VALUE_EXPR (*tp));
+	  tree x = DECL_VALUE_EXPR (*tp);
+
+	  replace_by_duplicate_decl_value_expr (&x, vars_map, to_context);
+
+	  SET_DECL_VALUE_EXPR (t, x);
 	  DECL_HAS_VALUE_EXPR_P (t) = 1;
 	}
 	  DECL_CHAIN (t) = DECL_CHAIN (*tp);
diff --git a/libgomp/testsuite/libgomp.c/pr66714.c b/libgomp/testsuite/libgomp.c/pr66714.c
new file mode 100644
index 000..c9af4a9
--- /dev/null
+++ 

Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread Alexandre Oliva
On Jul 23, 2015, Richard Biener  wrote:

> Hmm, ok.  Does using

>if (currently_expanding_to_rtl)

> work?  I think it's slightly more descriptive.

Yeah.  Thanks, I've tested it with this change, and I'm now checking
this in (full patch first; adjusted incremental patch at the end):

[PR64164] Drop copyrename, use coalescible partition as base when optimizing.

From: Alexandre Oliva 

for  gcc/ChangeLog

PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename.  Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars.  Check register
use and promoted modes to allow coalescing.  Moved to
tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise.  Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases.  Adjust.
* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
* cfgexpand.c (leader_merge): New.
(get_rtl_for_parm_ssa_default_def): New.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
from...
(expand_one_stack_var): ... this.  New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names.  Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
* function.c: Include cfgexpand.h.
(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
(use_register_for_parm_decl): Wrapper for the above to
special-case the result_ptr.
(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
(split_complex_args): Take assign_parm_data_all argument.
Pass it to rtl_for_parm.  Set up rtl and context for split
args.
(assign_parms_augmented_arg_list): Adjust.
(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
multiple locations.  Recognize split complex args.
(assign_parm_adjust_stack_rtl): Add all and parm arguments,
for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
(assign_parm_setup_block): Prefer SSA-assigned location.
(assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
if stack_parm is NULL.
(assign_parm_setup_stack): Prefer SSA-assigned location.
(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
rtl before testing for pointer bounds.  Special-case result_ptr.
(expand_function_start): Maybe reset DECL_RTL of result.
Prefer SSA-assigned location for result and static chain.
Factor out DECL_RESULT and SET_DECL_RTL.
* t

Re: [CHKP, GCC 5] Port a set of stability chkp patches to gcc-5-branch

2015-07-23 Thread H.J. Lu
On Tue, Jul 21, 2015 at 12:04 PM, Jeff Law  wrote:
> On 07/20/2015 04:59 AM, Ilya Enkovich wrote:
>>
>> Ping
>>
>> 2015-06-19 17:10 GMT+03:00 Ilya Enkovich :
>>>
>>> Hi,
>>>
>>> There was a set of stability fixes (mostly different ICEs) for Pointer
>>> Bounds Checker done in GCC 6.  But only few of them were approved to be
>>> ported to GCC 5.  Will it be OK to port other chkp specific stability fixes
>>> to GCC 5?  Here is a list of patches:
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00995.html
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01067.html
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01065.html
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01386.html
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01248.html
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01252.html
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01253.html
>>>   https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01319.html
>
> These are all OK assuming they backport cleanly and bootstrap/regression
> test cleanly on the branch.
>
> jeff

I'd like to backport this MPX testsuite patch:

https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01438.html

OK for GCC 5?

Thanks.

-- 
H.J.


Re: [Bug fortran/52846] [F2008] Support submodules - part 3/3

2015-07-23 Thread Mikael Morin

Hello Paul,

Le 23/07/2015 09:46, Paul Richard Thomas a écrit :

Since all the private entities in a module have to be transmitted to
their descendant submodules, whilst keeping them hidden from normal
use statements, I have chosen to write the module file as usual and
add a second part that contains the private entities. This latter is
only read when processing submodule statements.

why not write them to the/a .smod file?  It was its primary purpose, 
wasn't it?

[Sorry, I followed the submodule stuff very remotely].

It's probably bad practice to put private entities in module files, at 
least now that submodules are supported.  Nevertheless with your change, 
modifications made to private entities produce recompilation cascades, 
even though the public interfaces are left unchanged.


Mikael


Re: Fold some equal to and not equal to patterns in match.pd

2015-07-23 Thread Jeff Law

On 07/21/2015 06:40 PM, Andrew Pinski wrote:

On Tue, Jul 21, 2015 at 12:16 PM, Richard Biener
 wrote:

On July 21, 2015 11:38:31 AM GMT+02:00, Jakub Jelinek  wrote:

On Tue, Jul 21, 2015 at 09:15:31AM +, Hurugalawadi, Naveen wrote:

Please find attached the patch which performs following patterns folding
in match.pd:-

a ==/!= a p+ b to b ==/!= 0.
a << N ==/!= 0 to a&(-1>>N) ==/!= 0.
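
As a sanity check on the second pattern, here is a minimal C sketch (not taken from the patch) comparing the two forms on 32-bit unsigned values. The function names and the sample values are made up for illustration, and -1 is taken as an unsigned all-ones value so that the right shift is a logical one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift form: (a << n) == 0, for 0 <= n < 32.  */
static int
shift_form_is_zero (uint32_t a, unsigned n)
{
  return (uint32_t) (a << n) == 0;
}

/* Mask form: (a & (-1 >> n)) == 0, with -1 as unsigned all-ones.  */
static int
mask_form_is_zero (uint32_t a, unsigned n)
{
  return (a & (UINT32_MAX >> n)) == 0;
}

/* Returns 1 if both forms agree for every shift count 0..31 on a few
   sample values, 0 otherwise.  */
static int
forms_agree (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 2u, 0x80000000u, 0x7fffffffu, 0xffff0000u, 0x0000ffffu,
    UINT32_MAX
  };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (unsigned n = 0; n < 32; n++)
      if (shift_form_is_zero (samples[i], n)
          != mask_form_is_zero (samples[i], n))
        return 0;
  return 1;
}
```

The two forms test the same low 32 - N bits, so the fold is about which form is cheaper or more canonical rather than about correctness.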


Not sure about this second one.  Why do you think it is generally
beneficial?  On many targets, shifts are as fast as bitwise and, and
-1>>N could be e.g. significantly more expensive constant (say require
3 instructions to construct).


And may set flags while shift not? Of course we do a very poor job of 
representing this kind of stuff on gimple.


The biggest question now becomes which way is the canonical form for
gimple and we can decide to optimize it on the RTL level (combine)
instead if it produces better code in those cases.
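Note on AARCH64, producing x&(-1>>N) has no cost difference from a [...]

It seems to me in these kind of cases that selection of the canonical
form should be driven by factors outside of which is better for a
particular target.  ie, which is simpler

Instead we should be looking at the gimple->rtl expansion phase to
expand using a form that is better for a particular target.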


[And yes, I think I've violated this new guiding principle in the past]

Given the (a << n) EQ/NE form, can we reasonably detect this in the 
gimple->expansion phase and emit code as if we had the alternate form 
for targets such as aarch64?


jeff



Re: Fold some equal to and not equal to patterns in match.pd

2015-07-23 Thread Segher Boessenkool
On Thu, Jul 23, 2015 at 10:09:49AM -0600, Jeff Law wrote:
> It seems to me in these kind of cases that selection of the canonical 
> form should be driven by factors outside of which is better for a 
> particular target.  ie, which is simpler

I agree.  But neither form is simpler here, and we need to have both
forms in other contexts, so there is no real benefit to canonicalising.

> Instead we should be looking at the gimple->rtl expansion phase to 
> expand using a form that is better for a particular target.
> 
> [And yes, I think I've violated this new guiding principle in the past]
> 
> Given the (a << n) EQ/NE form, can we reasonably detect this in the 
> gimple->expansion phase and emit code as if we had the alternate form 
> for targets such as aarch64?

In gimple, both forms have the same complexity as far as I see.

In RTL both forms have the same complexity as well.  RTX cost can be
higher for the AND form (or, in theory at least, can be lower for some
targets).

In my opinion expand shouldn't do anything special here, just emit RTL
like the gimple it is fed, and let the target deal with it, or the
generic RTL optimisers.  Or do we have an example where that does not
work or is inconvenient?


Segher


[PATCH] Use unshare_expr more in c-ubsan.c

2015-07-23 Thread Marek Polacek
This sprinkles some more unshare_exprs here and there in the ubsan code.
Maybe we'll have to add some more of them elsewhere, too.

Unfortunately this doesn't fix the ARM -Wmaybe-uninitialized issue yet :(.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-07-23  Marek Polacek  

* c-ubsan.c (ubsan_instrument_division): Use unshare_expr throughout.
(ubsan_instrument_shift): Likewise.

diff --git gcc/c-family/c-ubsan.c gcc/c-family/c-ubsan.c
index 3869511..e0cce84 100644
--- gcc/c-family/c-ubsan.c
+++ gcc/c-family/c-ubsan.c
@@ -75,7 +75,7 @@ ubsan_instrument_division (location_t loc, tree op0, tree op1)
   && !TYPE_UNSIGNED (type))
 {
   tree x;
-  tt = fold_build2 (EQ_EXPR, boolean_type_node, op1,
+  tt = fold_build2 (EQ_EXPR, boolean_type_node, unshare_expr (op1),
build_int_cst (type, -1));
   x = fold_build2 (EQ_EXPR, boolean_type_node, op0,
   TYPE_MIN_VALUE (type));
@@ -103,7 +103,7 @@ ubsan_instrument_division (location_t loc, tree op0, tree 
op1)
  TREE_SIDE_EFFECTS (op0) = 1;
}
 }
-  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);
+  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op0), t);
   if (flag_sanitize_undefined_trap_on_error)
 tt = build_call_expr_loc (loc, builtin_decl_explicit (BUILT_IN_TRAP), 0);
   else
@@ -117,6 +117,8 @@ ubsan_instrument_division (location_t loc, tree op0, tree 
op1)
  ? BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW
  : BUILT_IN_UBSAN_HANDLE_DIVREM_OVERFLOW_ABORT;
   tt = builtin_decl_explicit (bcode);
+  op0 = unshare_expr (op0);
+  op1 = unshare_expr (op1);
   tt = build_call_expr_loc (loc, tt, 3, data, ubsan_encode_value (op0),
ubsan_encode_value (op1));
 }
@@ -152,7 +154,7 @@ ubsan_instrument_shift (location_t loc, enum tree_code code,
   && flag_isoc99)
 {
   tree x = fold_build2 (MINUS_EXPR, op1_utype, uprecm1,
-   fold_convert (op1_utype, op1));
+   fold_convert (op1_utype, unshare_expr (op1)));
   tt = fold_convert_loc (loc, unsigned_type_for (type0), op0);
   tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
   tt = fold_build2 (NE_EXPR, boolean_type_node, tt,
@@ -167,12 +169,13 @@ ubsan_instrument_shift (location_t loc, enum tree_code 
code,
   && (cxx_dialect >= cxx11))
 {
   tree x = fold_build2 (MINUS_EXPR, op1_utype, uprecm1,
-   fold_convert (op1_utype, op1));
-  tt = fold_convert_loc (loc, unsigned_type_for (type0), op0);
+   fold_convert (op1_utype, unshare_expr (op1)));
+  tt = fold_convert_loc (loc, unsigned_type_for (type0),
+unshare_expr (op0));
   tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
   tt = fold_build2 (GT_EXPR, boolean_type_node, tt,
build_int_cst (TREE_TYPE (tt), 1));
-  x = fold_build2 (LT_EXPR, boolean_type_node, op0,
+  x = fold_build2 (LT_EXPR, boolean_type_node, unshare_expr (op0),
   build_int_cst (type0, 0));
   tt = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, x, tt);
 }
@@ -197,7 +200,7 @@ ubsan_instrument_shift (location_t loc, enum tree_code code,
  TREE_SIDE_EFFECTS (op0) = 1;
}
 }
-  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), op0, t);
+  t = fold_build2 (COMPOUND_EXPR, TREE_TYPE (t), unshare_expr (op0), t);
   t = fold_build2 (TRUTH_OR_EXPR, boolean_type_node, t,
   tt ? tt : integer_zero_node);
 
@@ -216,6 +219,8 @@ ubsan_instrument_shift (location_t loc, enum tree_code code,
  ? BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS
  : BUILT_IN_UBSAN_HANDLE_SHIFT_OUT_OF_BOUNDS_ABORT;
   tt = builtin_decl_explicit (bcode);
+  op0 = unshare_expr (op0);
+  op1 = unshare_expr (op1);
   tt = build_call_expr_loc (loc, tt, 3, data, ubsan_encode_value (op0),
ubsan_encode_value (op1));
 }

Marek


Re: [PATCH] Use unshare_expr more in c-ubsan.c

2015-07-23 Thread Jakub Jelinek
On Thu, Jul 23, 2015 at 07:06:40PM +0200, Marek Polacek wrote:
> This sprinkles some more unshare_exprs here and there in the ubsan code.
> Maybe we'll have to add some more of them elsewhere, too.
> 
> Unfortunately this doesn't fix the ARM -Wmaybe-uninitialized issue yet :(.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2015-07-23  Marek Polacek  
> 
>   * c-ubsan.c (ubsan_instrument_division): Use unshare_expr throughout.
>   (ubsan_instrument_shift): Likewise.

Ok, thanks.

Jakub


Re: [patch] PR66714 -- Re: Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-23 Thread Jakub Jelinek
On Thu, Jul 23, 2015 at 08:20:50AM -0700, Cesar Philippidis wrote:
> The attached patch does just that; it teaches
> replace_block_vars_by_duplicates to replace the decls inside the
> value-exprs with a duplicate too. It's kind of messy though. At the
> moment I'm only considering VAR_DECL, PARM_DECL, RESULT_DECL, ADDR_EXPR,
> ARRAY_REF, COMPONENT_REF, CONVERT_EXPR, NOP_EXPR, INDIRECT_REF and
> MEM_REFs. I suspect that I may be missing some, but these are the only
> ones that were triggered gcc_unreachable during testing.

Ugh, that looks ugly, why do we have all the tree walkers?
I'd unshare_expr the value expr first, you really don't want to share
it anyway, and then just walk_tree and find all the decls in there
(with *walk_subtrees on types and perhaps something else too) and for them
replace_by_duplicate_decl (tp, vars_map, to_context);
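
The shape of that suggestion (copy first, then one generic walk that rewrites only the decl leaves) can be sketched on a toy tree. This is deliberately not GCC code: struct node, make_node, copy_tree and remap_decls are hypothetical stand-ins, with copy_tree playing the role of unshare_expr and an integer map playing the role of vars_map.

```c
#include <assert.h>
#include <stdlib.h>

enum kind { DECL, UNARY, BINARY };

/* Toy expression node; decl_id is meaningful for DECL leaves only.  */
struct node
{
  enum kind kind;
  int decl_id;
  struct node *kid[2];
};

static struct node *
make_node (enum kind kind, int decl_id, struct node *k0, struct node *k1)
{
  struct node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->kind = kind;
  n->decl_id = decl_id;
  n->kid[0] = k0;
  n->kid[1] = k1;
  return n;
}

/* Deep copy, so the rewritten expression shares nothing with the
   original.  In this toy model decl leaves are copied as well.  */
static struct node *
copy_tree (const struct node *n)
{
  if (!n)
    return NULL;
  return make_node (n->kind, n->decl_id,
                    copy_tree (n->kid[0]), copy_tree (n->kid[1]));
}

/* One generic walk: visit every node and remap DECL leaves through
   MAP, with no per-expression-code special cases.  */
static void
remap_decls (struct node *n, const int *map)
{
  if (!n)
    return;
  if (n->kind == DECL)
    n->decl_id = map[n->decl_id];
  remap_decls (n->kid[0], map);
  remap_decls (n->kid[1], map);
}
```

The point of the sketch is that new expression codes need no new cases in the walker, which is what the per-code switch in the posted patch cannot offer.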

Jakub


Re: PR c/16351 Extend Wnonnull for returns_nonnull

2015-07-23 Thread Jeff Law

On 07/22/2015 09:29 AM, Manuel López-Ibáñez wrote:

While looking at PR c/16351, I noticed that all tests proposed for
-Wnull-attribute
(https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01715.html) could be
warned from the FEs by simply extending the existing Wnonnull.

Bootstrapped and regression tested on x86_64-linux-gnu.

OK?


gcc/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

 PR c/16351
 * doc/invoke.texi (Wnonnull): Document behavior for
 returns_nonnull.

gcc/testsuite/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

 PR c/16351
 * c-c++-common/wnonnull-1.c: New test.

gcc/cp/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

 PR c/16351
 * typeck.c (check_return_expr): Call maybe_warn_returns_nonnull.


gcc/c-family/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

 PR c/16351
 * c-common.c (maybe_warn_returns_nonnull): New.
 * c-common.h (maybe_warn_returns_nonnull): Declare.

gcc/c/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

 PR c/16351
 * c-typeck.c (c_finish_return): Call maybe_warn_returns_nonnull.

FWIW, we have the usual tension here between warning in the front-end vs
warning after optimization and exploiting dataflow analysis.


Warning in the front-ends like this can generate false positives (such
as a NULL return in an unreachable path) and miss cases where the NULL
has to be propagated into the return by later optimizations.


However, warning in the front-ends will tend to have more stable
diagnostics from release to release.


Warning after optimization/analysis will tend to generate fewer false
positives and can pick up cases where the NULL didn't explicitly appear in
the return statement, but had to be propagated in by the optimizers. Of
course, these warnings are less stable release-to-release and require
the optimizers/analysis phases to be run.


I've always preferred exploiting optimization and analysis to both
reduce false positives and expose the non-trivial cases which show up via
optimizations.  But I also understand that's simply my preference and
that others have a different preference.


I'll tentatively approve for the trunk, but I think we still want
warnings after optimization/analysis.  Which will likely lead to a
scheme like I proposed many years ago for uninitialized variables where we
have multiple modes.  One warns in the front-end like your implementation
does, the other defers the warning until after analysis & optimization.


So please keep 16351 open after committing.

Jeff



Re: [patch] Include reduction on libackend.a and language source files

2015-07-23 Thread David Malcolm
On Wed, 2015-07-22 at 20:50 -0400, Andrew MacLeod wrote:

(snip)

> I then ran it through an ordering tool, (which I will eventually put in 
> contrib).  This tool looks at include files, and puts them in a 
> "standard" order, and removes duplicates that have already been 
> included.. even if it is indirectly via another file.  ie, it will 
> remove obstack.h from the list if bitmap.h has been included for instance.
> removing duplicates was a very delicate balancing act when trying to 
> aggregate them with other includes,
> ie
> #include "option.h"
> <...>
> #include "target.h"
> 
> Since target.h includes tm.h (which includes options.h), we don't need 
> to include options.h.  But there may be header files between the two that 
> require something in options.h, so target.h needs to be moved up to the 
> options.h location.   There are often secondary effects which affect 
> other files, and it turned out to be a frustrating juggling act.  So I 
> wrote the tool to take care of it. The standard "grouping" order of 
> includes turns out to look like :
>"system.h",
>"coretypes.h",
>"backend.h",
>"target.h",
>"rtl.h",
>"tree.h",
>"fortran/gfortran.h",
>"c-family/c-common.h",
>"c/c-tree.h",
>"cp/cp-tree.h",
>"gimple.h",
>"df.h",
>"tm_p.h",
>"gimple-iterators.h",
>"ssa.h",
>"expmed.h",
>"optabs.h",
>"recog.h",
>"gimple-streamer.h"
> 
> This order resolves any issues.  The tool also looks at all the files 
> included by these and avoids including them a second time. Note that it 
> only puts header files into this order which are in the source file, so 
> if backend.h isn't in the file, and function.h is, function.h will occur 
> where backend.h would be.   Any headers included by backend.h will occur 
> in that relative position, and in the order they are included by backend.h.
> 
> I will eventually put all these tools into a directory in contrib. It is 
> simple enough to run this ordering on any source file.

(snip)

If I'm understanding things right, we will have a standardized include
file ordering.

Can this ordering be documented somewhere in the source tree please?
(and be maintained)  I have a number of local (git) feature branches
that I'm hacking on, and every time I merge changes from master into one
of my branches, I seem to spend some time having to figure out which
include files I need.  I tend to just copy and paste #includes until it
compiles again, so having the order be documented would be helpful.
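
To make the request concrete, here is a minimal sketch of what such a documented ordering could look like if written down as code. It is not Andrew's tool: include_rank and order_includes are hypothetical names, only exact duplicates are dropped (the indirect-include analysis is not modelled), and placing unlisted headers after the listed ones in their original relative order is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The grouping order quoted above, copied as given.  */
static const char *const standard_order[] = {
  "system.h", "coretypes.h", "backend.h", "target.h", "rtl.h", "tree.h",
  "fortran/gfortran.h", "c-family/c-common.h", "c/c-tree.h",
  "cp/cp-tree.h", "gimple.h", "df.h", "tm_p.h", "gimple-iterators.h",
  "ssa.h", "expmed.h", "optabs.h", "recog.h", "gimple-streamer.h"
};
#define N_STANDARD (sizeof standard_order / sizeof standard_order[0])

/* Rank of HEADER in the standard order.  Unlisted headers all get the
   same rank, which sorts after every listed header.  */
static size_t
include_rank (const char *header)
{
  for (size_t i = 0; i < N_STANDARD; i++)
    if (strcmp (header, standard_order[i]) == 0)
      return i;
  return N_STANDARD;
}

/* Reorder HEADERS[0..N) in place: drop exact duplicates, then do a
   stable insertion sort by rank.  Returns the new element count.  */
static size_t
order_includes (const char **headers, size_t n)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    {
      int seen = 0;
      for (size_t k = 0; k < kept; k++)
        if (strcmp (headers[k], headers[i]) == 0)
          seen = 1;
      if (!seen)
        headers[kept++] = headers[i];
    }
  for (size_t i = 1; i < kept; i++)
    {
      const char *cur = headers[i];
      size_t j = i;
      while (j > 0 && include_rank (headers[j - 1]) > include_rank (cur))
        {
          headers[j] = headers[j - 1];
          j--;
        }
      headers[j] = cur;
    }
  return kept;
}
```

Even a table like standard_order in a comment or in the internals documentation would cover most of what I am asking for.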

Hope this is constructive
Dave



New Dutch PO file for 'cpplib' (version 5.2.0)

2015-07-23 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Dutch team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/nl.po

(This file, 'cpplib-5.2.0.nl.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Contents of PO file 'cpplib-5.2.0.nl.po'

2015-07-23 Thread Translation Project Robot


cpplib-5.2.0.nl.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.



Re: libstdc++: more __intN tweaks

2015-07-23 Thread DJ Delorie

> > * include/bits/functional_hash.h: Add specializations for __intN
> > types.
> >
> > * include/ext/pb_ds/detail/thin_heap_/thin_heap_.hpp (__gnu_pbds):
> > Guard against values that might exceed size_t's precision.
> 
> Yes, OK - thanks.

Committed.  Thanks!


Re: [PATCH] gcc: fix building w/isl-0.15

2015-07-23 Thread Jeff Law

On 07/21/2015 02:32 PM, Sebastian Pop wrote:


Could somebody with access to sourceware.org upload a tar.bz2 of the required
version of isl from http://isl.gforge.inria.fr/isl-0.15.tar.bz2?

Also, once that is done, I will commit the following patch updating the
documentation.

I've put isl-0.15.tar.bz2 into the ftp directory.

However, I don't think we've changed the required version of ISL for 
gcc.  If we were changing the required version, then I wouldn't have 
bothered to verify that the trunk still works with 0.13 (and the patch 
itself would have been simpler).


What I think we've done is merely allow the use of a newer ISL, possibly 
changing the recommended version, but not the base required version.


Jeff



Re: [C/C++ PATCH] PR c++/66572. Fix Wlogical-op false positive

2015-07-23 Thread Jeff Law

On 07/21/2015 04:56 AM, Marek Polacek wrote:

Ping.

On Tue, Jul 14, 2015 at 06:38:12PM +0200, Marek Polacek wrote:

Ok, in that case I think the easiest would be the following (I hit the same issue
when writing the -Wtautological-compare patch):

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-07-14  Marek Polacek  

PR c++/66572
* pt.c (tsubst_copy_and_build): Add warn_logical_op sentinel.

* g++.dg/warn/Wlogical-op-2.C: New test.

I realize it's C++, but it's simple enough for me.

Ok for the trunk.

jeff



MAINTAINERS: Update my email address

2015-07-23 Thread Bernd Schmidt
I no longer work at Mentor/CodeSourcery. Until I start my new job in 
September I'll be using my personal email address.


At this time I would like to propose to the SC (two members Cc'ed) that 
Nathan Sidwell be appointed nvptx maintainer instead of me, as he is the 
one who will continue to work in that area and is likely to make the 
majority of all changes to it in the near future.



Bernd
commit 706c66066e216ad5b9f0b807a4640aa3ef6990c3
Author: Bernd Schmidt 
Date:   Thu Jul 23 20:02:45 2015 +0200

	* MAINTAINERS: Update my email address.

diff --git a/ChangeLog b/ChangeLog
index c1582b9..1acacbd 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2015-07-23  Bernd Schmidt  
+
+	* MAINTAINERS: Update my email address.
+
 2015-07-14  H.J. Lu  
 
 	Sync with binutils-gdb:
diff --git a/MAINTAINERS b/MAINTAINERS
index 15b5163..bdfd2be 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -31,7 +31,7 @@ Michael Meissner
 Jason Merrill	
 David S. Miller	
 Joseph Myers	
-Bernd Schmidt	
+Bernd Schmidt	
 Ian Lance Taylor
 Jim Wilson	
 
@@ -49,9 +49,9 @@ arm port		Nick Clifton		
 arm port		Richard Earnshaw	
 arm port		Ramana Radhakrishnan	
 avr port		Denis Chertykov		
-bfin port		Bernd Schmidt		
+bfin port		Bernd Schmidt		
 bfin port		Jie Zhang		
-c6x port		Bernd Schmidt		
+c6x port		Bernd Schmidt		
 cris port		Hans-Peter Nilsson	
 epiphany port		Joern Rennecke		
 fr30 port		Nick Clifton		
@@ -89,7 +89,7 @@ nds32 port		Chung-Ju Wu		
 nds32 port		Shiva Chen		
 nios2 port		Chung-Lin Tang		
 nios2 port		Sandra Loosemore	
-nvptx port		Bernd Schmidt		
+nvptx port		Bernd Schmidt		
 pdp11 port		Paul Koning		
 picochip port		Daniel Towner		
 rl78 port		DJ Delorie		
@@ -246,7 +246,7 @@ profile feedback	Jan Hubicka		
 type-safe vectors	Nathan Sidwell		
 alias analysis		Daniel Berlin		
 reload			Ulrich Weigand		
-reload			Bernd Schmidt		
+reload			Bernd Schmidt		
 dfp.c, related		Ben Elliston		
 RTL optimizers		Eric Botcazou		
 instruction combiner	Segher Boessenkool	


Re: [patch] Include reduction on libackend.a and language source files

2015-07-23 Thread Andrew MacLeod

On 07/23/2015 01:44 PM, David Malcolm wrote:

On Wed, 2015-07-22 at 20:50 -0400, Andrew MacLeod wrote:

(snip)


I will eventually put all these tools into a directory in contrib. It is
simple enough to run this ordering on any source file.
(snip)

If I'm understanding things right, we will have a standardized include
file ordering.

Can this ordering be documented somewhere in the source tree please?
(and be maintained)  I have a number of local (git) feature branches
that I'm hacking on, and every time I merge changes from master into one
of my branches, I seem to spend some time having to figure out which
include files I need.  I tend to just copy and paste #includes until it
compiles again, so having the order be documented would be helpful.

In theory,  once I finish up with it and check the tool into contrib, 
all you'll have to do is run


contrib/gcc-order-includes filename

and it'll make the ordering right for whatever headers are in your file

I'm planning to add an option to show the canonical ordering, so
contrib/gcc-order-includes -s
with no filename will list the order it will arrange headers in.

It doesn't deal with every header file, just the primary core ones I 
listed in the original email.  Arbitrary ones can appear in whatever 
order they want.
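
The ordering rule described above can be sketched roughly as follows.  This
is only an illustration, not the contrib tool itself: the names
canonical_order, header_rank and order_includes are made up for the example,
and it assumes (as a simplification) that core headers are moved ahead of
all other headers, which keep their original relative order.  It ignores
indirect includes and implicit dependencies entirely.

```c
#include <stddef.h>
#include <string.h>

/* Core headers, in the canonical order quoted in the original email.  */
static const char *const canonical_order[] = {
  "system.h", "coretypes.h", "backend.h", "target.h", "rtl.h", "tree.h",
  "fortran/gfortran.h", "c-family/c-common.h", "c/c-tree.h", "cp/cp-tree.h",
  "gimple.h", "df.h", "tm_p.h", "gimple-iterators.h", "ssa.h", "expmed.h",
  "optabs.h", "recog.h", "gimple-streamer.h"
};

#define N_CANONICAL (sizeof canonical_order / sizeof canonical_order[0])

/* Return the position of HEADER in the canonical list, or N_CANONICAL
   if it is not one of the core headers (hypothetical helper).  */
static size_t
header_rank (const char *header)
{
  for (size_t i = 0; i < N_CANONICAL; i++)
    if (strcmp (header, canonical_order[i]) == 0)
      return i;
  return N_CANONICAL;
}

/* Stable insertion sort of HEADERS by rank: core headers end up first,
   in canonical order; all other headers follow in their original
   relative order.  */
static void
order_includes (const char **headers, size_t n)
{
  for (size_t i = 1; i < n; i++)
    {
      const char *h = headers[i];
      size_t r = header_rank (h);
      size_t j = i;
      while (j > 0 && header_rank (headers[j - 1]) > r)
        {
          headers[j] = headers[j - 1];
          j--;
        }
      headers[j] = h;
    }
}
```

For instance, a file that includes tree.h, foo.h, system.h, bar.h and
coretypes.h (foo.h and bar.h being placeholder names) would come out as
system.h, coretypes.h, tree.h, foo.h, bar.h under this sketch.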


If we have headers with implicit dependencies, the tool can be updated 
with that dependency and include those headers.  I haven't quite gotten 
that far yet.  When I'm done with the reorg, I'll look at what other 
implicit orderings there are.


This doesn't help you set up an initial include list, other than that you can 
start with this canonical list and add whatever else you need.


Again, in theory, you will be able to run the include removal tool to 
get rid of the ones you don't need... but that might be overkill for a 
single file without some further tweaking.


Andrew


Re: [PATCH] fix in-tree-binutils builds

2015-07-23 Thread Mike Stump
On Jul 23, 2015, at 7:47 AM, H.J. Lu  wrote:
> On Fri, Jul 17, 2015 at 7:43 AM, H.J. Lu  wrote:
>> On Wed, Jul 15, 2015 at 9:47 AM, Mike Stump  wrote:
>>> On Jul 15, 2015, at 9:07 AM, H.J. Lu  wrote:
>>>> On Wed, Jul 15, 2015 at 1:03 AM, Jan Beulich  wrote:
>>>>> 
>>>>> - $gcc_cv_as_gas_srcdir/configure.in \
>>>>> + $gcc_cv_as_gas_srcdir/configure.[ai][cn] \
>>>>> $gcc_cv_as_gas_srcdir/Makefile.in ; do
>>>>>  gcc_cv_gas_version=`sed -n -e 's/^[[ 
>>>>> ]]*VERSION=[[^0-9A-Za-z_]]*\([[0-9]]*\.[[0-9]]*.*\)/VERSION=\1/p' < $f`
>>>> 
>>>> How portable is [ai][cn]?
>>> 
>>> Should be portable enough.
>> 
>> Are there any objections to this patch?
> 
> Can we check in this patch?

I’m not a build system or global reviewer, so I have no more authority to say 
yes than you.

Seems kinda obvious and trivial to me.  The goal of fixing in-tree binutils 
builds seems reasonable to me.
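
As a side note on what the pattern in question accepts: the sketch below
uses POSIX fnmatch() as a stand-in for shell pathname matching, which
follows the same bracket-expression rules.  The helper name
matches_configure_glob is invented for the example.  The pattern matches
both configure.ac and configure.in, but also the unintended names
configure.an and configure.ic, since each bracket is matched independently.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Return true if NAME matches the shell pattern used in the patch
   (hypothetical helper; fnmatch stands in for shell globbing).  */
static bool
matches_configure_glob (const char *name)
{
  return fnmatch ("configure.[ai][cn]", name, 0) == 0;
}
```

In practice the two extra names are unlikely to exist in a gas source
directory, so the looser match should be harmless.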

Re: [PATCH 3/4] Add libgomp plugin for Intel MIC

2015-07-23 Thread Ilya Verbin
On Wed, Jul 08, 2015 at 16:16:44 +0200, Thomas Schwinge wrote:
> > --- /dev/null
> > +++ b/liboffloadmic/plugin/Makefile.am
> > @@ -0,0 +1,123 @@
> > +# Plugin for offload execution on Intel MIC devices.
> 
> > +main_target_image.h: offload_target_main
> > +   @echo -n "const int image_size = " > $@
> > +   @stat -c '%s' $< >> $@
> > +   @echo ";" >> $@
> > +   @echo "struct MainTargetImage {" >> $@
> > +   @echo "  int64_t size;" >> $@
> > +   @echo "  char name[sizeof \"offload_target_main\"];" >> $@
> > +   @echo "  char data[image_size];" >> $@
> > +   @echo "};" >> $@
> > +   @echo "extern \"C\" const MainTargetImage main_target_image = {" >> $@
> > +   @echo "  image_size, \"offload_target_main\"," >> $@
> > +   @cat $< | xxd -include >> $@
> > +   @echo "};" >> $@
> > +
> > +offload_target_main: $(liboffload_dir)/ofldbegin.o offload_target_main.o $(liboffload_dir)/ofldend.o
> > +   $(CXX) $(AM_LDFLAGS) $^ -o $@
> > +
> > +offload_target_main.o: offload_target_main.cpp
> > +   $(CXX) $(AM_CXXFLAGS) $(AM_CPPFLAGS) -c $< -o $@
> 
> Here, I note that the xxd tool is being used, which in my distribution is
> part of the Vim editor's package, which -- as far as I know -- is not
> currently declared as a build dependency of GCC?

We have a patch that checks for xxd availability.  Is it OK for trunk?


2015-07-23  Maxim Blumenthal  

* configure.ac: Add a check for xxd presence when the target is
intelmic or intelmicemul.
* configure: Regenerate.


diff --git a/configure b/configure
index 5ba9489..bd8fed8 100755
--- a/configure
+++ b/configure
[regenerate]

diff --git a/configure.ac b/configure.ac
index 2ff9be0..63eebfc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -494,6 +494,17 @@ else
 fi])
 AC_SUBST(extra_liboffloadmic_configure_flags)
 
+# Check if xxd is present in the system
+# when the target is intelmic or intelmicemul.
+case "${target}" in
+  *-intelmic-* | *-intelmicemul-*)
+AC_CHECK_PROG(xxd_present, xxd, "yes", "no")
+if test "$xxd_present" = "no"; then
+  AC_MSG_ERROR([cannot find xxd])
+fi
+;;
+esac
+
 # Save it here so that, even in case of --enable-libgcj, if the Java
 # front-end isn't enabled, we still get libgcj disabled.
 libgcj_saved=$libgcj


  -- Ilya
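
For context on what the Makefile rule relies on xxd for: when reading from
stdin, `xxd -include` emits only a comma-separated list of hex byte
literals, which the rule splices into the initializer of main_target_image.
A minimal stand-in for that formatting step might look like the sketch
below.  The function name format_bytes_as_initializer is invented for
illustration, and the output layout (no line breaks or indentation) is a
simplification rather than a byte-for-byte match of xxd's output.

```c
#include <stddef.h>
#include <stdio.h>

/* Write the N bytes at DATA into OUT (capacity OUT_SIZE) as a
   comma-separated list of hex literals, e.g. "0x4f, 0x4b".
   Return the number of characters written, or -1 if OUT is too small.
   Hypothetical helper, not part of the patch.  */
static int
format_bytes_as_initializer (const unsigned char *data, size_t n,
                             char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return -1;
  out[0] = '\0';

  for (size_t i = 0; i < n; i++)
    {
      /* Every element after the first is preceded by a separator.  */
      int w = snprintf (out + used, out_size - used, "%s0x%02x",
                        i == 0 ? "" : ", ", (unsigned int) data[i]);
      if (w < 0 || (size_t) w >= out_size - used)
        return -1;
      used += (size_t) w;
    }
  return (int) used;
}
```

Something along these lines could in principle remove the dependency on
xxd, at the cost of building and running a small host program during the
build.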


Re: [PATCH, rtl-opt, i386]: Backport fix for PR 58066, __tls_get_addr is called with misaligned stack on x86-64

2015-07-23 Thread Uros Bizjak
On Mon, Jul 20, 2015 at 5:00 PM, Uros Bizjak  wrote:
> Attached patch backports fixes for PR 58066 to release branches.
>
> 2015-07-XX  Uros Bizjak  
>
> Backport from mainline:
> 2015-07-17  Uros Bizjak  
>
> PR rtl-optimization/66891
> * calls.c (expand_call): Wrap precompute_register_parameters with
> NO_DEFER_POP/OK_DEFER_POP to prevent deferred pops.
>
> 2015-07-15  Uros Bizjak  
>
> PR target/58066
> * config/i386/i386.md (*tls_global_dynamic_64_): Depend on SP_REG.
> (*tls_local_dynamic_base_64_): Ditto.
> (*tls_local_dynamic_base_64_largepic): Ditto.
> (tls_global_dynamic_64_): Update expander pattern.
> (tls_local_dynamic_base_64_): Ditto.
>
> 2015-07-15  Uros Bizjak  
>
> PR rtl-optimization/58066
> * calls.c (expand_call): Precompute register parameters before stack
>
> testsuite/ChangeLog:
>
> 2015-07-XX  Uros Bizjak  
>
> Backport from mainline:
> 2015-07-17  Uros Bizjak  
>
> PR target/66891
> * gcc.target/i386/pr66891.c: New test.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}  for
> all default languages, obj-c++ and go.
>
> OK for branches?

Committed to gcc-5 branch after the patch was approved offline by Jeff.

I will wait a week for possible fallout and then apply the patch to
gcc-49 branch.

Uros.


Re: [PR66726] Factor conversion out of COND_EXPR

2015-07-23 Thread Jeff Law

On 07/15/2015 11:52 PM, Kugan wrote:




diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 932c83a..3058eb5 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c



   return false;
 bb = gimple_bb (stmt);
 if (!single_succ_p (bb))
@@ -2729,9 +2743,8 @@ final_range_test_p (gimple stmt)

 lhs = gimple_assign_lhs (stmt);
 rhs = gimple_assign_rhs1 (stmt);
-  if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
-  || TREE_CODE (rhs) != SSA_NAME
-  || TREE_CODE (TREE_TYPE (rhs)) != BOOLEAN_TYPE)
+  if (TREE_CODE (TREE_TYPE (rhs)) != BOOLEAN_TYPE
+  && TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE)
   return false;

So you're ensuring that one of the two is a boolean...  Note that
previously we ensured that the rhs was a boolean and the lhs was an
integral type (which I believe is true for booleans).

Thus if we had
bool x;
int y;

x = (bool) y;

The old code would have rejected that case.  But I think it gets through
now, right?

I think once that issue is addressed, this will be good for the trunk.



Thanks for the review. How about:

-  if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
-  || TREE_CODE (rhs) != SSA_NAME
-  || TREE_CODE (TREE_TYPE (rhs)) != BOOLEAN_TYPE)
+  if (gimple_assign_cast_p (stmt)
+  && (!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+ || TREE_CODE (rhs) != SSA_NAME
+ || TREE_CODE (TREE_TYPE (rhs)) != BOOLEAN_TYPE))
But then I think you need to verify that, for the  _234 = a_2(D) == 2; 
case, the type of the RHS is a boolean.


ie, each case has requirements for the types.  I don't think they can be 
reasonably unified.  So something like this:


if (gimple_assign_cast_p (stmt)
    && ! (correct types for cast))
  return false;

if (!gimple_assign_cast_p (stmt)
    && ! (correct types for tcc_comparison case))
  return false;


This works because we've already verified that it's either a type 
conversion or a comparison on the RHS.
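
To make the shape of the two separate checks concrete, here is a toy model
in plain C.  It is not GCC code: struct toy_assign and its fields are
invented stand-ins for the GIMPLE queries, and the requirement assumed for
the comparison case (that the comparison result is boolean) is one reading
of the suggestion above rather than something the patch spells out.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of an assignment statement that
   is already known to be either a cast or a comparison.  */
struct toy_assign
{
  bool is_cast;            /* lhs = (type) rhs; otherwise a comparison.  */
  bool lhs_is_integral;    /* Models INTEGRAL_TYPE_P on the lhs type.  */
  bool rhs_is_ssa_name;    /* Models TREE_CODE (rhs) == SSA_NAME.  */
  bool rhs_is_boolean;     /* Models a BOOLEAN_TYPE rhs.  */
  bool result_is_boolean;  /* Comparison case: result type is boolean.  */
};

/* Return true if the types are acceptable, checking each statement
   kind against its own requirements.  */
static bool
toy_types_ok (const struct toy_assign *s)
{
  /* Cast case: same requirements as the code before the patch.  */
  if (s->is_cast
      && !(s->lhs_is_integral && s->rhs_is_ssa_name && s->rhs_is_boolean))
    return false;

  /* Comparison case: assumed requirement, see the note above.  */
  if (!s->is_cast && !s->result_is_boolean)
    return false;

  return true;
}
```

With this split, the x = (bool) y example from earlier in the review, where
y is an int, is rejected again because the rhs of the cast is not boolean.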


Jeff


Go patch committed

2015-07-23 Thread Ian Lance Taylor
This patch from Chris Manghane avoids a compiler crash for some kinds
of invalid code.  This is http://golang.org/issue/11592 .
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 226009)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-5c49a77455f52ba2c7eddb5b831456dc1c67b02f
+b4a932b4a51b612cadcec93a83f94d6ee7d7d190
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 226007)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -3955,6 +3955,8 @@ Unary_expression::do_check_types(Gogo*)
   // Indirecting through a pointer.
   if (type->points_to() == NULL)
this->report_error(_("expected pointer"));
+  if (type->points_to()->is_error())
+   this->set_is_error();
   break;
 
 default:


Re: [PATCH 1/2] Allow REG_EQUAL for ZERO_EXTRACT

2015-07-23 Thread Jeff Law

On 07/19/2015 09:17 PM, Kugan wrote:

I have made a mistake while addressing the review comments for this
patch. Unfortunately, it was not detected in my earlier testing. My
sincere apologies for the mistake.

I basically missed the STRICT_LOW_PART check for the first if-check,
thus the second part (which is the ZERO_EXTRACT part) will never get
executed. The attached patch fixes this along with some minor changes.

Bootstrapped and regression tested on arm-none-linux (Chromebook) and
x86-64-linux-gnu with no new regressions, along with the ARM enablement
patch.

Also did a complete arm qemu regression testing with Christophe's scripts
with no new regressions.
(http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/225987-reg4/report-build-info.html)

Is this OK for trunk?


Thanks,
Kugan

gcc/ChangeLog:

2015-07-20  Kugan Vivekanandarajah  

* cse.c (cse_insn): Fix missing check for STRICT_LOW_PART and minor
clean up.

OK.
jeff



Re: [patch] Include reduction on libackend.a and language source files

2015-07-23 Thread Jeff Law

On 07/22/2015 06:50 PM, Andrew MacLeod wrote:

This is the result of running include reduction on all the files which
make up libbackend.a, as well as most of the language files found in
subdirectories  lto, c ,cp, java, go, fortran, jit, ada. well, some of
ada. :-)

I looked at the output and hand tweaked a few things... removing
comments that no longer made sense and stuff like that.

The reduction tool was run across all the targets to pick up macros that
might be defined.  An include file was not removed if it defined a macro
which was used in a conditional expression (ie #if) either in the source
file, or in other includes files which were determined to be required.
During removal, the header was removed on the host machine, and if
compilation was successful, the tool proceeded to try it on all
targets.  I did a dry run on all 201 functioning targets, and the
results from 1.7 million lines of log file showed that full coverage can
be attained  with 13 targets:
  aarch64-linux-gnu arm-netbsdelf avr-rtems c6x-elf epiphany-elf
hppa2.0-hpux10.1 i686-mingw32crt i686-pc-msdosdjgpp mipsel-elf
powerpc-eabisimaltivec rs6000-ibm-aix5.1.0 sh-superh-elf sparc64-elf
spu-elf

The final run was on the coverage targets, and ran much much faster.

I then ran it through an ordering tool (which I will eventually put in
contrib).  This tool looks at include files, puts them in a
"standard" order, and removes duplicates that have already been
included, even if only indirectly via another file.  I.e., it will
remove obstack.h from the list if bitmap.h has been included, for instance.
Removing duplicates was a very delicate balancing act when trying to
aggregate them with other includes,
i.e.
#include "options.h"
<...>
#include "target.h"

Since target.h includes tm.h (which includes options.h), we don't need
to include options.h.  But there may be header files between the two that
require something in options.h, so target.h needs to be moved up to the
options.h location.   There are often secondary effects which affect
other files, and it turned out to be a frustrating juggling act.  So I
wrote the tool to take care of it. The standard "grouping" order of
includes turns out to look like:
   "system.h",
   "coretypes.h",
   "backend.h",
   "target.h",
   "rtl.h",
   "tree.h",
   "fortran/gfortran.h",
   "c-family/c-common.h",
   "c/c-tree.h",
   "cp/cp-tree.h",
   "gimple.h",
   "df.h",
   "tm_p.h",
   "gimple-iterators.h",
   "ssa.h",
   "expmed.h",
   "optabs.h",
   "recog.h",
   "gimple-streamer.h"

This order resolves any issues.  The tool also looks at all the files
included by these and avoids including them a second time. Note that it
only puts header files into this order which are in the source file, so
if backend.h isn't in the file and function.h is, function.h will occur
where backend.h would be.   Any headers included by backend.h will occur
in that relative position, and in the order they are included by backend.h.

I will eventually put all these tools into a directory in contrib. It is
simple enough to run this ordering on any source file.

This patch is my best effort at a correct include reduction. I
bootstrapped it on both x86_64-unknown-linux-gnu and
powerpc64le-unknown-linux-gnu, with no regressions on either host. It
builds all 201 config-list.mk targets which currently build.  I'm not
aware of any bugs in the tools, and my testing seems to show they work
OK, and everything seems sane.  Some files look dramatically better :-)

ok for trunk?

I will make tweaks to the tool in order to do the config directories
next, and a few remaining files that have not been reduced yet.

Given the mechanical nature and size, I just did spot checks which 
looked fine.


OK for the trunk.

jeff



Go patch committed

2015-07-23 Thread Ian Lance Taylor
This patch from Chris Manghane makes empty interface types for
variables at parse time when there are no methods.  This is normally
cleaned up later, but for sink variables that clean up never happens
and we get an internal compiler error.  This fixes
https://golang.org/issue/11579.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 226122)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-b4a932b4a51b612cadcec93a83f94d6ee7d7d190
+cbb27e8089e11094a20502e53ef69c9c36955f85
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/parse.cc
===
--- gcc/go/gofrontend/parse.cc  (revision 225750)
+++ gcc/go/gofrontend/parse.cc  (working copy)
@@ -1225,7 +1225,11 @@ Parse::interface_type(bool record)
   methods = NULL;
 }
 
-  Interface_type* ret = Type::make_interface_type(methods, location);
+  Interface_type* ret;
+  if (methods == NULL)
+ret = Type::make_empty_interface_type(location);
+  else
+ret = Type::make_interface_type(methods, location);
   if (record)
 this->gogo_->record_interface_type(ret);
   return ret;


Re: [PATCH] Simple optimization for MASK_STORE.

2015-07-23 Thread Jeff Law

On 07/20/2015 09:05 AM, Yuri Rumyantsev wrote:

Hi Jeff!

Thanks for your detailed comments.

You asked:
How are conditionals on vectors usually handled?  You should try to
mimic that and report, with detail, why that's not working.

In the current implementation of the vectorization pass all comparisons are
handled through COND_EXPR, i.e. the vect-pattern pass transforms all
comparisons producing bool values to conditional expressions like a[i]
!= 0  --> a[i]!=0? 1: 0 which the vectorizer transforms to
vect-cond-expr. So we don't have operations with vector operands and
scalar (bool) result.
To implement such operations I introduced a target hook. Could you
propose another solution implementing it?
Is there any rationale given anywhere for the transformation into 
conditional expressions?  ie, is there any reason why we can't have a 
GIMPLE_COND where the expression is a vector condition?


Thanks,

Jeff



Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread Segher Boessenkool
On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
> Yeah.  Thanks, I've tested it with this change, and I'm now checking
> this in (full patch first; adjusted incremental patch at the end):

Unfortunately it causes about a thousand test fails on powerpc64-linux
(at least, it seems to be this patch, I haven't actually checked).

Some representative backtraces:


/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c: In function 'f1':
/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c:13:1: internal compiler error: in expand_one_stack_var_1, at cfgexpand.c:1221
0x1030eae7 expand_one_stack_var_1
/home/segher/src/gcc/gcc/cfgexpand.c:1221
0x10320a23 expand_one_ssa_partition
/home/segher/src/gcc/gcc/cfgexpand.c:1295
0x10320a23 expand_used_vars
/home/segher/src/gcc/gcc/cfgexpand.c:1940
0x10322ea3 execute
/home/segher/src/gcc/gcc/cfgexpand.c:6084


/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr39928-1.c: In function 'vq_nbest':
/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr39928-1.c:6:1: internal compiler error: in emit_move_insn, at expr.c:3552
0x1046f587 emit_move_insn(rtx_def*, rtx_def*)
/home/segher/src/gcc/gcc/expr.c:3551
0x104daa67 assign_parm_setup_reg
/home/segher/src/gcc/gcc/function.c:3322
0x104dd063 assign_parms
/home/segher/src/gcc/gcc/function.c:3766
0x104e0aa7 expand_function_start(tree_node*)
/home/segher/src/gcc/gcc/function.c:5192
0x10322f07 execute
/home/segher/src/gcc/gcc/cfgexpand.c:6105


I have the full testsuite logs if you want them.


Segher


Re: [patch] Include reduction on libackend.a and language source files

2015-07-23 Thread Andrew MacLeod

On 07/23/2015 03:55 PM, Jeff Law wrote:

On 07/22/2015 06:50 PM, Andrew MacLeod wrote:



Given the mechanical nature and size, I just did spot checks which 
looked fine.


OK for the trunk.

jeff

I'll wait until early next week.  Shortly I'll be off-line until then, 
so I couldn't address any fallout if it happens.  That gives a chance for anyone 
else to have a look as well.


Andrew


Re: [PATCH, PR ipa/66566] Fix ICE in early_inliner: internal compiler error: in operator[]

2015-07-23 Thread Jeff Law

On 07/20/2015 06:08 AM, Ilya Enkovich wrote:

Ping

2015-07-13 11:47 GMT+03:00 Ilya Enkovich :

Ping

2015-06-18 12:54 GMT+03:00 Ilya Enkovich :

Hi,

In early_inliner we recompute inline summaries for edges after 
optimize_inline_calls, but check that the summary exists in case new edges appear.  
But then it calls inline_update_overall_summary, which also goes through the edge 
inline summaries, this time with no check, causing a segfault.  This patch 
fixes it.  Bootstrapped and regtested for x86_64-unknown-linux-gnu.  Is it OK 
for trunk and gcc-5-branch?

Thanks,
Ilya
--
gcc/

2015-06-18  Ilya Enkovich  

 PR ipa/66566
 * ipa-inline-analysis.c (estimate_calls_size_and_time): Check
 edge summary is available.

gcc/testsuite/

2015-06-18  Ilya Enkovich  

 PR ipa/66566
 * gcc.target/i386/mpx/pr66566.c: New test.

OK.
jeff


Re: [PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive

2015-07-23 Thread Jeff Law

On 07/13/2015 08:03 AM, Kyrill Tkachov wrote:


2015-07-13  Kyrylo Tkachov 

 * ifcvt.c (struct noce_if_info): Add then_simple, else_simple,
 then_cost, else_cost fields.  Change branch_cost field to unsigned
int.
 (end_ifcvt_sequence): Call set_used_flags on each insn in the
 sequence.
 (noce_simple_bbs): New function.
 (noce_try_move): Bail if basic blocks are not simple.
 (noce_try_store_flag): Likewise.
 (noce_try_store_flag_constants): Likewise.
 (noce_try_addcc): Likewise.
 (noce_try_store_flag_mask): Likewise.
 (noce_try_cmove): Likewise.
 (noce_try_minmax): Likewise.
 (noce_try_abs): Likewise.
 (noce_try_sign_mask): Likewise.
 (noce_try_bitop): Likewise.
 (bbs_ok_for_cmove_arith): New function.
 (noce_emit_all_but_last): Likewise.
 (noce_emit_insn): Likewise.
 (noce_emit_bb): Likewise.
 (noce_try_cmove_arith): Handle non-simple basic blocks.
 (insn_valid_noce_process_p): New function.
 (bb_valid_for_noce_process_p): Likewise.
 (noce_process_if_block): Allow non-simple basic blocks
 where appropriate.


2015-07-13  Kyrylo Tkachov 

 * gcc.dg/ifcvt-1.c: New test.
 * gcc.dg/ifcvt-2.c: Likewise.
 * gcc.dg/ifcvt-3.c: Likewise.






Thanks,
Kyrill



cheers,



ifcvt.patch


commit bc62987a2fa3d9dc3de5a1ed8003a745340255bd
Author: Kyrylo Tkachov
Date:   Wed Jul 8 15:45:04 2015 +0100

 [PATCH][ifcvt] Make non-conditional execution if-conversion more aggressive

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 31849ee..2f0a228 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1584,6 +1630,96 @@ noce_try_cmove (struct noce_if_info *if_info)
return FALSE;
  }

+/* Return true iff the registers that the insns in BB_A set do not
+   get used in BB_B.  */
+
+static bool
+bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
+{
+  rtx_insn *a_insn;
+  FOR_BB_INSNS (bb_a, a_insn)
+{
+  if (!active_insn_p (a_insn))
+   continue;
+
+  rtx sset_a = single_set (a_insn);
+
+  if (!sset_a)
+   return false;
+
+  rtx dest_reg = SET_DEST (sset_a);
+  rtx_insn *b_insn;
+
+  FOR_BB_INSNS (bb_b, b_insn)
+   {
+ if (!active_insn_p (b_insn))
+   continue;
+
+ rtx sset_b = single_set (b_insn);
+
+ if (!sset_b)
+   return false;
+
+ if (reg_referenced_p (dest_reg, sset_b))
+   return false;
+   }
+}
+
+  return true;
+}
Doesn't this have the potential to get very expensive since you end up 
looking at every insn in BB_B for every insn in BB_A?


Wouldn't it be better to walk BB_A, gathering the set of all the 
registers modified, then do a single walk through BB_B testing for uses of 
those registers?
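
The two-pass approach suggested here can be sketched with a toy model.  This
is not RTL code: struct toy_insn, its fields and toy_bbs_ok are invented for
illustration, register numbers are assumed to be below 64 so that a single
64-bit mask can serve as the register set, and memory references are not
modelled at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an insn with a single set.  */
struct toy_insn
{
  unsigned dest;   /* Register number written; assumed to be 0..63.  */
  uint64_t uses;   /* Bit R is set if register R is read.  */
};

/* Return true if no register set in block A is used in block B.
   One walk over A gathers the set of modified registers, then one
   walk over B tests for uses: O(na + nb) rather than O(na * nb).  */
static bool
toy_bbs_ok (const struct toy_insn *a, size_t na,
            const struct toy_insn *b, size_t nb)
{
  uint64_t set_in_a = 0;

  for (size_t i = 0; i < na; i++)
    set_in_a |= UINT64_C (1) << a[i].dest;

  for (size_t i = 0; i < nb; i++)
    if (b[i].uses & set_in_a)
      return false;

  return true;
}
```

In GCC itself a bitmap or register set would presumably play the role of
the 64-bit mask.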


Don't you have to be careful with MEMs in both blocks?




+
+/* Emit copies of all the active instructions in BB except the last.
+   This is a helper for noce_try_cmove_arith.  */
+
+static void
+noce_emit_all_but_last (basic_block bb)
+{
+  rtx_insn *last = last_active_insn (bb, FALSE);
+  rtx_insn *insn;
+  FOR_BB_INSNS (bb, insn)
+{
+  if (insn != last && active_insn_p (insn))
+   {
+ rtx_insn *to_emit = as_a <rtx_insn *> (copy_rtx (insn));
+
+ emit_insn (PATTERN (to_emit));
+   }
+}
+}
Won't this create invalid RTL sharing?  RTL has strict rules about which nodes 
can and cannot be shared, and I'm pretty sure this blindly shares 
everything.


Now we may get away with that because you're going to delete all the 
insns from BB.  But that raises the question: why not just move the insns 
from BB to their new location rather than re-emitting them?






+
+/* Helper for noce_try_cmove_arith.  Emit the pattern TO_EMIT and return
+   the resulting insn or NULL if it's not a valid insn.  */
+
+static rtx_insn *
+noce_emit_insn (rtx to_emit)
+{
+  gcc_assert (to_emit);
+  rtx_insn *insn = emit_insn (to_emit);
+
+  if (recog_memoized (insn) < 0)
+return NULL;
+
+  return insn;
+}
+
+/* Helper for noce_try_cmove_arith.  Emit a copy of the insns up to
+   and including the penultimate one in BB if it is not simple
+   (as indicated by SIMPLE).  Then emit LAST_INSN as the last
+   insn in the block.  The reason for that is that LAST_INSN may
+   have been modified by the preparation in noce_try_cmove_arith.  */
+
+static bool
+noce_emit_bb (rtx last_insn, basic_block bb, bool simple)
+{
+  if (bb && !simple)
+noce_emit_all_but_last (bb);
+
+  if (last_insn && !noce_emit_insn (last_insn))
+return false;
+
+  return true;
+}
Under what conditions can noce_emit_insn fail and what happens to the 
insn stream if it does?  It seems to me like the insn stream would be 
bogus and we should stop compilation.  Which argues that rather than 
returning a bool, we should just assert that the insn is memoized and 
remove the check in noce_emit_bb.


Or is it the case that we're emitting onto a sequence that we can just 
throw away in the event of a failure?



What is the meaning of the return value from noc

[PR lto/66752] Fix missed FSM jump thread

2015-07-23 Thread Jeff Law


This patch gives the FSM jump threading code the opportunity to find 
known values when we have a condition like (x != 0).  Previously it just 
allowed naked SSA_NAMES (which is what appears in a SWITCH_EXPR).  This 
patch allows us to thread a COND_EXPR.


Basically given (x != 0), we just ask the FSM bits to do their thing on 
(x) and the right things just happen.


This exposed two bugs.

First, if the FSM bits thread a gimple conditional, they need to make 
sure to clean up the edge flags on the sole remaining edge.  One of the 
new tests checks this (triggered a checking failure because of the 
lingering edge flags).


Second, when building up the FSM path, we failed to check the partial 
path against nodes we'd already visited and add the nodes from the 
partial path to the list of nodes we had visited.  This can result in 
the same node appearing multiple times in a path (reached from different 
preds in fact).  That triggered another failure which is tested by one 
of the new tests.
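
The second fix amounts to a membership check before a partial path is
accepted.  A minimal sketch of that idea, using a plain array as the visited
set, is shown below.  toy_extend_path and TOY_MAX_BLOCKS are made-up names,
blocks are represented by small integer indices, and the choice to leave the
visited set untouched on failure is an assumption of this sketch, not
something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 64

/* Try to add the N blocks of NEXT_PATH to VISITED.  Return false, and
   leave VISITED as it was, if any block has already been visited
   (including a block repeated within NEXT_PATH itself).  */
static bool
toy_extend_path (bool visited[TOY_MAX_BLOCKS],
                 const unsigned *next_path, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      unsigned bb = next_path[i];

      if (bb >= TOY_MAX_BLOCKS || visited[bb])
        {
          /* Undo the marks made for the earlier elements.  */
          while (i-- > 0)
            visited[next_path[i]] = false;
          return false;
        }
      visited[bb] = true;
    }
  return true;
}
```

Rejecting the extension in this way is what keeps a block from appearing
twice in the resulting path.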


Of course I'm also including a test for pr 66752's missed optimization :-0

Bootstrapped and regression tested on x86_64-unknown-linux.  Installed 
on the trunk.


Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 93647bb..26f93fb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2015-01-12  Jeff Law  
+
+   PR lto/66752
+   * tree-ssa-threadedge.c (simplify_control_stmt_condition): If we are
+   unable to find X NE 0 in the tables, return X as the simplified
+   condition.
+   (fsm_find_control_statement_thread_paths): If nodes in NEXT_PATH are
+   in VISITED_BBS, then return failure.  Else add nodes from NEXT_PATH
+   to VISITED_BBS.
+   * tree-ssa-threadupdate.c (duplicate_thread_path): Fix up edge flags
+   after removing the control flow statement and unnecessary edges.
+
 2015-07-23  Bernd Edlinger  
 
* tree-pass.h (get_current_pass_name): Removed.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 6bbc2e6..d8742d3 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2015-07-23  Jeff Law  
+
+   PR lto/66752
+   * gcc.dg/tree-ssa/pr66752-2.c: New test.
+   * gcc.dg/torture/pr66752-1.c: New test
+   * g++.dg/torture/pr66752-2.C: New test.
+
 2015-07-23  Marek Polacek  
 
PR c++/66572
diff --git a/gcc/testsuite/g++.dg/torture/pr66752-2.C b/gcc/testsuite/g++.dg/torture/pr66752-2.C
new file mode 100644
index 000..96d3fe9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr66752-2.C
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+extern "C"
+{
+  typedef struct _IO_FILE FILE;
+  extern int fprintf (FILE * __restrict __stream,
+ const char *__restrict __format, ...);
+}
+typedef union tree_node *tree;
+class ipa_polymorphic_call_context
+{
+};
+class ipcp_value_base
+{
+};
+template < typename valtype > class ipcp_value:public ipcp_value_base
+{
+public:valtype value;
+  ipcp_value *next;
+};
+
+template < typename valtype > class ipcp_lattice
+{
+public:ipcp_value < valtype > *values;
+  void print (FILE * f, bool dump_sources, bool dump_benefits);
+};
+
+class ipcp_param_lattices
+{
+public:ipcp_lattice < tree > itself;
+  ipcp_lattice < ipa_polymorphic_call_context > ctxlat;
+};
+template < typename valtype > void ipcp_lattice < valtype >::print (FILE * f,
+   bool
+   dump_sources,
+   bool
+   dump_benefits)
+{
+  ipcp_value < valtype > *val;
+  bool prev = false;
+  for (val = values; val; val = val->next)
+{
+  if (dump_benefits && prev)
+   fprintf (f, "   ");
+  else if (!dump_benefits && prev)
+   fprintf (f, ", ");
+  else
+   prev = true;
+  if (dump_sources)
+   fprintf (f, "]");
+  if (dump_benefits)
+   fprintf (f, "shit");
+}
+}
+
+void
+print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
+{
+  struct ipcp_param_lattices *plats;
+  plats->ctxlat.print (f, dump_sources, dump_benefits);
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr66752-1.c b/gcc/testsuite/gcc.dg/torture/pr66752-1.c
new file mode 100644
index 000..a742555
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr66752-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+
+typedef unsigned int size_t;
+struct fde_vector
+{
+  size_t count;
+  const struct dwarf_fde *array[];
+};
+struct object;
+typedef struct dwarf_fde fde;
+typedef int (*fde_compare_t) (struct object *, const fde *, const fde *);
+void
+fde_merge (struct object *ob, fde_compare_t fde_compare,
+  struct fde_vector *v1, struct fde_vector *v2)
+{
+  size_t i1, i2;
+  const fde *fde2;
+  do
+{
+  i2--;
+  while (i1 > 0 && fde_compare (ob, v1->array[i1 - 1], fde2) > 0)
+ 

Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread H.J. Lu
On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
 wrote:
> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>> this in (full patch first; adjusted incremental patch at the end):
>
> Unfortunately it causes about a thousand test fails on powerpc64-linux
> (at least, it seems to be this patch, I haven't actually checked).
>

It also caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978

-- 
H.J.


Re: [PATCH, MIPS] Scheduling for M51xx core family

2015-07-23 Thread Richard Sandiford
Sorry for the slow reply, been away.

Matthew Fortune  writes:
> Richard Sandiford  writes:
>> Robert Suchanek  writes:
>> > @@ -771,7 +771,8 @@ struct mips_cpu_info {
>> >
>> >  /* Infer a -mnan=2008 setting from a -mips argument.  */
>> >  #define MIPS_ISA_NAN2008_SPEC \
>> > -  "%{mnan*:;mips32r6|mips64r6:-mnan=2008}"
>> > +  "%{mnan*:;mips32r6|mips64r6:-mnan=2008;march=m51*: \
>> > +   %{!msoft-float:-mnan=2008}}"
>> 
>> Did you need this, or was it for completeness?  MIPS_ISA_NAN2008_SPEC
>> should only be used after MIPS_ISA_LEVEL_SPEC, so I would have expected
>> the mips32r6|mips64r6: case to fire for -march=m51* too, ahead of the
>> new case.
>
> The m5100 is a MIPS32R5 but with a NAN2008 FPU which is why there is the
> special case. The soft-float case is to try and limit the number of
> multilib variants required so we stick to nan legacy for softfloat by
> default.

Ah, OK.  In that case please ignore me :-)

Thanks,
Richard


[PATCH, committed] jit: supply MULTILIB_DEFAULTS as arguments when invoking driver

2015-07-23 Thread David Malcolm
32-bit and 64-bit multilib peer builds of libgccjit.so could share
a driver binary.  When such a libgccjit invokes the driver (e.g. to
convert .s to .so) it needs to pass in options to the shared driver
to get the appropriate assembler/linker options.

The simplest way to do this is for libgccjit to supply any arguments
found in MULTILIB_DEFAULTS when invoking the driver.

Tested on x86_64 Fedora 20, running "make check-jit"; jit.sum on trunk
continues to have 8494 passes after the patch.

The options are visible in the jit logfiles: after the patch I see:

JIT: entering: void gcc::jit::playback::context::convert_to_dso(const char*)
JIT:  entering: void gcc::jit::playback::context::invoke_driver(const char*, const char*, const char*, timevar_id_t, bool, bool)
JIT:   entering: void gcc::jit::playback::context::add_multilib_driver_arguments(vec*)
JIT:   exiting: void gcc::jit::playback::context::add_multilib_driver_arguments(vec*)
JIT:   argv[0]: x86_64-unknown-linux-gnu-gcc-6.0.0
JIT:   argv[1]: -m64
JIT:   argv[2]: -shared
JIT:   argv[3]: /tmp/libgccjit-WH0Mxy/fake.s
JIT:   argv[4]: -o
JIT:   argv[5]: /tmp/libgccjit-WH0Mxy/fake.so
JIT:   argv[6]: -fno-use-linker-plugin
JIT:   argv[7]: (null)
JIT:  exiting: void gcc::jit::playback::context::invoke_driver(const char*, const char*, const char*, timevar_id_t, bool, bool)
JIT: exiting: void gcc::jit::playback::context::convert_to_dso(const char*)

(note the "argv[1]: -m64" line)

Committed to trunk as r226126.
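
For readers who want the idea without reading the diff, here is a rough, self-contained sketch in plain C. It is not the code in jit-playback.c: the names owned_argv, owned_argv_add, build_driver_argv and sketch_multilib_defaults are made up for illustration, and it assumes that the multilib defaults are stored without their leading dash and that "m64" is the only one, which matches the log above but is target-specific in reality. It shows the two points from the description: the vector owns copies of its strings (with the NULL terminator special-cased), and the multilib options go at the front of the arguments.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a target's MULTILIB_DEFAULTS.  The real
   macro is target-specific; "m64" is assumed here to match the log.  */
static const char *const sketch_multilib_defaults[] = { "m64" };

/* NULL-terminated argument vector that owns copies of its strings.  */
struct owned_argv
{
  char **argv;
  size_t len;
  size_t cap;
};

/* Append a heap copy of PREFIX followed by ARG.  A NULL ARG appends the
   terminator instead of being copied.  Returns 0 on success, -1 on
   allocation failure.  */
int
owned_argv_add (struct owned_argv *v, const char *prefix, const char *arg)
{
  if (v->len == v->cap)
    {
      size_t new_cap = v->cap ? v->cap * 2 : 8;
      char **grown = realloc (v->argv, new_cap * sizeof *grown);
      if (grown == NULL)
        return -1;
      v->argv = grown;
      v->cap = new_cap;
    }

  char *copy = NULL;
  if (arg != NULL)
    {
      copy = malloc (strlen (prefix) + strlen (arg) + 1);
      if (copy == NULL)
        return -1;
      strcpy (copy, prefix);
      strcat (copy, arg);
    }
  v->argv[v->len++] = copy;
  return 0;
}

/* Release every owned string and the vector itself.  */
void
owned_argv_free (struct owned_argv *v)
{
  for (size_t i = 0; i < v->len; i++)
    free (v->argv[i]);
  free (v->argv);
  v->argv = NULL;
  v->len = v->cap = 0;
}

/* Build "DRIVER <multilib options> -shared INPUT -o OUTPUT", with the
   multilib options placed directly after the driver name.  */
int
build_driver_argv (struct owned_argv *v, const char *driver,
                   const char *input, const char *output)
{
  size_t n = sizeof sketch_multilib_defaults
             / sizeof sketch_multilib_defaults[0];

  if (owned_argv_add (v, "", driver) != 0)
    return -1;
  /* Assumption: each default is stored without its dash, so "m64"
     becomes "-m64".  */
  for (size_t i = 0; i < n; i++)
    if (owned_argv_add (v, "-", sketch_multilib_defaults[i]) != 0)
      return -1;
  if (owned_argv_add (v, "", "-shared") != 0
      || owned_argv_add (v, "", input) != 0
      || owned_argv_add (v, "", "-o") != 0
      || owned_argv_add (v, "", output) != 0
      || owned_argv_add (v, "", NULL) != 0)
    return -1;
  return 0;
}
```

Calling build_driver_argv on a zero-initialized struct owned_argv with "gcc", "fake.s" and "fake.so" would produce a vector whose second element is "-m64"; the real driver invocation also passes further options such as -fno-use-linker-plugin, which the sketch leaves out.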

gcc/jit/ChangeLog:
* jit-playback.c (invoke_driver): Convert local "argvec"
to an auto_argvec, so that it owns copies of the strings,
rather than borrows them, updating ADD_ARG to use xstrdup
and special-casing the NULL terminator to avoid
xstrdup (NULL).  Call add_multilib_driver_arguments at the front
of the arguments.
(MULTILIB_DEFAULTS): Provide a default definition.
(multilib_defaults_raw): New constant array.
(gcc::jit::playback::context::add_multilib_driver_arguments): New
method.
* jit-playback.h
(gcc::jit::playback::context::add_multilib_driver_arguments): New
method.
* docs/internals/test-hello-world.exe.log.txt: Update.
* docs/_build/texinfo/libgccjit.texi: Regenerate.
---
 .../docs/internals/test-hello-world.exe.log.txt| 26 +++
 gcc/jit/jit-playback.c | 38 --
 gcc/jit/jit-playback.h |  3 ++
 3 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/gcc/jit/docs/internals/test-hello-world.exe.log.txt b/gcc/jit/docs/internals/test-hello-world.exe.log.txt
index 5cb3aef..d82038b 100644
--- a/gcc/jit/docs/internals/test-hello-world.exe.log.txt
+++ b/gcc/jit/docs/internals/test-hello-world.exe.log.txt
@@ -1,4 +1,4 @@
-JIT: libgccjit (GCC) version 5.0.0 20150123 (experimental) (x86_64-unknown-linux-gnu)
+JIT: libgccjit (GCC) version 6.0.0 20150723 (experimental) (x86_64-unknown-linux-gnu)
 JIT:   compiled by GNU C version 4.8.3 20140911 (Red Hat 4.8.3-7), GMP version 5.1.2, MPFR version 3.1.2, MPC version 1.0.1
 JIT: entering: gcc_jit_context_set_str_option
 JIT:  GCC_JIT_STR_OPTION_PROGNAME: "./test-hello-world.c.exe"
@@ -64,6 +64,7 @@ JIT:   GCC_JIT_BOOL_OPTION_DUMP_SUMMARY: false
 JIT:   GCC_JIT_BOOL_OPTION_DUMP_EVERYTHING: false
 JIT:   GCC_JIT_BOOL_OPTION_SELFCHECK_GC: true
 JIT:   GCC_JIT_BOOL_OPTION_KEEP_INTERMEDIATES: false
+JIT:   gcc_jit_context_set_bool_allow_unreachable_blocks: false
 JIT:   entering: void gcc::jit::recording::context::validate()
 JIT:   exiting: void gcc::jit::recording::context::validate()
 JIT:   entering: gcc::jit::playback::context::context(gcc::jit::recording::context*)
@@ -115,12 +116,6 @@ JIT:  exiting: void gcc::jit::playback::function::postprocess()
 JIT:  entering: void gcc::jit::playback::function::postprocess()
 JIT:  exiting: void gcc::jit::playback::function::postprocess()
 JIT: exiting: void gcc::jit::playback::context::replay()
-JIT: entering: void jit_langhook_write_globals()
-JIT:  entering: void gcc::jit::playback::context::write_global_decls_1()
-JIT:  exiting: void gcc::jit::playback::context::write_global_decls_1()
-JIT:  entering: void gcc::jit::playback::context::write_global_decls_2()
-JIT:  exiting: void gcc::jit::playback::context::write_global_decls_2()
-JIT: exiting: void jit_langhook_write_globals()
 JIT:exiting: toplev::main
 JIT:entering: void gcc::jit::playback::context::extract_any_requested_dumps(vec*)
 JIT:exiting: void gcc::jit::playback::context::extract_any_requested_dumps(vec*)
@@ -129,13 +124,16 @@ JIT:exiting: toplev::finalize
 JIT:entering: virtual void gcc::jit::playback::compile_to_memory::postprocess(const char*)
 JIT: entering: void gcc::jit::playback::context::convert_to_dso(const 
c

Ping: [fr30] Fix indirect_jump pattern

2015-07-23 Thread Richard Sandiford
Ping

Richard Sandiford  writes:
> The pattern was accepting a nonimmediate_operand, using the C condition
> to weed out certain types of memory, but was then using an "r" constraint
> to force a register.  This patch makes the predicate match the constraint
> and removes the C condition.
>
> Tested by building fr30-elf and using:
>
> int
> foo (int i)
> {
>   __typeof(&&a) foo[] = { &&a, &&a, &&b, &&c };
>
>  restart:
>   goto *foo[i];
>
>  a:
>   return 1;
>
>  b:
>   i += 1;
>   goto restart;
>
>  c:
>   return 2;
> }
>
> to trigger an indirect jump (checked via -dp).  OK to install?
>
> Thanks,
> Richard
>
>
> gcc/
>   * config/fr30/fr30.md (indirect_jump): Use pmode_register_operand
>   instead of nonimmediate_operand.  Remove C condition.
>
> Index: gcc/config/fr30/fr30.md
> ===
> --- gcc/config/fr30/fr30.md   2015-06-22 14:02:15.165532334 +0100
> +++ gcc/config/fr30/fr30.md   2015-07-13 19:31:50.552692732 +0100
> @@ -1146,8 +1146,8 @@ (define_insn "jump"
>  
>  ;; Indirect jump through a register
>  (define_insn "indirect_jump"
> -  [(set (pc) (match_operand:SI 0 "nonimmediate_operand" "r"))]
> -  "GET_CODE (operands[0]) != MEM || GET_CODE (XEXP (operands[0], 0)) != PLUS"
> +  [(set (pc) (match_operand 0 "pmode_register_operand" "r"))]
> +  ""
>"jmp%#\\t@%0"
>[(set_attr "delay_type" "delayed")]
>  )


[PATCH, i386]: Rewrite sysv_va_list_type_node and ms_va_list_type_node initialization.

2015-07-23 Thread Uros Bizjak
Hello!

This patch rewrites convoluted initialization of
sysv_va_list_type_node and ms_va_list_type_node globals.

2015-07-23  Uros Bizjak  

* config/i386/i386.c (ix86_build_builtin_va_list_64): Rename
from ix86_build_builtin_va_list_abi.  Handle only 64bit non-MS_ABI
targets here.
(ix86_build_builtin_va_list): Rewrite sysv_va_list_type_node and
ms_va_list_type_node initialization.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 226118)
+++ config/i386/i386.c  (working copy)
@@ -8744,18 +8744,11 @@ ix86_return_in_memory (const_tree type, const_tree
 
 /* Create the va_list data type.  */
 
-/* Returns the calling convention specific va_list date type.
-   The argument ABI can be DEFAULT_ABI, MS_ABI, or SYSV_ABI.  */
-
 static tree
-ix86_build_builtin_va_list_abi (enum calling_abi abi)
+ix86_build_builtin_va_list_64 (void)
 {
   tree f_gpr, f_fpr, f_ovf, f_sav, record, type_decl;
 
-  /* For i386 we use plain pointer to argument area.  */
-  if (!TARGET_64BIT || abi == MS_ABI)
-return build_pointer_type (char_type_node);
-
   record = lang_hooks.types.make_type (RECORD_TYPE);
   type_decl = build_decl (BUILTINS_LOCATION,
  TYPE_DECL, get_identifier ("__va_list_tag"), record);
@@ -8800,43 +8793,25 @@ static tree
 static tree
 ix86_build_builtin_va_list (void)
 {
-  tree ret = ix86_build_builtin_va_list_abi (ix86_abi);
-
-  /* Initialize abi specific va_list builtin types.  */
   if (TARGET_64BIT)
 {
-  tree t;
-  if (ix86_abi == MS_ABI)
-{
-  t = ix86_build_builtin_va_list_abi (SYSV_ABI);
-  if (TREE_CODE (t) != RECORD_TYPE)
-t = build_variant_type_copy (t);
-  sysv_va_list_type_node = t;
-}
-  else
-{
-  t = ret;
-  if (TREE_CODE (t) != RECORD_TYPE)
-t = build_variant_type_copy (t);
-  sysv_va_list_type_node = t;
-}
-  if (ix86_abi != MS_ABI)
-{
-  t = ix86_build_builtin_va_list_abi (MS_ABI);
-  if (TREE_CODE (t) != RECORD_TYPE)
-t = build_variant_type_copy (t);
-  ms_va_list_type_node = t;
-}
-  else
-{
-  t = ret;
-  if (TREE_CODE (t) != RECORD_TYPE)
-t = build_variant_type_copy (t);
-  ms_va_list_type_node = t;
-}
+  /* Initialize ABI specific va_list builtin types.  */
+  tree sysv_va_list, ms_va_list;
+
+  sysv_va_list = ix86_build_builtin_va_list_64 ();
+  sysv_va_list_type_node = build_variant_type_copy (sysv_va_list);
+
+  /* For MS_ABI we use plain pointer to argument area.  */
+  ms_va_list = build_pointer_type (char_type_node);
+  ms_va_list_type_node = build_variant_type_copy (ms_va_list);
+
+  return (ix86_abi == MS_ABI) ? ms_va_list : sysv_va_list;
 }
-
-  return ret;
+  else
+{
+  /* For i386 we use plain pointer to argument area.  */
+  return build_pointer_type (char_type_node);
+}
 }
 
 /* Worker function for TARGET_SETUP_INCOMING_VARARGS.  */


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread H.J. Lu
On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu  wrote:
> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
>  wrote:
>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>> this in (full patch first; adjusted incremental patch at the end):
>>
>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>> (at least, it seems to be this patch, I haven't actually checked).
>>
>
> It also caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>

and maybe:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

-- 
H.J.


Re: [patch] PR66714 -- Re: Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-23 Thread Cesar Philippidis
On 07/23/2015 08:32 AM, Jakub Jelinek wrote:
> On Thu, Jul 23, 2015 at 08:20:50AM -0700, Cesar Philippidis wrote:
>> The attached patch does just that; it teaches
>> replace_block_vars_by_duplicates to replace the decls inside the
>> value-exprs with a duplicate too. It's kind of messy though. At the
>> moment I'm only considering VAR_DECL, PARM_DECL, RESULT_DECL, ADDR_EXPR,
>> ARRAY_REF, COMPONENT_REF, CONVERT_EXPR, NOP_EXPR, INDIRECT_REF and
>> MEM_REFs. I suspect that I may be missing some, but these are the only
>> ones that were triggered gcc_unreachable during testing.
> 
> Ugh, that looks ugly, why do we have all the tree walkers?
> I'd unshare_expr the value expr first, you really don't want to share
> it anyway, and then just walk_tree and find all the decls in there
> (with *walk_subtrees on types and perhaps something else too) and for them
> replace_by_duplicate_decl (tp, vars_map, to_context);

Something like the attached patch? Why do TREE_TYPEs need special handling?

Is it OK for trunk?

Cesar
2015-07-23  Cesar Philippidis  

	gcc/
	* tree-cfg.c (struct replace_decls_d): New struct.
	(replace_block_vars_by_duplicates_1): New function.
	(replace_block_vars_by_duplicates): Use it to replace the decls
	in the value exprs by duplicates.

	libgomp/
	* testsuite/libgomp.c/pr66714.c: New test.


diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index fde7fbc..900274a 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -70,6 +70,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "omp-low.h"
 #include "tree-cfgcleanup.h"
 #include "wide-int-print.h"
+#include "gimplify.h"
 
 /* This file contains functions for building the Control Flow Graph (CFG)
for a function tree.  */
@@ -108,6 +109,13 @@ struct cfg_stats_d
 
 static struct cfg_stats_d cfg_stats;
 
+/* Data to pass to replace_block_vars_by_duplicates_1.  */
+struct replace_decls_d
+{
+  hash_map *vars_map;
+  tree to_context;
+};
+
 /* Hash table to store last discriminator assigned for each locus.  */
 struct locus_discrim_map
 {
@@ -6897,6 +6905,29 @@ new_label_mapper (tree decl, void *data)
   return m->to;
 }
 
+/* Tree walker to replace the decls used inside value expressions by
+   duplicates.  */
+
+static tree
+replace_block_vars_by_duplicates_1 (tree *tp, int *walk_subtrees, void *data)
+{
+  struct replace_decls_d *rd = (struct replace_decls_d *)data;
+
+  switch (TREE_CODE (*tp))
+{
+case VAR_DECL:
+case PARM_DECL:
+case RESULT_DECL:
+  replace_by_duplicate_decl (tp, rd->vars_map, rd->to_context);
+  *walk_subtrees = 0;
+  break;
+default:
+  break;
+}
+
+  return NULL;
+}
+
 /* Change DECL_CONTEXT of all BLOCK_VARS in block, including
subblocks.  */
 
@@ -6916,7 +6947,11 @@ replace_block_vars_by_duplicates (tree block, hash_map *vars_map,
 	{
 	  if (TREE_CODE (*tp) == VAR_DECL && DECL_HAS_VALUE_EXPR_P (*tp))
 	{
-	  SET_DECL_VALUE_EXPR (t, DECL_VALUE_EXPR (*tp));
+	  tree x = DECL_VALUE_EXPR (*tp);
+	  struct replace_decls_d rd = { vars_map, to_context };
+	  x = unshare_expr (x);
+	  walk_tree (&x, replace_block_vars_by_duplicates_1, &rd, NULL);
+	  SET_DECL_VALUE_EXPR (t, x);
 	  DECL_HAS_VALUE_EXPR_P (t) = 1;
 	}
 	  DECL_CHAIN (t) = DECL_CHAIN (*tp);
diff --git a/libgomp/testsuite/libgomp.c/pr66714.c b/libgomp/testsuite/libgomp.c/pr66714.c
new file mode 100644
index 000..c9af4a9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr66714.c
@@ -0,0 +1,17 @@
+/* { dg-do "compile" } */
+/* { dg-additional-options "--param ggc-min-expand=0" } */
+/* { dg-additional-options "--param ggc-min-heapsize=0" } */
+/* { dg-additional-options "-g" } */
+
+/* Minimized from on target-2.c.  */
+
+void
+fn3 (int x)
+{
+  double b[3 * x];
+  int i;
+#pragma omp target
+#pragma omp parallel for
+  for (i = 0; i < x; i++)
+b[i] += 1;
+}


Re: [patch] PR66714 -- Re: Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-23 Thread Jakub Jelinek
On Thu, Jul 23, 2015 at 03:01:25PM -0700, Cesar Philippidis wrote:
> On 07/23/2015 08:32 AM, Jakub Jelinek wrote:
> > On Thu, Jul 23, 2015 at 08:20:50AM -0700, Cesar Philippidis wrote:
> >> The attached patch does just that; it teaches
> >> replace_block_vars_by_duplicates to replace the decls inside the
> >> value-exprs with a duplicate too. It's kind of messy though. At the
> >> moment I'm only considering VAR_DECL, PARM_DECL, RESULT_DECL, ADDR_EXPR,
> >> ARRAY_REF, COMPONENT_REF, CONVERT_EXPR, NOP_EXPR, INDIRECT_REF and
> >> MEM_REFs. I suspect that I may be missing some, but these are the only
> >> ones that were triggered gcc_unreachable during testing.
> > 
> > Ugh, that looks ugly, why do we have all the tree walkers?
> > I'd unshare_expr the value expr first, you really don't want to share
> > it anyway, and then just walk_tree and find all the decls in there
> > (with *walk_subtrees on types and perhaps something else too) and for them
> > replace_by_duplicate_decl (tp, vars_map, to_context);
> 
> Something like the attached patch? Why do TREE_TYPEs need special handling?

They can have decls in various places like TYPE_SIZE_UNIT, TYPE_SIZE, the
bounds of TYPE_DOMAIN etc. and I believe you generally don't want to replace
those.  Plus you risk infinite recursion then (unless 
walk_tree_without_duplicates).
Most walk_tree callbacks just do something like
  if (IS_TYPE_OR_DECL_P (*tp))
*walk_subtrees = 0;
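
To make the pruning idiom concrete outside of GCC, here is a toy sketch in plain C.  Everything in it (toy_node, toy_walk_tree, toy_replace_decls_r) is hypothetical and only loosely modelled on walk_tree and its callbacks; in particular, the real walker has far more node kinds and visits different operands.  The point is only that the callback clears *walk_subtrees both when it has handled a decl and when it meets a type, so a decl that is reachable only through a type (think of a variable array bound) is left alone.

```c
#include <stddef.h>

/* Toy node kinds, loosely modelled on tree codes (hypothetical).  */
enum toy_code { TOY_EXPR, TOY_DECL, TOY_TYPE };

struct toy_node
{
  enum toy_code code;
  struct toy_node *kids[2];   /* Operands, or size-like fields of a type.  */
  int replaced;               /* Set when the callback "replaces" a decl.  */
};

typedef struct toy_node *(*toy_walk_fn) (struct toy_node **tp,
                                         int *walk_subtrees, void *data);

/* Pre-order walk.  FN may clear *WALK_SUBTREES to skip the children of
   the current node, and may return non-NULL to stop the whole walk.  */
struct toy_node *
toy_walk_tree (struct toy_node **tp, toy_walk_fn fn, void *data)
{
  if (*tp == NULL)
    return NULL;

  int walk_subtrees = 1;
  struct toy_node *result = fn (tp, &walk_subtrees, data);
  if (result != NULL || !walk_subtrees)
    return result;

  for (size_t i = 0; i < 2; i++)
    {
      result = toy_walk_tree (&(*tp)->kids[i], fn, data);
      if (result != NULL)
        return result;
    }
  return NULL;
}

/* Callback: mark decls found in the expression, and do not descend
   into decls or types.  DATA points to a counter of marked decls.  */
struct toy_node *
toy_replace_decls_r (struct toy_node **tp, int *walk_subtrees, void *data)
{
  int *n_replaced = data;

  switch ((*tp)->code)
    {
    case TOY_DECL:
      (*tp)->replaced = 1;
      ++*n_replaced;
      *walk_subtrees = 0;
      break;
    case TOY_TYPE:
      /* Decls hanging off a type are deliberately left untouched.  */
      *walk_subtrees = 0;
      break;
    default:
      break;
    }
  return NULL;
}
```

With an expression whose operands are one decl and one type that itself refers to a second decl, only the first decl ends up marked.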

Jakub


Re: PR c/16351 Extend Wnonnull for returns_nonnull

2015-07-23 Thread Bernhard Reutner-Fischer
On July 23, 2015 7:43:51 PM GMT+02:00, Jeff Law  wrote:
>On 07/22/2015 09:29 AM, Manuel López-Ibáñez wrote:
>> While looking at PR c/16351, I noticed that all tests proposed for
>> -Wnull-attribute
>> (https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01715.html) could be
>> warned from the FEs by simply extending the existing Wnonnull.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu.
>>
>> OK?
>>
>>
>> gcc/ChangeLog:
>>
>> 2015-07-22  Manuel López-Ibáñez  
>>
>>  PR c/16351
>>  * doc/invoke.texi (Wnonnull): Document behavior for
>>  returns_nonnull.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2015-07-22  Manuel López-Ibáñez  
>>
>>  PR c/16351
>>  * c-c++-common/wnonnull-1.c: New test.
>>
>> gcc/cp/ChangeLog:
>>
>> 2015-07-22  Manuel López-Ibáñez  
>>
>>  PR c/16351
>>  * typeck.c (check_return_expr): Call maybe_warn_returns_nonnull.
>>
>>
>> gcc/c-family/ChangeLog:
>>
>> 2015-07-22  Manuel López-Ibáñez  
>>
>>  PR c/16351
>>  * c-common.c (maybe_warn_returns_nonnull): New.
>>  * c-common.h (maybe_warn_returns_nonnull): Declare.
>>
>> gcc/c/ChangeLog:
>>
>> 2015-07-22  Manuel López-Ibáñez  
>>
>>  PR c/16351
>>  * c-typeck.c (c_finish_return): Call maybe_warn_returns_nonnull.
>FWIW, we have the usual tension here between warning in the front-end
>vs warning after optimization and exploiting dataflow analysis.
>
>Warning in the front-ends like this can generate false positives (such
>as a NULL return in an unreachable path) and miss cases where the NULL
>has to be propagated into the return by later optimizations.
>
>However warning in the front-ends will tend to have more stable
>diagnostics from release to release.
>
>Warning after optimization/analysis will tend to generate fewer false
>positives and can pick up cases where the NULL didn't explicitly appear
>in the return statement, but had to be propagated in by the optimizers.
>Of course, these warnings are less stable release-to-release and require
>the optimizers/analysis phases to be run.
>
>I've always preferred exploiting optimization and analysis to both
>reduce false positives and expose the non-trivial cases which show up
>via optimizations.  But I also understand that's simply my preference
>and that others have a different preference.
>
>I'll tentatively approve for the trunk, but I think we still want
>warnings after optimization/analysis.  Which will likely lead to a
>scheme like I proposed many years ago for uninitialized variables where
>we have multiple modes.  One warns in the front-end like your
>implementation does, the other defers the warning until after analysis
>& optimization.
>
>So please keep 16351 open after committing.

-W{no-,}{f,m}e IIRC was proposed before. Won't help https://gcc.gnu.org/PR55035 
though where struct stores just escape too much -- AFAIU -- but still..

Thanks,



Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread David Edelsohn
On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu  wrote:
> On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu  wrote:
>> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
>>  wrote:
>>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>>> this in (full patch first; adjusted incremental patch at the end):
>>>
>>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>>> (at least, it seems to be this patch, I haven't actually checked).
>>>
>>
>> It also caused:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>>
>
> and maybe:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

I request that this patch be reverted (again).

Thanks, David


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread H.J. Lu
On Thu, Jul 23, 2015 at 4:14 PM, David Edelsohn  wrote:
> On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu  wrote:
>> On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu  wrote:
>>> On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
>>>  wrote:
>>>> On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
>>>>> Yeah.  Thanks, I've tested it with this change, and I'm now checking
>>>>> this in (full patch first; adjusted incremental patch at the end):
>>>>
>>>> Unfortunately it causes about a thousand test fails on powerpc64-linux
>>>> (at least, it seems to be this patch, I haven't actually checked).
>>>>
>>>
>>> It also caused:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978
>>>
>>
>> and maybe:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
>
> I request that this patch be reverted (again).

And I request to test any new patches under x32 before checking in.
You can use Ubuntu 14 to test x32.

Thanks.

-- 
H.J.


Re: [RFC, PATCH] nonzero attribute, static array parameter

2015-07-23 Thread Martin Uecker

Hi Marek,

sorry for taking so long to respond.

Fri, 15 May 2015 15:38:43 +0200
Marek Polacek :

> On Sat, May 09, 2015 at 09:42:23AM -0700, Martin Uecker wrote:
> > here is a tentative patch to implement a new attribute nonzero,
> > which is similar to nonnull, but is not a function attribute
> > but a type attribute.
> > 
> > One reason is that nonnull is awkward to use. For this reason,
> > clang allows the use of nonnull in function parameters, but this
> > is incompatible with old and current use of this attribute in gcc
> > (though in a rather obscure use case).
> > See: https://gcc.gnu.org/ml/gcc/2015-05/msg00077.html
>  
> Sorry, I quite fail to see how such an attribute would be useful.

First of all, it makes it much simpler to implement the desired warnings
and ubsan checks when passing NULL as an array parameter declared with 'static'.
It avoids implementing all this argument walking code which is needed
for nonnull. If we do not want a nonzero attribute, could we then
consider an internal attribute (e.g. "non zero")?

> It seems that the nonzero warning can only ever trigger when used
> on function parameters or on a return value, much as the nonnull / 
> returns_nonnull attributes. 

So far. But in the future this could be extended to other
cases where it makes sense.

> The difference is that you can use
> the nonzero attribute on a particular function parameter, but the
> nonnull attribute has this ability as well:
> 
> __attribute__ ((nonnull (1))) void foo (int *, int *);
> void
> bar (void)
> {
>   foo (0, 0);
> }

I know. But this is ugly, cumbersome, and fragile, and some uses
are incompatible with clang. That nonnull is mis-designed
has been pointed out before:

https://gcc.gnu.org/ml/gcc/2006-04/msg00551.html

> Unlike nonnull, nonzero attribute can be attached to a typedef, but
> it doesn't seem to buy you anything you couldn't do with the nonnull /
> returns_nonnull attributes.

It is much more expressive. A nonnull pointer _type_ seems really
useful to me. Many programming languages have something like this.

> The nonzero attribute can appertain even to integer types, not only
> pointer types.  Can you give an example where this would come in handy?
> It doesn't seem too useful to me.

One example:

void assert(__attribute__((nonzero)) int i);

would give a warning if it is already known at compile time
that the expression is zero.

> 
> +void foo1(int x[__attribute__((nonzero))]);
> 
> This looks weird, 

It does look weird. But this is already documented in this way:
https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html#Attribute-Syntax

The use of 'const' and 'static' in an array declarator as
defined by ISO C99 is already weird.

> the placement of the nonzero attribute here suggests
> that the array should have at least zero elements (the same that
> int x[static 0] does), but in fact it checks that the pointer passed
> to foo1 is non-NULL, i.e. something that could be easily achieved
> with the nonnull attribute.

In many cases it would be much nicer to be able to declare a 
nonzero type and use it in interfaces instead of having to
add __attribute__((nonnull(1,2,5,8))) with exactly the right
numbers to a lot of prototypes.
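
To show what I mean with what exists today, here is a small sketch in C.  The function names sum_pair and first_elem are made up for illustration.  The first function uses the existing nonnull function attribute, where the pointer parameters are named by position; the second uses the C99 'static' array declarator, which is the case the patch wants to diagnose.  The proposed nonzero spelling appears only in a comment because it is not part of GCC.

```c
#include <stddef.h>

/* Existing GCC spelling: the attribute sits on the function and names
   the pointer parameters by 1-based position.  Inserting or reordering
   parameters silently invalidates the index list.  */
__attribute__ ((nonnull (1, 3)))
int sum_pair (const int *a, size_t unused, const int *b);

int
sum_pair (const int *a, size_t unused, const int *b)
{
  (void) unused;
  return *a + *b;
}

/* C99 'static' in an array declarator: the caller must pass a pointer
   to at least one element, so a null pointer is not a valid argument.  */
int
first_elem (const int x[static 1])
{
  return x[0];
}

/* Proposed in this thread only, not accepted by GCC:
     void foo1 (int x[__attribute__ ((nonzero))]);  */
```

Note how the index list (1, 3) has to skip the size_t parameter; with a type attribute the information would travel with each parameter's type instead.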

> > The other reason is that a nonzero type attribute is conceptually
> > much simpler and at the same time more general than the existing
> > nonnull and returns_nonnull function attributes (and could replace
> > both), e.g. we can define non-zero types and diagnose all stores
> > of known constant 0 to variables of this type, use this for 
> > computing better value ranges, etc.
>  
> Why would that be useful?  What makes integer 0 special?

0 is special in many ways. In particular it represents false
in an if clause. Also, why artificially restrict it to pointers
when it is trivial to make it work for integers too?


Martin





Re: [PATCH] warn for unsafe calls to __builtin_return_address

2015-07-23 Thread Jeff Law

On 06/11/2015 04:05 PM, Martin Sebor wrote:

> Attached is an updated patch for both GCC and the manual.
>
> The patch implements the suggested warning, -Wbuiltin-address,
> that issues diagnostics for unsafe calls of the builtin address
> functions.  Safe calls are those with arguments 0 or 1 anywhere
> in a program and argument 2 outside of the main function (since
> every function has main as its direct or indirect caller).
>
> Tested on powerpc64le and x86_64 Linux.
>
> Martin
>
> The ChangeLog entries for gcc and testsuite:
>
> 2015-06-11  Martin Sebor  
>
>  * c-family/c.opt (-Wbuiltin-address): New warning option.
>  * doc/invoke.texi (Wbuiltin-address): Document it.
>  * doc/extend.texi (__builtin_frame_address,
>  __builtin_return_address):
>  Clarify possible effects of calling the functions with
>  non-zero arguments and mention -Wbuiltin-address.
>  * builtins.c (expand_builtin_frame_address): Handle
>  -Wbuiltin-address.
>
> 2015-06-11  Martin Sebor  
>
>  * g++.dg/Wbuiltin-address-in-Wall.C: New test.
>  * g++.dg/Wbuiltin-address.C: New test.
>  * g++.dg/Wno-builtin-address.C: New test.
>  * gcc.dg/Wbuiltin-address-in-Wall.c: New test.
>  * gcc.dg/Wbuiltin-address.c: New test.
>  * gcc.dg/Wno-builtin-address.c: New test.
>
> PS A few notes about the changes.
>
> There's the following comment in expand_builtin_frame_address:
>
>    /* Some ports cannot access arbitrary stack frames.  */
>
> just before a block of code where the function can lead to
> an "invalid argument" warning which would cause the newly
> added tests to fail (since the newly added warning wouldn't
> be issued).
>
> I tried to determine what ports these might be so I could add
> conditionals to the tests to prevent false positives there but
> couldn't find any.

You have to start thinking creatively.  For example, consider ports 
without dwarf2 unwinders :-)  Consider ports where the linker inserts 
stubs between caller & callee for various reasons.  Consider cases where 
the next outermost frame is compiled by something other than GCC, or 
perhaps was written in pure assembly.  Or consider the case when we're 
currently in a signal handling frame...

> I wanted to also issue a warning for calls at file scope with
> arguments greater than 1 (just like in main) but couldn't find
> a way to determine that.
>
> I also wanted to make the special treatment of main conditional
> on whether or not -ffreestanding is in effect but flag_hosted
> is not declared in builtins.c and bringing it into scope seemed
> like too much of a change.
>
> I'd be happy to modify the patch and add any of the above if
> someone can suggest a way to do it without disrupting too much
> code.

I don't think this is really an issue with calling with a non-zero 
argument near the top of the stack, but more an issue of the general 
difficulty in unwinding beyond the current frame.


It's relatively easy for the compiler to get the return address of the 
current frame -- after all, the compiler knows everything about the 
current function and can even arrange to avoid things which may make 
finding the current frame's address difficult.


However, once you want to go back one frame, you're in an arbitrary hunk 
of code and digging the prior frame out can be non-trivial.


You can get a sense of the complexities by looking at the GDB code to 
find the saved pc and base of an arbitrary frame.  It's insane, 
especially in a world without dwarf2 unwind records.


So, my suggestion would be to warn for any call with a nonzero value.
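
As a small illustration of the distinction, here is a sketch in C.  The wrapper name current_return_address is made up.  Level 0 only needs information about the function being compiled; any higher level would have to unwind through frames the compiler knows nothing about, so it is mentioned only in a comment and never executed.

```c
#include <stddef.h>

/* Level 0: the return address of the function currently executing.
   noinline keeps this a real call, so it has a return address of its
   own rather than borrowing its caller's.  */
__attribute__ ((noinline))
void *
current_return_address (void)
{
  return __builtin_return_address (0);
}

/* A nonzero level, e.g. __builtin_return_address (1), asks for an
   outer frame.  Whether that works depends on how the callers were
   built, which is why it is the case worth warning about.  */
```

On the hosts I would expect this to be built on, a call to current_return_address yields a non-null pointer into the calling function.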

Jeff


Re: Ping: [fr30] Fix indirect_jump pattern

2015-07-23 Thread Jeff Law

On 07/23/2015 03:28 PM, Richard Sandiford wrote:

Ping

Richard Sandiford  writes:

The pattern was accepting a nonimmediate_operand, using the C condition
to weed out certain types of memory, but was then using an "r" constraint
to force a register.  This patch makes the predicate match the constraint
and removes the C condition.

Tested by building fr30-elf and using:

int
foo (int i)
{
   __typeof(&&a) foo[] = { &&a, &&a, &&b, &&c };

  restart:
   goto *foo[i];

  a:
   return 1;

  b:
   i += 1;
   goto restart;

  c:
   return 2;
}

to trigger an indirect jump (checked via -dp).  OK to install?

Thanks,
Richard


gcc/
* config/fr30/fr30.md (indirect_jump): Use pmode_register_operand
instead of nonimmediate_operand.  Remove C condition.

OK.

Jeff



Re: PR c/16351 Extend Wnonnull for returns_nonnull

2015-07-23 Thread Jeff Law

On 07/23/2015 04:44 PM, Bernhard Reutner-Fischer wrote:

On July 23, 2015 7:43:51 PM GMT+02:00, Jeff Law  wrote:

On 07/22/2015 09:29 AM, Manuel López-Ibáñez wrote:

While looking at PR c/16351, I noticed that all tests proposed for
-Wnull-attribute
(https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01715.html) could be
warned from the FEs by simply extending the existing Wnonnull.

Bootstrapped and regression tested on x86_64-linux-gnu.

OK?


gcc/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

  PR c/16351
  * doc/invoke.texi (Wnonnull): Document behavior for
  returns_nonnull.

gcc/testsuite/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

  PR c/16351
  * c-c++-common/wnonnull-1.c: New test.

gcc/cp/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

  PR c/16351
  * typeck.c (check_return_expr): Call maybe_warn_returns_nonnull.


gcc/c-family/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

  PR c/16351
  * c-common.c (maybe_warn_returns_nonnull): New.
  * c-common.h (maybe_warn_returns_nonnull): Declare.

gcc/c/ChangeLog:

2015-07-22  Manuel López-Ibáñez  

  PR c/16351
  * c-typeck.c (c_finish_return): Call maybe_warn_returns_nonnull.

FWIW, we have the usual tension here between warning in the front-end vs
warning after optimization and exploiting dataflow analysis.

Warning in the front-ends like this can generate false positives (such
as a NULL return in an unreachable path) and miss cases where the NULL
has to be propagated into the return by later optimizations.

However warning in the front-ends will tend to have more stable
diagnostics from release to release.

Warning after optimization/analysis will tend to generate fewer false
positives and can pick up cases where the NULL didn't explicitly appear in
the return statement, but had to be propagated in by the optimizers. Of
course, these warnings are less stable release-to-release and require
the optimizers/analysis phases to be run.
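
A small sketch of the two situations described above, assuming the GCC/Clang returns_nonnull attribute. The function names are hypothetical, and no claim is made about which compiler version diagnoses which case.

```c
#include <stddef.h>

static int storage;

/* Case 1: the NULL appears literally in the return statement, so a
   front-end check can see it without any dataflow analysis.  */
__attribute__((returns_nonnull))
int *direct(int fail)
{
    if (fail)
        return NULL;
    return &storage;
}

/* Case 2: the NULL only reaches the return through a variable, so it
   has to be propagated before a warning could notice it.  */
__attribute__((returns_nonnull))
int *propagated(int fail)
{
    int *p = &storage;
    if (fail)
        p = NULL;
    return p;
}
```

Calling either function with a nonzero argument breaks the attribute's promise, so only the fail == 0 paths are safe to execute.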

I've always preferred exploiting optimization and analysis to both
reduce false positives and expose the non-trivial cases which show up via
optimizations.  But I also understand that's simply my preference and
that others have a different preference.

I'll tentatively approve for the trunk, but I think we still want
warnings after optimization/analysis.  Which will likely lead to a
scheme like I proposed many years ago for uninitialized variables where we
have multiple modes.  One warns in the front-end like your implementation
does, the other defers the warning until after analysis & optimization.

So please keep 16351 open after committing.


-W{no-,}{f,m}e IIRC was proposed before. Won't help https://gcc.gnu.org/PR55035 
though where struct stores just escape too much -- AFAIU -- but still..

Things like uninitialized variable analysis are inherently going to 
cause false positives.  It's just a fact of life.


Looking at the reduced testcase in that PR, I'm pretty sure it's a bogus 
reduction.


We could have N == 0 when we encounter the head of the first loop.  So 
no members of temp[] will be initialized.  We then have the call to 
bar() which could change the value of N to 1.  We then hit the second 
loop and read temp[0] which is uninitialized.  The warning for the 
reduced testcase is correct.
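
The shape being described can be sketched as follows. This is a hypothetical reconstruction from the paragraph above, not the testcase attached to the PR; the names N, temp and bar come from the text, and everything else (bar_changes_n, sum_temp, the array size) is made up for illustration.

```c
int N;              /* loop bound, visible to other code */
int bar_changes_n;  /* hypothetical switch so the sketch is self-contained */

/* Stands in for an opaque call: it may or may not modify N.  */
void bar(void)
{
    if (bar_changes_n)
        N = 1;
}

int sum_temp(void)
{
    int temp[8];    /* sketch assumes N never exceeds 8 */
    int sum = 0;

    for (int i = 0; i < N; i++)   /* with N == 0, nothing is written */
        temp[i] = i;

    bar();                        /* N may become 1 here */

    for (int i = 0; i < N; i++)   /* would then read temp[0] uninitialized */
        sum += temp[i];

    return sum;
}
```

With N == 0 and bar_changes_n set, the second loop reads temp[0] without a prior store, which is the path the warning reports.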


If the original source has the same overall structure (an unbound write 
between the key loops), then the warning is correct.  If the writes can 
be proven to not clobber the loop bounds (such that the two key loops 
iterate over the same number of elements), then we'd have a false 
positive warning (and a failure of jump threading to discover the 
unexecutable path).


Jeff


Re: Fold some equal to and not equal to patterns in match.pd

2015-07-23 Thread Jeff Law

On 07/23/2015 10:33 AM, Segher Boessenkool wrote:

On Thu, Jul 23, 2015 at 10:09:49AM -0600, Jeff Law wrote:

It seems to me in these kinds of cases that selection of the canonical
form should be driven by factors outside of which is better for a
particular target.  i.e., which is simpler


I agree.  But neither form is simpler here, and we need to have both
forms in other contexts, so there is no real benefit to canonicalising.



a << N ==/!= 0

Looks like two operations.  A shift and a comparison against zero 
regardless of whether or not N is constant.



a&(-1>>N) ==/!= 0

For a varying N, this has a shift, a logical AND, and a comparison against zero.

For a constant N obviously the shift collapses to a constant and we're 
left with two operations again.
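
To make the two forms concrete, here is a sketch in C, assuming a 32-bit unsigned operand and 0 <= N < 32, with -1 read as the all-ones unsigned value. The function names are hypothetical.

```c
#include <stdint.h>

/* Shift form: a << N ==/!= 0.  One shift plus one comparison.  */
int shift_form_is_zero(uint32_t a, unsigned n)
{
    return (uint32_t)(a << n) == 0;
}

/* Mask form: a & (-1 >> N) ==/!= 0.  For a varying n this is a shift,
   an AND and a comparison; for a constant n the mask folds away.  */
int mask_form_is_zero(uint32_t a, unsigned n)
{
    return (a & (UINT32_MAX >> n)) == 0;
}
```

Both functions test whether the low 32 - n bits of a are all zero, so they agree for every valid input.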


So for gimple, I'd prefer to see us using the a << N form.

If we need both forms in other contexts, we ought to be looking to 
eliminate that need :-)


If we go to the RTL level, then it's more complex -- it might depend on 
the constant produced by the -1 >> N operation, whether or not the 
target can shift by more than one bit at a time (H8/300 series is 
limited here for example), whether or not one operation sets condition 
codes in a useful way, potentially allowing the comparison to be 
removed, etc etc.  rtx_costs, even with its limitations is probably the 
way to drive selection of form for the RTL optimizers.



Jeff