Re: omp cleanup

2015-10-29 Thread Jakub Jelinek
On Wed, Oct 28, 2015 at 01:26:31PM -0700, Nathan Sidwell wrote:
> looking at the next thing to merge, I stumbled on code in lower_omp_target
> that appears at least confused.
> 
> we have:
>   if (offloaded || data_region)
> {  A }
>   else if (data_region)
> new_body = tgt_body;
>   if (offloaded || data_region)
> { B }
> 
> which can clearly be simplified to:
> 
>if (offloaded || data_region)
>  { A; B; }
> 
> If that's incorrect, is the first '|| data_region' wrong?
> 
> nathan

> 2015-10-28  Nathan Sidwell  
> 
>   * omp-low.c (lower_omp_target): Remove unreachable code & merge
>   ifs.

Your patch is fine.  The reason for the "|| data_region" addition has been
OMP_CLAUSE_USE_DEVICE_PTR clause support for OpenMP 4.5, without which
nothing will be added to new_body sequence, other than tgt_body, so it is
functionally equivalent in that case to the now unreachable code.

Ok for trunk, thanks.

Jakub


[PATCH] Fix PR56956

2015-10-29 Thread Richard Biener

This avoids introducing undefined overflow by not folding
unsigned conditional negation to ABS_EXPR.   IMHO we want a
well-defined ABS_EXPR with unsigned result at some point, but that
also needs target support (or auditing at least if we want to keep
the existing abs expanders).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-10-29  Richard Biener  

PR middle-end/56956
* fold-const.c (fold_cond_expr_with_comparison): Do not fold
unsigned conditonal negation to ABS_EXPR.

* c-c++-common/ubsan/pr56956.c: New testcase.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 229481)
--- gcc/fold-const.c(working copy)
*** fold_cond_expr_with_comparison (location
*** 4993,5000 
case GE_EXPR:
case GT_EXPR:
if (TYPE_UNSIGNED (TREE_TYPE (arg1)))
! arg1 = fold_convert_loc (loc, signed_type_for
!  (TREE_TYPE (arg1)), arg1);
tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
return pedantic_non_lvalue_loc (loc, fold_convert_loc (loc, type, tem));
case UNLE_EXPR:
--- 4973,4979 
case GE_EXPR:
case GT_EXPR:
if (TYPE_UNSIGNED (TREE_TYPE (arg1)))
! break;
tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
return pedantic_non_lvalue_loc (loc, fold_convert_loc (loc, type, tem));
case UNLE_EXPR:
*** fold_cond_expr_with_comparison (location
*** 5004,5011 
case LE_EXPR:
case LT_EXPR:
if (TYPE_UNSIGNED (TREE_TYPE (arg1)))
! arg1 = fold_convert_loc (loc, signed_type_for
!  (TREE_TYPE (arg1)), arg1);
tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
return negate_expr (fold_convert_loc (loc, type, tem));
default:
--- 4983,4989 
case LE_EXPR:
case LT_EXPR:
if (TYPE_UNSIGNED (TREE_TYPE (arg1)))
! break;
tem = fold_build1_loc (loc, ABS_EXPR, TREE_TYPE (arg1), arg1);
return negate_expr (fold_convert_loc (loc, type, tem));
default:
Index: gcc/testsuite/c-c++-common/ubsan/pr56956.c
===
*** gcc/testsuite/c-c++-common/ubsan/pr56956.c  (revision 0)
--- gcc/testsuite/c-c++-common/ubsan/pr56956.c  (working copy)
***
*** 0 
--- 1,15 
+ /* { dg-do run } */
+ /* { dg-options "-fsanitize=undefined -fsanitize-undefined-trap-on-error" } */
+ 
+ unsigned int __attribute__((noinline,noclone))
+ foo (unsigned int x)
+ {
+   return x <= __INT_MAX__ ? x : -x;
+ }
+ 
+ int
+ main ()
+ {
+   volatile unsigned int tem = foo (-__INT_MAX__ - 1);
+   return 0;
+ }


Re: [gomp4] OpenACC first private

2015-10-29 Thread Thomas Schwinge
Hi!

On Mon, 3 Aug 2015 10:30:49 -0400, Nathan Sidwell  wrote:
> I've committed this patch to gomp4.  The existing implementation of 
> firstprivate 
> presumes the existence of memory at the CTA level.  This patch does away with 
> that, treating firstprivate as thread-private variables initialized from the 
> host.

> --- gcc/fortran/openmp.c  (revision 226462)
> +++ gcc/fortran/openmp.c  (working copy)
> @@ -586,22 +586,12 @@ gfc_match_omp_clauses (gfc_omp_clauses *
> &c->lists[OMP_LIST_PRIVATE], true)
>== MATCH_YES)
>   continue;
> -  if (mask & OMP_CLAUSE_FIRSTPRIVATE)
> - {
> -   if (openacc)
> - {
> -   if (gfc_match ("firstprivate ( ") == MATCH_YES
> -   && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -OMP_MAP_GANGLOCAL, false))

Turns out, this has been the last (only?) user of
gfc_match_omp_map_clause to specify false for "allow_sections".  We once
had added the latter; removed on gomp-4_0-branch in r229516:

commit 64fec7e145a784ec1e5844a8296e8a39aeea092d
Author: tschwinge 
Date:   Thu Oct 29 08:21:11 2015 +

Cleanup: gfc_match_omp_map_clause

gcc/fortran/
* openmp.c (gfc_match_omp_map_clause): Remove allow_sections
formal parameter.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229516 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog.gomp | 5 +
 gcc/fortran/openmp.c   | 8 +++-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git gcc/fortran/ChangeLog.gomp gcc/fortran/ChangeLog.gomp
index 2cc161a..7fe3eac 100644
--- gcc/fortran/ChangeLog.gomp
+++ gcc/fortran/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2015-10-29  Thomas Schwinge  
+
+   * openmp.c (gfc_match_omp_map_clause): Remove allow_sections
+   formal parameter.
+
 2015-10-28  Cesar Philippidis  
 
* trans-openmp.c (gfc_filter_oacc_combined_clauses): Don't zero-
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index afcce9a..a2c5105 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -482,12 +482,10 @@ match_oacc_clause_gang (gfc_omp_clauses *cp)
mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
- bool allow_sections = true)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head,
-  allow_sections)
+  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
   == MATCH_YES)
 {
   gfc_omp_namelist *n;
@@ -592,7 +590,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
  && gfc_match_omp_variable_list ("firstprivate (",
  &c->lists[OMP_LIST_FIRSTPRIVATE],
  true)
- == MATCH_YES)
+== MATCH_YES)
continue;
   if ((mask & OMP_CLAUSE_LASTPRIVATE)
  && gfc_match_omp_variable_list ("lastprivate (",


Grüße
 Thomas


signature.asc
Description: PGP signature


Re: [gomp4 00/14] NVPTX: further porting

2015-10-29 Thread Jakub Jelinek
On Wed, Oct 28, 2015 at 08:19:19PM +0300, Alexander Monakov wrote:
> 
> 
> On Wed, 21 Oct 2015, Jakub Jelinek wrote:
> 
> > On Wed, Oct 21, 2015 at 12:16:35PM +0300, Alexander Monakov wrote:
> > > > Of course that doesn't help the thread-limit-2.c testcase.
> > > 
> > > Why not?
> > 
> > Because the compiler can be configured for multiple offloading devices,
> > and PTX might not be the first device.  So, you'd need to have a tcl
> > test whether PTX is enabled at all rather than whether it is the default
> > device.
> 
> My previous response was a bit confused so let me correct myself.
> 
> Checking whether PTX is enabled as an offload target is relatively easy in
> libgomp DejaGNU harness: just inspect $offload_targets_s.  This helps to XFAIL
> the test that would fail at compile time, but such XFAIL'ing is, like you
> said, undesirable because it drops the test for all offload targets.  I'd
> rather provide a dummy 'usleep' under #ifdef __nvptx__.  WDYT?

Such ifdefs aren't really easily possible in OpenMP right now, the
preprocessing is done with the host compiler only, you'd need to arrange for
usleep being defined only in the PTX path and nowhere else.
> 
> On the other hand, checking whether PTX will be the default device when
> running the compiled test seems non-trivial.

The OpenMP standard indeed does not have a function which would return you
an enum on what offloading device it is run on, it would need to be an
extension (so a header different from omp.h and perhaps gomp_* prefixed
function), or we could just use some OpenACC function for that?
What is easily possible to distinguish is offloading vs. host fallback
(omp_is_initial_device ()), and whether it has shared address space or
separate address space (so HSA vs. PTX + XeonPhi + XeonPhi emul), and
several tests already check those two properties.

> I've updated my patches locally in response to reviews to the last series,
> except where shared memory is involved.  Should I resend?

Are there any dependencies in your patch series against stuff still in
gomp-4_0-branch rather than already on the trunk?

Jakub


Re: [PATCH, VECTOR ABI] Add __attribute__((__simd__)) to GCC.

2015-10-29 Thread Jakub Jelinek
On Wed, Oct 28, 2015 at 12:16:04PM +0300, Kirill Yukhin wrote:
> Bootstrapped. Regtested. Is it ok for trunk?
> 
> 
> gcc/
> * omp-low.c (pass_omp_simd_clone::gate): If target allows - call
> without additional conditions.
> * doc/extend.texi (simd): Document new attribute.
> gcc/cp/
> * parser.h (cp_parser): Add simd_attr_present.
> * parser.c (cp_parser_late_return_type_opt): Handle simd_attr_present,
> require comman in __vector__ attribute.
> (cp_parser_gnu_attribute_list): Ditto.
> gcc/c/
> * c-parser.c (c_parser): Add simd_attr_present flag.
> (c_parser_declaration_or_fndef): Call c_parser_declaration_or_fndef
> if simd_attr_present is set.
> (c_finish_omp_declare_simd): Handle simd_attr_present.

Actually, do you plan to eventually add some clauses/operands to the simd
attribute, or is the plan to just say that simd attribute is
#pragma omp declare simd
with no clauses as if -fopenmp-simd has been enabled?
If you don't plan to add any clauses, I wonder whether you really need to
add any parser changes at all, whether this couldn't be all handled in
c-family/c-common.c - handle_simd_attribute, adding simd to the attribute
table in there as a function decl attribute, and simply when processing it
add
tree c = build_tree_list (get_identifier ("omp declare simd"), 
NULL_TREE);
TREE_CHAIN (c) = DECL_ATTRIBUTES (fndecl);
DECL_ATTRIBUTES (fndecl) = c;
(after checking whether the attribute isn't already present and erroring out
if there is "cilk simd function" attribute).
The reason for the (admittedly ugly) parser changes for #pragma omp declare 
simd is
that the clauses on the directive refer to parameters that will be declared
later, so we need to save the tokens of the pragma and then after parsing
the parameter declarations actually parse the clauses.  But, in the simd
attribute case, there are no clauses, there is nothing to parse later.

Sorry for not raising this earlier.

Jakub


Re: [gomp4] fortran cleanups and c/c++ loop parsing backport

2015-10-29 Thread Thomas Schwinge
Hi!

On Wed, 28 Oct 2015 08:30:56 -0700, Cesar Philippidis  
wrote:
> On 10/28/2015 04:00 AM, Thomas Schwinge wrote:
> > On Tue, 27 Oct 2015 11:36:10 -0700, Cesar Philippidis 
> >  wrote:
> >> This patch contains the following:
> >>
> >>   * C front end changes from trunk:
> >> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02528.html
> >>
> >>   * C++ front end changes from trunk:
> >> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02540.html
> >>
> >>   * Proposed fortran cleanups and enhanced error reporting changes:
> >> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02288.html
> > 
> > I suppose the latter is a prerequisite for other Fortran front end
> > changes you've also committed?  Otherwise, why not get that patch into
> > trunk first?  That sould save me from having to deal with potentially
> > more merge conflicts later on...
> 
> It wasn't necessarily a prerequisite for these changes, but I've been
> trying to get that patch into trunk for a while now.

Generally, please get such "widespread" patches (touching a lot of code,
even if trivially) into trunk first.  With every merge from trunk into
gomp-4_0-branch I'll now have to make sure to adjust every new trunk code
to cooperate with your patch.

Jakub, can you please review
?

> Plus, part of those
> cleanups touched declare, which Jim is going to work on soon.

I don't understand how that relates.

Anyway, the patch is now committed on gomp-4_0-branch, and hopefully will
land in trunk soon; let's move on.


> >> --- a/gcc/fortran/trans-openmp.c
> >> +++ b/gcc/fortran/trans-openmp.c

> >> +/* Helper function to filter combined oacc constructs.  ORIG_CLAUSES
> >> +   contains the unfiltered list of clauses.  LOOP_CLAUSES corresponds to
> >> +   the filter list of loop clauses corresponding to the enclosed list.
> >> +   This function is called recursively to handle device_type clauses.  */
> >> +
> >> +static void
> >> +gfc_filter_oacc_combined_clauses (gfc_omp_clauses **orig_clauses,
> >> +gfc_omp_clauses **loop_clauses)

> >> +  (*loop_clauses)->acc_collapse = (*orig_clauses)->acc_collapse;
> >> +  (*orig_clauses)->acc_collapse = false;
> >> +  (*loop_clauses)->collapse = (*orig_clauses)->collapse;
> >> +  /* Don't reset (*orig_clauses)->collapse.  */
> > 
> > Why?  (Extend source code comment?)  The original code (cited just below)
> > did this differently.
> 
> Because that's what gfc_split_omp_clauses does. I'm not sure what that's
> required for gfc_trans_omp_do, but it is. gfc_trans_omp_do appears to be
> operating on two sets of clauses for some non-obvious reason.
> 
> >> @@ -3661,39 +3714,18 @@ gfc_trans_oacc_combined_directive (gfc_code *code)

> >> -  loop_clauses.collapse = construct_clauses.collapse;
> >> -  [...]
> >> -  construct_clauses.collapse = 0;

(The "old" behavior is also what the C/C++ front ends are doing, by the
way, so that would also need to be updated.)

Assuming I'm looking at the correct thing, the collapse clause handling
in gfc_split_omp_clauses says "duplicate collapse" which is more
expressive to me than "don't reset (*orig_clauses)->collapse".  I haven't
researched the requirement for OpenMP, but I don't see how a collapse
clause would be relevant to have on an OpenACC parallel or kernels
construct?  Which detail do I overlook?


> > With...
> > 
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
> > 
> > ... newly added, and...
> > 
> >> --- a/libgomp/testsuite/libgomp.oacc-fortran/combdir-1.f90
> >> +++ /dev/null
> > 
> > ... renamed to...
> > 
> >> --- /dev/null
> >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/combined-directive-1.f90
> > 
> > ... this, plus the unaltered
> > libgomp/testsuite/libgomp.oacc-c-c++-common/combdir-1.c, we now got three
> > variants: "combined-directives", "combined-directive", and "combdir".
> > Please settle on one of them.
> 
> The problem here was I didn't know what compdir stood for

*Comb*ined *dir*ectives, I guessed.

> so I renamed
> it to combined-directive-foo. I guess I didn't keep the name consistent
> when I introduced a new compiler time test. The attached patch adjusts
> the test in libgomp to make it consistent with it's compile time test.

Again, to not make merges needlessly difficult, the preference is to do
such changes on trunk first.  (Which I'll now do in the following, also
taking care to rename
libgomp/testsuite/libgomp.oacc-c-c++-common/combdir-1.c in the same
way...).


> >> --- a/gcc/testsuite/gfortran.dg/gomp/intentin1.f90
> >> +++ b/gcc/testsuite/gfortran.dg/gomp/intentin1.f90
> >> @@ -11,6 +11,6 @@ subroutine foo (x)
> >>  !$omp simd linear (x) ! { dg-error "INTENT.IN. 
> >> POINTER" }
> >>do i = 1, 10
> >>end do
> >> -!$omp single  ! { dg-error "INTENT.IN. 
> >> POINTER" }
> >> -!$omp end single copyprivate (x)
> >> +

Improve filenames for test cases of OpenACC combined directives (was: [gomp4] fortran cleanups and c/c++ loop parsing backport)

2015-10-29 Thread Thomas Schwinge
Hi!

On Thu, 29 Oct 2015 09:56:18 +0100, I wrote:
> On Wed, 28 Oct 2015 08:30:56 -0700, Cesar Philippidis 
>  wrote:
> > On 10/28/2015 04:00 AM, Thomas Schwinge wrote:
> > > On Tue, 27 Oct 2015 11:36:10 -0700, Cesar Philippidis 
> > >  wrote:

> > > With...
> > > 
> > >> --- /dev/null
> > >> +++ b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
> > > 
> > > ... newly added, and...
> > > 
> > >> --- a/libgomp/testsuite/libgomp.oacc-fortran/combdir-1.f90
> > >> +++ /dev/null
> > > 
> > > ... renamed to...
> > > 
> > >> --- /dev/null
> > >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/combined-directive-1.f90
> > > 
> > > ... this, plus the unaltered
> > > libgomp/testsuite/libgomp.oacc-c-c++-common/combdir-1.c, we now got three
> > > variants: "combined-directives", "combined-directive", and "combdir".
> > > Please settle on one of them.
> > 
> > The problem here was I didn't know what compdir stood for
> 
> *Comb*ined *dir*ectives, I guessed.
> 
> > so I renamed
> > it to combined-directive-foo. I guess I didn't keep the name consistent
> > when I introduced a new compiler time test. The attached patch adjusts
> > the test in libgomp to make it consistent with it's compile time test.
> 
> Again, to not make merges needlessly difficult, the preference is to do
> such changes on trunk first.  (Which I'll now do in the following, also
> taking care to rename
> libgomp/testsuite/libgomp.oacc-c-c++-common/combdir-1.c in the same
> way...).

As obvious, committed to trunk in r229518:

commit e8ec2b06072a532b8d7319ed598ac1dad2584fea
Author: tschwinge 
Date:   Thu Oct 29 09:03:40 2015 +

Improve filenames for test cases of OpenACC combined directives

libgomp/
* testsuite/libgomp.oacc-c-c++-common/combdir-1.c: Rename to...
* testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c:
... this.  Add a description of the test at the top of the file.
* testsuite/libgomp.oacc-fortran/combdir-1.f90: Rename file to...
* testsuite/libgomp.oacc-fortran/combined-directives-1.f90:
... this.  Add a description of the test at the top of the file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@229518 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog  | 10 ++
 .../{combdir-1.c => combined-directives-1.c}   |  2 ++
 .../{combdir-1.f90 => combined-directives-1.f90}   |  2 ++
 3 files changed, 14 insertions(+)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index 8f44af0..c78881b 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,3 +1,13 @@
+2015-10-29  Thomas Schwinge  
+   Cesar Philippidis  
+
+   * testsuite/libgomp.oacc-c-c++-common/combdir-1.c: Rename to...
+   * testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c:
+   ... this.  Add a description of the test at the top of the file.
+   * testsuite/libgomp.oacc-fortran/combdir-1.f90: Rename file to...
+   * testsuite/libgomp.oacc-fortran/combined-directives-1.f90:
+   ... this.  Add a description of the test at the top of the file.
+
 2015-10-28  Nathan Sidwell  
 
* testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: New.
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/combdir-1.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
similarity index 93%
rename from libgomp/testsuite/libgomp.oacc-c-c++-common/combdir-1.c
rename to libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
index a7def92..dad6d13 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/combdir-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
@@ -1,3 +1,5 @@
+/* This test exercises combined directives.  */
+
 /* { dg-do run } */
 
 #include 
diff --git libgomp/testsuite/libgomp.oacc-fortran/combdir-1.f90 
libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
similarity index 92%
rename from libgomp/testsuite/libgomp.oacc-fortran/combdir-1.f90
rename to libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
index 0cd8a67..94100b2 100644
--- libgomp/testsuite/libgomp.oacc-fortran/combdir-1.f90
+++ libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
@@ -1,3 +1,5 @@
+! This test exercises combined directives.
+
 ! { dg-do run }
 
 program main


Grüße
 Thomas


signature.asc
Description: PGP signature


[ARM] libgcc: include crtfastmath.o

2015-10-29 Thread Christophe Lyon
Hi,

On arm*-*-eabi*, arm*-*-symbianelf*, arm*-*-rtems* targets,
crtfastmath.o is not part of $extra_parts, which means that it is not
built, even though the make build rule is present via the inclusion of
t-crtfm.

The impact is that the link fails when using -ffast-math.

The attached patch adds crtfastmath.o to $extra_parts, for
arm*-*-eabi*, arm*-*-symbianelf*, arm*-*-rtems* targets.

I tested on arm-none-eabi, and I am not sure if this is correct for
symbian/rtems.

OK? Or should I restrict the change to arm*-*-eabi* ?

Thanks,

Christophe.
2015-10-29  Christophe Lyon  

* config.host (arm*-*-eabi*, arm*-*-symbianelf*, arm*-*-rtems*):
Include crtfastmath.o.

diff --git a/libgcc/config.host b/libgcc/config.host
index 2ee92c1..b63cc36 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -414,6 +414,7 @@ arm*-*-eabi* | arm*-*-symbianelf* | arm*-*-rtems*)
  ;;
esac
tmake_file="$tmake_file t-softfp-sfdf t-softfp-excl arm/t-softfp 
t-softfp"
+   extra_parts="$extra_parts crtfastmath.o"
unwind_header=config/arm/unwind-arm.h
;;
 avr-*-rtems*)


Re: PING: [PATCH] PR target/67215: -fno-plt needs improvements for x86

2015-10-29 Thread Ramana Radhakrishnan


On 29/10/15 01:47, H.J. Lu wrote:
> On Wed, Oct 28, 2015 at 6:21 PM, Bernd Schmidt  wrote:
>> On 10/29/2015 02:14 AM, H.J. Lu wrote:
>>>
>>> On Wed, Oct 28, 2015 at 6:11 PM, Bernd Schmidt 
>>> wrote:

 On 10/29/2015 02:10 AM, H.J. Lu wrote:
>
>
> On Wed, Oct 28, 2015 at 5:23 PM, Jeff Law  wrote:
>>
>>
>>
>> So I'll ask again, why did you commit a patch which you clearly knew
>> did
>> not
>> meet the conditions Bernd set forth for approval?
>
>
>
> I believed that aarch64 backend didn't properly handle -fno-plt,
> which shouldn't block my patch.
>>
>>
>> Actually this is even worse than I thought because it sounds like you're
>> saying you knowingly checked something in while being aware it would break
>> another port.
>>
> 
> Only when -fno-plt was used.

So, that's a target independent feature in GCC ! So, I don't see how that
justifies the commit.


>> 
>> Sometimes It seems that it is the only way to get attention from the
>> community.  BTW, my patch was submitted in August.
>> 

I personally do *not* understand how that is an excuse. 

regards
Ramana


 


Re: [PATCH] Fix PR66313

2015-10-29 Thread Richard Biener
On Tue, 27 Oct 2015, Joseph Myers wrote:

> On Tue, 27 Oct 2015, Richard Biener wrote:
> 
> > When factoring a*b + a*c to (b + c)*a we have to guard against the
> > case of a == 0 as after the factoring b + c might overflow in that
> > case.  Fixed by doing the addition in an unsigned type if required.
> 
> The same applies to a == -1 (consider b and c equal to -(INT_MIN/2), when 
> a*b + a*c is INT_MIN but b+c overflows).  In that case, if you avoid b+c 
> overflowing using an unsigned type, but convert back to signed for the 
> multiplication, you get a spurious overflow from the multiplication 
> instead.

Indeed.  So the following is as good as I can get it.

We still regress gcc.dg/tree-ssa/tailrecursion-6.c with it though.
There we have

int
foo (int a)
{
if (a)
return a * (2 * (foo (a - 1))) + a + 1;
else
return 0;
}

and tailrecursion detection requires the expression to be
in the form (2 * (foo (a - 1)) + 1) * a + 1 but folding
can't see that (2 * (foo (a - 1)) + 1) might not be INT_MIN
when a is -1.  Well, I can't trivially either, maybe its
just because of the recursion or maybe it's because here
we are sure that C in C * A is odd (2 * sth + 1) or
simply because we know that C in (B + C)*A is positive
and non-zero?

But I guess for the isolated view of the
a * (2 * X) + a expression we can't factor it using signed
arithmetic because of the a == 0 case as then the sum
might trivially overflow (of course here a is not zero
because of the guard...)

Bootstrap / regtest in progress on x86_64-unknown-linux-gnu.

More hole-punching appreciated.  Meanwhile I found PR68142...

Richard.

2015-10-27  Richard Biener  

PR middle-end/66313
* fold-const.c (fold_plusminus_mult_expr): If the factored
factor may be zero use a wrapping type for the inner operation.

* c-c++-common/ubsan/pr66313.c: New testcase.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c.orig   2015-10-29 09:27:42.942302445 +0100
+++ gcc/fold-const.c2015-10-29 10:35:42.794378063 +0100
@@ -6916,10 +6916,11 @@ fold_plusminus_mult_expr (location_t loc
 }
   same = NULL_TREE;
 
-  if (operand_equal_p (arg01, arg11, 0))
-same = arg01, alt0 = arg00, alt1 = arg10;
-  else if (operand_equal_p (arg00, arg10, 0))
+  /* Prefer factoring a common non-constant.  */
+  if (operand_equal_p (arg00, arg10, 0))
 same = arg00, alt0 = arg01, alt1 = arg11;
+  else if (operand_equal_p (arg01, arg11, 0))
+same = arg01, alt0 = arg00, alt1 = arg10;
   else if (operand_equal_p (arg00, arg11, 0))
 same = arg00, alt0 = arg01, alt1 = arg10;
   else if (operand_equal_p (arg01, arg10, 0))
@@ -6974,14 +6975,36 @@ fold_plusminus_mult_expr (location_t loc
}
 }
 
-  if (same)
+  if (!same)
+return NULL_TREE;
+
+  if (! INTEGRAL_TYPE_P (type)
+  || TYPE_OVERFLOW_WRAPS (type)
+  /* We are neither factoring zero nor minus one.  */
+  || TREE_CODE (same) == INTEGER_CST)
 return fold_build2_loc (loc, MULT_EXPR, type,
fold_build2_loc (loc, code, type,
 fold_convert_loc (loc, type, alt0),
 fold_convert_loc (loc, type, alt1)),
fold_convert_loc (loc, type, same));
 
-  return NULL_TREE;
+  /* Same may be zero and thus the operation 'code' may overflow.  Likewise
+ same may be minus one and thus the multiplication may overflow.  Perform
+ the operations in an unsigned type.  */
+  tree utype = unsigned_type_for (type);
+  tree tem = fold_build2_loc (loc, code, utype,
+ fold_convert_loc (loc, utype, alt0),
+ fold_convert_loc (loc, utype, alt1));
+  /* If the sum evaluated to a constant that is not -INF the multiplication
+ cannot overflow.  */
+  if (TREE_CODE (tem) == INTEGER_CST
+  && ! wi::eq_p (tem, wi::min_value (TYPE_PRECISION (utype), SIGNED)))
+return fold_build2_loc (loc, MULT_EXPR, type,
+   fold_convert (type, tem), same);
+
+  return fold_convert_loc (loc, type,
+  fold_build2_loc (loc, MULT_EXPR, utype, tem,
+   fold_convert_loc (loc, utype, 
same)));
 }
 
 /* Subroutine of native_encode_expr.  Encode the INTEGER_CST
@@ -7835,6 +7858,7 @@ fold_unary_loc (location_t loc, enum tre
 
 case VIEW_CONVERT_EXPR:
   if (TREE_CODE (op0) == MEM_REF)
+   /* ???  Bogus for aligned types.  */
return fold_build2_loc (loc, MEM_REF, type,
TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1));
 
Index: gcc/testsuite/c-c++-common/ubsan/pr66313.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ gcc/testsuite/c-c++-common/ubsan/pr66313.c  2015-10-29 10:37:11.256355126 
+0100
@@ -0,0 +1,26 @@
+/* 

Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-29 Thread Martin Liška
On 10/28/2015 04:23 PM, Richard Biener wrote:
> On Tue, Oct 27, 2015 at 4:30 PM, Martin Liška  wrote:
>> On 10/27/2015 03:49 PM, Richard Biener wrote:
>>> On Tue, Oct 27, 2015 at 1:36 PM, Martin Liška  wrote:
 On 10/26/2015 02:48 PM, Richard Biener wrote:
> On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
>> On 10/21/2015 04:06 PM, Richard Biener wrote:
>>> On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  wrote:
 On 10/21/2015 11:59 AM, Richard Biener wrote:
> On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  wrote:
>> On 10/20/2015 03:39 PM, Richard Biener wrote:
>>> On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  
>>> wrote:
 Hello.

 As part of upcoming merge of HSA branch, we would like to have 
 possibility to terminate
 pass manager after execution of the HSA generation pass. The HSA 
 back-end is implemented
 as a tree pass that directly emits HSAIL from gimple tree 
 representation. The pass operates
 on clones created by HSA IPA pass and the pass manager should stop 
 execution of further
 RTL passes.

 Suggested patch survives bootstrap and regression tests on 
 x86_64-linux-pc.

 What do you think about it?
>>>
>>> Are you sure it works this way?
>>>
>>> Btw, you will miss executing of all the cleanup passes that will
>>> eventually free memory
>>> associated with the function.  So I'd rather support a
>>> TODO_discard_function which
>>> should basically release the body from the cgraph.
>>
>> Hi.
>>
>> Agree with you that I should execute all TODOs, which can be easily 
>> done.
>> However, if I just try to introduce the suggested TODO and handle it 
>> properly
>> by calling cgraph_node::release_body, then for instance 
>> fn->gimple_df, fn->cfg are
>> released and I hit ICEs on many places.
>>
>> Stopping the pass manager looks necessary, or do I miss something?
>
> "Stopping the pass manager" is necessary after TODO_discard_function, 
> yes.
> But that may be simply done via a has_body () check then?

 Thanks, there's second version of the patch. I'm going to start 
 regression tests.
>>>
>>> As release_body () will free cfun you should pop_cfun () before
>>> calling it (and thus
>>
>> Well, release_function_body calls both push & pop, so I think calling pop
>> before cgraph_node::release_body is not necessary.
>
> (ugh).
>
>> If tried to call pop_cfun before cgraph_node::release_body, I have cfun 
>> still
>> pointing to the same (function *) (which is gcc_freed, but cfun != NULL).
>
> Hmm, I meant to call pop_cfun then after it (unless you want to fix the 
> above,
> none of the freeing functions should techincally need 'cfun', just add
> 'fn' parameters ...).
>
> I expected pop_cfun to eventually set cfun to NULL if it popped the
> "last" cfun.  Why
> doesn't it do that?
>
>>> drop its modification).  Also TODO_discard_functiuon should be only set 
>>> for
>>> local passes thus you should probably add a gcc_assert (cfun).
>>> I'd move its handling earlier, definitely before the ggc_collect, 
>>> eventually
>>> before the pass_fini_dump_file () (do we want a last dump of the
>>> function or not?).
>>
>> Fully agree, moved here.
>>
>>>
>>> @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
>>>  {
>>>gcc_assert (pass->type == GIMPLE_PASS
>>>   || pass->type == RTL_PASS);
>>> +
>>> +
>>> +  if (!gimple_has_body_p (current_function_decl))
>>> +   return;
>>>
>>> too much vertical space.  With popping cfun before releasing the body 
>>> the check
>>> might just become if (!cfun) and
>>
>> As mentioned above, as release body is symmetric (calling push & pop), 
>> the suggested
>> guard will not work.
>
> I suggest to fix it.  If it calls push/pop it should leave with the
> original cfun, popping again
> should make it NULL?
>
>>>
>>> @@ -2409,7 +2426,7 @@ execute_pass_list (function *fn, opt_pass *pass)
>>>  {
>>>push_cfun (fn);
>>>execute_pass_list_1 (pass);
>>> -  if (fn->cfg)
>>> +  if (gimple_has_body_p (current_function_decl) && fn->cfg)
>>>  {
>>>free_dominance_info (CDI_DOMINATORS);
>>>free_dominance_info (CDI_POST_DOMINATORS);
>>>
>>> here you'd need to guard the pop_cfun call on cfun != NULL.  IMHO it's 
>>> better
>>> to not let cfun point to deallocated data.
>>
>> As I've read the c

Re: [ARM] libgcc: include crtfastmath.o

2015-10-29 Thread Ramana Radhakrishnan
On Thu, Oct 29, 2015 at 9:25 AM, Christophe Lyon
 wrote:
> Hi,
>
> On arm*-*-eabi*, arm*-*-symbianelf*, arm*-*-rtems* targets,
> crtfastmath.o is not part of $extra_parts, which means that it is not
> built, even though the make build rule is present via the inclusion of
> t-crtfm.
>
> The impact is that the link fails when using -ffast-math.
>
> The attached patch adds crtfastmath.o to $extra_parts, for
> arm*-*-eabi*, arm*-*-symbianelf*, arm*-*-rtems* targets.
>
> I tested on arm-none-eabi, and I am not sure if this is correct for
> symbian/rtems.
>
> OK? Or should I restrict the change to arm*-*-eabi* ?


I don't see how they are inappropriate on those targets as
crtfastmath.c only set flush to zero in with -ffast-math when the
target architecture has floating point hardware and are only brought
in if the link spec asks for it (which it does only with -ffast-math)

Ok if no regressions.

regards
Ramana

>
> Thanks,
>
> Christophe.


Re: [PATCH] Don't handle CAST_RESTRICT (PR tree-optimization/49279)

2015-10-29 Thread Tom de Vries
[ quote-pasted from 
https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00464.html ]



CAST_RESTRICT based disambiguation unfortunately isn't reliable,
e.g. to store a non-restrict pointer into a restricted field,
we add a non-useless cast to restricted pointer in the gimplifier,
and while we don't consider that field to have a special restrict tag
because it is unsafe to do so, we unfortunately create it for the
CAST_RESTRICT before that and end up with different restrict tags
for the same thing.  See the PR for more details.

This patch turns off CAST_RESTRICT handling for now, in the future
we might try to replace it by explicit CAST_RESTRICT stmts in some form,
but need to solve problems with multiple inlined copies of the same function
with restrict arguments or restrict variables in it and intermixed code from
them (or similarly code from different non-overlapping source blocks).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
4.6 too?

2011-10-06  Jakub Jelinek  

PR tree-optimization/49279
* tree-ssa-structalias.c (find_func_aliases): Don't handle
CAST_RESTRICT.
* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Allow
restrict propagation.
* tree-ssa.c (useless_type_conversion_p): Don't return false
if TYPE_RESTRICT differs.

* gcc.dg/tree-ssa/restrict-4.c: XFAIL.
* gcc.c-torture/execute/pr49279.c: New test.


Hi,

In the patch adding support for CAST_RESTRICT ( 
https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00176.html ) there was 
also a bit:

...
* fold-const.c (fold_unary_loc): Don't optimize
POINTER_PLUS_EXPR casted to TYPE_RESTRICT pointer by
casting the inner pointer if it isn't TYPE_RESTRICT.
...
which is still around. I suppose we can remove this bit as well.

OK for trunk if bootstrap and reg-test succeeds?

Thanks,
- Tom


Allow CAST_RESTRICT folding

2015-10-29  Tom de Vries  

	* fold-const.c (fold_unary_loc): Remove folding inhibition for restrict
	types.
---
 gcc/fold-const.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c4be017..1cd911a 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -7728,7 +7728,6 @@ fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
 	 that this happens when X or Y is NOP_EXPR or Y is INTEGER_CST. */
   if (POINTER_TYPE_P (type)
 	  && TREE_CODE (arg0) == POINTER_PLUS_EXPR
-	  && (!TYPE_RESTRICT (type) || TYPE_RESTRICT (TREE_TYPE (arg0)))
 	  && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
 	  || TREE_CODE (TREE_OPERAND (arg0, 0)) == NOP_EXPR
 	  || TREE_CODE (TREE_OPERAND (arg0, 1)) == NOP_EXPR))
-- 
1.9.1



Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-10-29 Thread Richard Earnshaw
On 23/10/15 13:34, H.J. Lu wrote:
> On Fri, Oct 23, 2015 at 4:54 AM, Marcus Shawcroft
>  wrote:
>> Hi,
>>
>> This patch breaks the distinction between build and host. For example
>> consider a configure along these lines:
>>
>> ./configure --host=aarch64-none-linux-gnu
>> --target=aarch64-none-linux-gnu --build=x86_64-pc-linux-gnu
>>
>> Will result in:
>>
>> CXX_FOR_BUILD='g++'
>> CXX='aarch64-none-linux-gnu-g++'
>>
>> the gcc/configure fragment:
>>
>> +# Check if -no-pie works.
>> +AC_CACHE_CHECK([for -no-pie option],
>> +  [gcc_cv_no_pie],
>> +  [saved_LDFLAGS="$LDFLAGS"
>> +   LDFLAGS="$LDFLAGS -no-pie"
>> +   AC_LINK_IFELSE([int main(void) {return 0;}],
>> + [gcc_cv_no_pie=yes],
>> + [gcc_cv_no_pie=no])
>> +   LDFLAGS="$saved_LDFLAGS"])
>> +if test "$gcc_cv_no_pie" = "yes"; then
>> +  NO_PIE_FLAG="-no-pie"
>> +fi
>> +AC_SUBST([NO_PIE_FLAG])
>>
>> will check if CXX supports -no-pic and set NO_PIE_FLAG accordingly.
>> The gcc/Makefile.in fragment:
>>
>> @@ -761,6 +769,7 @@ BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)
>>
>>  # Native linker and preprocessor flags.  For x-fragment overrides.
>>  BUILD_LDFLAGS=@BUILD_LDFLAGS@
>> +BUILD_LDFLAGS += @NO_PIE_FLAG@
>>
>> constructs the flags which will ultimately be used to compile the
>> build machines gen* programs.  That compilation uses CXX_FOR_BUILD
>> with the NO_PIE_FLAG that was originally probed for CXX.  This is
>> incorrect if CXX_FOR_BUILD != CXX
>>
>> HJ, could you take a look into this issue?
>>
> 
> Try this.
> 

Where's the ChangeLog?

R.

> 
> pie.patch
> 
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index b91b8dc..d9d2de9 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -258,11 +258,14 @@ LINKER = $(CC)
>  LINKER_FLAGS = $(CFLAGS)
>  endif
>  
> +NO_PIE_CFLAG = @NO_PIE_CFLAGS@
> +NO_PIE_FLAG = @NO_PIE_FLAG@
> +
>  # We don't want to compile the compilers with -fPIE, it make PCH fail.
> -COMPILER += @NO_PIE_CFLAGS@
> +COMPILER += $(NO_PIE_CFLAG)
>  
>  # Link with -no-pie since we compile the compiler with -fno-PIE.
> -LINKER += @NO_PIE_FLAG@
> +LINKER += $(NO_PIE_FLAG)
>  
>  # Like LINKER, but use a mutex for serializing front end links.
>  ifeq (@DO_LINK_MUTEX@,true)
> @@ -755,10 +758,13 @@ DIR = ../gcc
>  # Native compiler for the build machine and its switches.
>  CC_FOR_BUILD = @CC_FOR_BUILD@
>  CXX_FOR_BUILD = @CXX_FOR_BUILD@
> +NO_PIE_CFLAGS_FOR_BUILD = @NO_PIE_CFLAGS_FOR_BUILD@
> +NO_PIE_FLAG_FOR_BUILD = @NO_PIE_FLAG_FOR_BUILD@
>  BUILD_CFLAGS= @BUILD_CFLAGS@ -DGENERATOR_FILE
>  BUILD_CXXFLAGS = @BUILD_CXXFLAGS@ -DGENERATOR_FILE
> -BUILD_CFLAGS += @NO_PIE_CFLAGS@
> -BUILD_CXXFLAGS += @NO_PIE_CFLAGS@
> +BUILD_NO_PIE_CFLAGS = @BUILD_NO_PIE_CFLAGS@
> +BUILD_CFLAGS += $(BUILD_NO_PIE_CFLAGS)
> +BUILD_CXXFLAGS += $(BUILD_NO_PIE_CFLAGS)
>  
>  # Native compiler that we use.  This may be C++ some day.
>  COMPILER_FOR_BUILD = $(CXX_FOR_BUILD)
> @@ -770,7 +776,8 @@ BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)
>  
>  # Native linker and preprocessor flags.  For x-fragment overrides.
>  BUILD_LDFLAGS=@BUILD_LDFLAGS@
> -BUILD_LDFLAGS += @NO_PIE_FLAG@
> +BUILD_NO_PIE_FLAG = @BUILD_NO_PIE_FLAG@
> +BUILD_LDFLAGS += $(BUILD_NO_PIE_FLAG)
>  BUILD_CPPFLAGS= -I. -I$(@D) -I$(srcdir) -I$(srcdir)/$(@D) \
>   -I$(srcdir)/../include @INCINTL@ $(CPPINC) $(CPPFLAGS)
>  
> diff --git a/gcc/configure b/gcc/configure
> index 3122499..92bda6c 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -707,6 +707,10 @@ FGREP
>  SED
>  LIBTOOL
>  collect2
> +NO_PIE_FLAG_FOR_BUILD
> +NO_PIE_CFLAGS_FOR_BUILD
> +BUILD_NO_PIE_FLAG
> +BUILD_NO_PIE_CFLAGS
>  STMP_FIXINC
>  BUILD_LDFLAGS
>  BUILD_CXXFLAGS
> @@ -7096,7 +7100,8 @@ if test x$ac_checking != x ; then
>  
>  $as_echo "#define ENABLE_CHECKING 1" >>confdefs.h
>  
> -  $as_echo "#define CHECKING_P 1" >>confdefs.h
> +
> +$as_echo "#define CHECKING_P 1" >>confdefs.h
>  
>nocommon_flag=-fno-common
>  else
> @@ -12253,14 +12258,24 @@ BUILD_CXXFLAGS='$(ALL_CXXFLAGS)'
>  BUILD_LDFLAGS='$(LDFLAGS)'
>  STMP_FIXINC=stmp-fixinc
>  
> +BUILD_NO_PIE_CFLAGS='$(NO_PIE_CFLAGS)'
> +BUILD_NO_PIE_FLAG='$(NO_PIE_FLAG)'
> +
>  # And these apply if build != host, or we are generating coverage data
>  if test x$build != x$host || test "x$coverage_flags" != x
>  then
>  BUILD_CFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CFLAGS_FOR_BUILD)'
>  BUILD_CXXFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CXXFLAGS_FOR_BUILD)'
>  BUILD_LDFLAGS='$(LDFLAGS_FOR_BUILD)'
> +
> +NO_PIE_CFLAGS_FOR_BUILD=${NO_PIE_CFLAGS_FOR_BUILD-${NO_PIE_CFLAGS}}
> +NO_PIE_FLAG_FOR_BUILD=${NO_PIE_FLAG_FOR_BUILD-${NO_PIE_FLAG}}
> +BUILD_NO_PIE_CFLAGS='$(NO_PIE_CFLAGS_FOR_BUILD)'
> +BUILD_NO_PIE_FLAG='$(NO_PIE_FLAG_FOR_BUILD)'
>  fi
>  
> +
> +
>  # Expand extra_headers to include complete path.
>  # This substitutes for lots of t-* files.
>  extra_headers_list=
> @@ -18390,7 +18405,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 18393 "c

[C++ Patch] PR 67845

2015-10-29 Thread Paolo Carlini

Hi,

adjusting at the same time TREE_TYPE of the already built decl, avoids 
ICEing during error recovery in merge_types for this kind of snippet. 
Tested x86_64-linux.


Thanks,
Paolo.


/cp
2015-10-29  Paolo Carlini  

PR c++/67845
* decl.c (grokfndecl): In case of erroneous cv-qualified non-member
functions consistently reset TREE_TYPE (decl) too.

/testsuite
2015-10-29  Paolo Carlini  

PR c++/67845
* g++.dg/other/cv_func4.C: New.
Index: cp/decl.c
===
--- cp/decl.c   (revision 229518)
+++ cp/decl.c   (working copy)
@@ -7998,6 +7998,11 @@ grokfndecl (tree ctype,
   DECL_EXTERNAL (decl) = 1;
   if (TREE_CODE (type) == FUNCTION_TYPE)
 {
+  if (quals || rqual)
+   TREE_TYPE (decl) = apply_memfn_quals (TREE_TYPE (decl),
+ TYPE_UNQUALIFIED,
+ REF_QUAL_NONE);
+
   if (quals)
{
  error (ctype
Index: testsuite/g++.dg/other/cv_func4.C
===
--- testsuite/g++.dg/other/cv_func4.C   (revision 0)
+++ testsuite/g++.dg/other/cv_func4.C   (working copy)
@@ -0,0 +1,6 @@
+// PR c++/67845
+
+typedef void F () const;
+
+F foo;  // { dg-error "cv-qualifier" }
+void foo ();


[gomp4.5] Support uniform parameter linear stride in declare simd

2015-10-29 Thread Jakub Jelinek
Hi!

This patch adds support for uniform linear step in addition to
constant linear step, where the step is the value passed in some
integral (or reference to integral) argument.

The vectorizer part of this isn't ready yet, but that will be just an
optimization rather than part of the ABI.

Regtested on x86_64-linux, committed to gomp-4_5-branch.

I haven't touched Cilk+, once this makes it into the trunk (next week?),
it would be nice if the Cilk+ side is changed too to accept it.

2015-10-29  Jakub Jelinek  

gcc/
* cgraph.h (enum cgraph_simd_clone_arg_type): Add
SIMD_CLONE_ARG_TYPE_LINEAR_REF_VARIABLE_STEP,
SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_VARIABLE_STEP and
SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP.
(struct cgraph_simd_clone_arg): Adjust comment.
* omp-low.c (simd_clone_clauses_extract): Handle variable step
for references and arguments passed by reference.
(simd_clone_mangle): Mangle ref/uval/val variable steps.
(simd_clone_adjust_argument_types): Handle
SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_VARIABLE_STEP like
SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP and
SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP like
SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP.
(simd_clone_linear_addend): New function.
(simd_clone_adjust): Handle variable step like similarly
to constant step, use simd_clone_linear_addend to determine
the actual step at runtime.
gcc/c-family/
* c-omp.c (c_omp_declare_simd_clauses_to_numbers): Change
OMP_CLAUSE_LINEAR_VARIABLE_STRIDE OMP_CLAUSE_LINEAR_STEP
into numbers.
(c_omp_declare_simd_clauses_to_decls): Similarly change those
from numbers to PARM_DECLs.
gcc/c/
* c-typeck.c (c_finish_omp_clauses): Diagnose if linear step
on declare simd is neither a constant nor a uniform parameter.
gcc/cp/
* parser.c (cp_parser_omp_clause_linear): Add DECLARE_SIMD argument.
Parse parameter name as linear step as id-expression rather than
expression.
(cp_parser_omp_all_clauses): Adjust caller.
* pt.c (tsubst_omp_clauses): If OMP_CLAUSE_LINEAR_VARIABLE_STRIDE,
use tsubst_omp_clause_decl instead of tsubst_expr on
OMP_CLAUSE_LINEAR_STEP.
* semantics.c (finish_omp_clauses): Diagnose if linear step
on declare simd is neither a constant nor a uniform parameter.
gcc/testsuite/
* gcc.dg/gomp/declare-simd-1.c: Add scan-assembler-times directives
for expected mangling on x86_64/i?86.
* gcc.dg/gomp/declare-simd-3.c: New test.
* gcc.dg/gomp/declare-simd-4.c: New test.
* g++.dg/gomp/declare-simd-1.C: Add dg-options.  Add
scan-assembler-times directives for expected mangling on x86_64/i?86.
* g++.dg/gomp/declare-simd-3.C: Likewise.
* g++.dg/gomp/declare-simd-4.C: New test.
* g++.dg/gomp/declare-simd-5.C: New test.

--- gcc/cgraph.h.jj 2015-10-14 10:24:55.0 +0200
+++ gcc/cgraph.h2015-10-27 10:40:08.780100715 +0100
@@ -646,11 +646,14 @@ enum cgraph_simd_clone_arg_type
   /* These are only for integer/pointer arguments passed by value.  */
   SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP,
   SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP,
-  /* These 3 are only for reference type arguments or arguments passed
+  /* These 6 are only for reference type arguments or arguments passed
  by reference.  */
   SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP,
+  SIMD_CLONE_ARG_TYPE_LINEAR_REF_VARIABLE_STEP,
   SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP,
+  SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_VARIABLE_STEP,
   SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP,
+  SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP,
   SIMD_CLONE_ARG_TYPE_MASK
 };
 
@@ -692,7 +695,7 @@ struct GTY(()) cgraph_simd_clone_arg {
 
   /* For arg_type SIMD_CLONE_ARG_TYPE_LINEAR_*CONSTANT_STEP this is
  the constant linear step, if arg_type is
- SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP, this is index of
+ SIMD_CLONE_ARG_TYPE_LINEAR_*VARIABLE_STEP, this is index of
  the uniform argument holding the step, otherwise 0.  */
   HOST_WIDE_INT linear_step;
 
--- gcc/omp-low.c.jj2015-10-26 15:38:20.0 +0100
+++ gcc/omp-low.c   2015-10-27 17:55:05.146748148 +0100
@@ -16318,8 +16318,29 @@ simd_clone_clauses_extract (struct cgrap
int argno = TREE_INT_CST_LOW (decl);
if (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE (t))
  {
-   clone_info->args[argno].arg_type
- = SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP;
+   enum cgraph_simd_clone_arg_type arg_type;
+   if (TREE_CODE (args[argno]) == REFERENCE_TYPE)
+ switch (OMP_CLAUSE_LINEAR_KIND (t))
+   {
+   case OMP_CLAUSE_LINEAR_REF:
+ arg_type
+   = SIMD_CLONE_ARG_TYPE_LINEAR_REF_VARIABLE_STEP;
+ 

Re: [PATCH, 1/6] Simplify constraint handling

2015-10-29 Thread Richard Biener
On Wed, 28 Oct 2015, Tom de Vries wrote:

> On 28/10/15 16:35, Richard Biener wrote:
> > On Tue, 27 Oct 2015, Tom de Vries wrote:
> > 
> > > On 27/10/15 13:24, Tom de Vries wrote:
> > > > Thinking it over a bit more, I realized the constraint handling started
> > > > to be messy. I've reworked the patch series to simplify that first.
> > > > 
> > > >1Simplify constraint handling
> > > >2Rename make_restrict_var_constraints to
> > > > make_param_constraints
> > > >3Add recursion to make_param_constraints
> > > >4Add handle_param parameter to create_variable_info_for_1
> > > >5Handle recursive restrict pointer in
> > > > create_variable_info_for_1
> > > >6Handle restrict struct fields recursively
> > > > 
> > > > Currently doing bootstrap and regtest on x86_64.
> > > > 
> > > > I'll repost the patch series in reply to this message.
> > > 
> > > This patch gets rid of this bit of code in intra_create_variable_infos:
> > > ...
> > >if (restrict_pointer_p)
> > >   make_constraint_from_global_restrict (p, "PARM_RESTRICT");
> > >else
> > > ..
> > > 
> > > I already proposed to remove it here (
> > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02426.html ) but there is a
> > > problem with that approach: It can happen that restrict_pointer_p is true,
> > > but
> > > p->only_restrict_pointers is false. This happens with fipa-pta, when
> > > create_function_info_for created a varinfo for the parameter before
> > > intra_create_variable_infos was called.
> > > 
> > > The patch handles that case now by setting p->only_restrict_pointers.
> > 
> > Hmm, but ... restrict only has an effect in non-IPA mode.
> 
> Indeed, I also realized that by now.
> 
> I wrote attached patch to make that explicit and simplify fipa-pta.
> 
> OK for trunk if bootstrap and reg-test succeeds?

I don't see the patch simplifies anything but only removes spurious
settings by adding IMHO redundant checks.

> > That we
> > use intra_create_variable_infos in IPA mode is only done for correctness
> > (to set up incoming NONLOCAL) for functions we do not see all callers of.
> > 
> 
> Right.
> 
> > Maybe we should fix that (in intra_create_variable_infos properly
> > add constraints from NONLOCAL for all such functions).
> > 
> 
> Aren't we are adding those constraints currently in
> intra_create_variable_infos? Maybe you meant 'in ipa_pta_execute'?

I meant create_function_info_for.  Include all the effects of

  /* For externally visible or attribute used annotated functions use
 local constraints for their arguments.
 For local functions we see all callers and thus do not need 
initial
 constraints for parameters.  */
  if (node->used_from_other_partition
  || node->externally_visible
  || node->force_output
  || node->address_taken)
{
  intra_create_variable_infos (func);

  /* We also need to make function return values escape.  Nothing
 escapes by returning from main though.  */
  if (!MAIN_NAME_P (DECL_NAME (node->decl)))
{
  varinfo_t fi, rvi;
  fi = lookup_vi_for_tree (node->decl);
  rvi = first_vi_for_offset (fi, fi_result);
  if (rvi && rvi->offset == fi_result)
{
  struct constraint_expr includes;
  struct constraint_expr var;
  includes.var = escaped_id;
  includes.offset = 0;
  includes.type = SCALAR;
  var.var = rvi->id;
  var.offset = 0;
  var.type = SCALAR;
  process_constraint (new_constraint (includes, var));
}
}
}

in it (just pass it a flag on whether the function is non-local).
The effect of intra_create_variable_infos above was supposed to
add a constraint from NONLOCAL on all vars (but the return value
which instead needs to escape as done above).

Richard.

> Thanks,
> - Tom
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [gomp4 04/14] nvptx: fix output of _Bool global variables

2015-10-29 Thread Bernd Schmidt

On 10/28/2015 08:29 PM, Alexander Monakov wrote:


Anything wrong with the simple fix: pick an integer type with the largest size
dividing the original struct type size?


Try it and run it through the testsuite.


Bernd



Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option

2015-10-29 Thread H.J. Lu
On Thu, Oct 29, 2015 at 4:00 AM, Richard Earnshaw
 wrote:
> On 23/10/15 13:34, H.J. Lu wrote:
>> On Fri, Oct 23, 2015 at 4:54 AM, Marcus Shawcroft
>>  wrote:
>>> Hi,
>>>
>>> This patch breaks the distinction between build and host. For example
>>> consider a configure along these lines:
>>>
>>> ./configure --host=aarch64-none-linux-gnu
>>> --target=aarch64-none-linux-gnu --build=x86_64-pc-linux-gnu
>>>
>>> Will result in:
>>>
>>> CXX_FOR_BUILD='g++'
>>> CXX='aarch64-none-linux-gnu-g++'
>>>
>>> the gcc/configure fragment:
>>>
>>> +# Check if -no-pie works.
>>> +AC_CACHE_CHECK([for -no-pie option],
>>> +  [gcc_cv_no_pie],
>>> +  [saved_LDFLAGS="$LDFLAGS"
>>> +   LDFLAGS="$LDFLAGS -no-pie"
>>> +   AC_LINK_IFELSE([int main(void) {return 0;}],
>>> + [gcc_cv_no_pie=yes],
>>> + [gcc_cv_no_pie=no])
>>> +   LDFLAGS="$saved_LDFLAGS"])
>>> +if test "$gcc_cv_no_pie" = "yes"; then
>>> +  NO_PIE_FLAG="-no-pie"
>>> +fi
>>> +AC_SUBST([NO_PIE_FLAG])
>>>
>>> will check if CXX supports -no-pic and set NO_PIE_FLAG accordingly.
>>> The gcc/Makefile.in fragment:
>>>
>>> @@ -761,6 +769,7 @@ BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)
>>>
>>>  # Native linker and preprocessor flags.  For x-fragment overrides.
>>>  BUILD_LDFLAGS=@BUILD_LDFLAGS@
>>> +BUILD_LDFLAGS += @NO_PIE_FLAG@
>>>
>>> constructs the flags which will ultimately be used to compile the
>>> build machines gen* programs.  That compilation uses CXX_FOR_BUILD
>>> with the NO_PIE_FLAG that was originally probed for CXX.  This is
>>> incorrect if CXX_FOR_BUILD != CXX
>>>
>>> HJ, could you take a look into this issue?
>>>
>>
>> Try this.
>>
>
> Where's the ChangeLog?
>

You can follow up here:

https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02468.html

-- 
H.J.


Re: [Patch, committed] PR67933

2015-10-29 Thread Dominique d'Humières
> > The recent patches on trunk to gfc_trans_allocate have fixed PR67933.
> > I have added a testcase, which has been committed as revision 229452.
>
> This test case fails at runtime on powerpc64le-linux-gnu.  …
This should be fixed after revision r229503.
Dominique



Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 4:39 AM, Jan Hubicka  wrote:
>> > Added and comitted now.
>>
>> Thanks.  Now on to the wrong code issues. :-)
>>
>> Up to the change, the useless_type_conversion_p predicate was relying on
>> structural equivalence via the TYPE_CANONICAL check, now it only looks at the
>> outermost level (size, mode).  Now some back-ends, most notably x86-64, do a
>> deep structural scan to determine the calling conventions (classify_argument)
>> instead of just looking at the size and the mode, so consistency dictates 
>> that
>> the type of the argument and that of the parameter be structurally equivalent
>> and this sometimes can only be achieved by a VCE... which is now deleted. :-(
>> See the call to derivedIP in the attached testcase which now fails on x86-64.
>>
>> How do we get away from here?
>
> Hmm, I noticed this in ipa-icf context and wrote checker that two functions 
> are ABI
> compatile (did not pushed it out yet), but of course this is nastier.
>
> I think the problem exists before my patch with LTO - it is just matter of
> doing two types which will be considered equivalent by
> gimple_canonical_types_compatible_p but have different type conversion.  An
> example of such type would be:
>
> struct a {
>   int a[4];
> };
> struct b {
>   int a[4];
> } __attribute__ ((__aligned__(16)));
>
> I tried to turn this into an testcase, the problem is that I don't know of a 
> way
> to obtain VIEW_CONVERT_EXPR between the two types out of C or C++ frontend 
> and we
> don't seem to synthetize these in middle end (even in cases it would make 
> sense).
> I will try to play with it more - would be nice to have a C reproducer.
>
> We may be safe before my patch from wrong code issues if there is no way to
> rpduce VIEW_CONVERT_EXPR between types like this in languages that support
> aligned attribute.
>
> I think the problem is generally similar to memory references - the gimple 
> type
> compatibility should not be tied to ABI details.  Probably most consistent
> solution would be to extend GIMPLE_CALL to also list types of parameters and 
> do
> not rely on whatever type the operand have
>
> Richard, any ideas?

IMHO it was always wrong/fragile for backends to look at the actual arguments to
decide on the calling convention.  The backends should _solely_ rely on
gimple_call_fntype and its TYPE_ARG_TYPES here.

Of course then there are varargs ... (not sure if we hit this here).

But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't remember
what exactly we gain from it (when not done on registers).

But I also don't see where we do the stripping mentioned on memory references.
The match.pd pattern doesn't apply to memory, only in the GENERIC path
which is guarded with exact type equality.  So I can't see where we end up
stripping the V_C_E.

There is one bogus case still in fold-const.c:

case VIEW_CONVERT_EXPR:
  if (TREE_CODE (op0) == MEM_REF)
/* ???  Bogus for aligned types.  */
return fold_build2_loc (loc, MEM_REF, type,
TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1));

  return NULL_TREE;

that comment is only in my local tree ... (we lose alignment info that is
on the original MEM_REF type which may be a smaller one).

Richard.

> Honza
>>
>>
>>   * gnat.dg/discr44.adb: New test.
>>
>> --
>> Eric Botcazou
>
>> -- { dg-do run }
>> -- { dg-options "-gnatws" }
>>
>> procedure Discr44 is
>>
>>   function Ident (I : Integer) return Integer is
>>   begin
>> return I;
>>   end;
>>
>>   type Int is range 1 .. 10;
>>
>>   type Str is array (Int range <>) of Character;
>>
>>   type Parent (D1, D2 : Int; B : Boolean) is record
>> S : Str (D1 .. D2);
>>   end record;
>>
>>   type Derived (D : Int) is new Parent (D1 => D, D2 => D, B => False);
>>
>>   X1 : Derived (D => Int (Ident (7)));
>>
>> begin
>>   if X1.D /= 7 then
>> raise Program_Error;
>>   end if;
>> end;
>


Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 12:22 PM, Richard Biener
 wrote:
> On Thu, Oct 29, 2015 at 4:39 AM, Jan Hubicka  wrote:
>>> > Added and comitted now.
>>>
>>> Thanks.  Now on to the wrong code issues. :-)
>>>
>>> Up to the change, the useless_type_conversion_p predicate was relying on
>>> structural equivalence via the TYPE_CANONICAL check, now it only looks at 
>>> the
>>> outermost level (size, mode).  Now some back-ends, most notably x86-64, do a
>>> deep structural scan to determine the calling conventions 
>>> (classify_argument)
>>> instead of just looking at the size and the mode, so consistency dictates 
>>> that
>>> the type of the argument and that of the parameter be structurally 
>>> equivalent
>>> and this sometimes can only be achieved by a VCE... which is now deleted. 
>>> :-(
>>> See the call to derivedIP in the attached testcase which now fails on 
>>> x86-64.
>>>
>>> How do we get away from here?
>>
>> Hmm, I noticed this in ipa-icf context and wrote checker that two functions 
>> are ABI
>> compatile (did not pushed it out yet), but of course this is nastier.
>>
>> I think the problem exists before my patch with LTO - it is just matter of
>> doing two types which will be considered equivalent by
>> gimple_canonical_types_compatible_p but have different type conversion.  An
>> example of such type would be:
>>
>> struct a {
>>   int a[4];
>> };
>> struct b {
>>   int a[4];
>> } __attribute__ ((__aligned__(16)));
>>
>> I tried to turn this into an testcase, the problem is that I don't know of a 
>> way
>> to obtain VIEW_CONVERT_EXPR between the two types out of C or C++ frontend 
>> and we
>> don't seem to synthetize these in middle end (even in cases it would make 
>> sense).
>> I will try to play with it more - would be nice to have a C reproducer.
>>
>> We may be safe before my patch from wrong code issues if there is no way to
>> rpduce VIEW_CONVERT_EXPR between types like this in languages that support
>> aligned attribute.
>>
>> I think the problem is generally similar to memory references - the gimple 
>> type
>> compatibility should not be tied to ABI details.  Probably most consistent
>> solution would be to extend GIMPLE_CALL to also list types of parameters and 
>> do
>> not rely on whatever type the operand have
>>
>> Richard, any ideas?
>
> IMHO it was always wrong/fragile for backends to look at the actual arguments 
> to
> decide on the calling convention.  The backends should _solely_ rely on
> gimple_call_fntype and its TYPE_ARG_TYPES here.
>
> Of course then there are varargs ... (not sure if we hit this here).
>
> But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't remember
> what exactly we gain from it (when not done on registers).
>
> But I also don't see where we do the stripping mentioned on memory references.
> The match.pd pattern doesn't apply to memory, only in the GENERIC path
> which is guarded with exact type equality.  So I can't see where we end up
> stripping the V_C_E.
>
> There is one bogus case still in fold-const.c:
>
> case VIEW_CONVERT_EXPR:
>   if (TREE_CODE (op0) == MEM_REF)
> /* ???  Bogus for aligned types.  */
> return fold_build2_loc (loc, MEM_REF, type,
> TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1));
>
>   return NULL_TREE;
>
> that comment is only in my local tree ... (we lose alignment info that is
> on the original MEM_REF type which may be a smaller one).

Ah - tree_ssa_useless_type_conversion and callers, during gimplification.
I'd like to get rid of it but maybe simply delete the VIEW_CONVERT_EXPR
case from it for now (and return true unconditionally for NON_LVALUE_EXPR).

Index: gcc/tree-ssa.c
===
--- gcc/tree-ssa.c  (revision 229517)
+++ gcc/tree-ssa.c  (working copy)
@@ -1142,13 +1161,16 @@ delete_tree_ssa (struct function *fn)
 bool
 tree_ssa_useless_type_conversion (tree expr)
 {
+  /* Not strictly a conversion but this function is used to strip
+ useless stuff from trees returned from GENERIC folding.  */
+  if (TREE_CODE (expr) == NON_LVALUE_EXPR)
+return true;
+
   /* If we have an assignment that merely uses a NOP_EXPR to change
  the top of the RHS to the type of the LHS and the type conversion
  is "safe", then strip away the type conversion so that we can
  enter LHS = RHS into the const_and_copies table.  */
-  if (CONVERT_EXPR_P (expr)
-  || TREE_CODE (expr) == VIEW_CONVERT_EXPR
-  || TREE_CODE (expr) == NON_LVALUE_EXPR)
+  if (CONVERT_EXPR_P (expr))
 return useless_type_conversion_p
   (TREE_TYPE (expr),
TREE_TYPE (TREE_OPERAND (expr, 0)));

IMHO the gimplifier use should be more explicit and most remaining GIMPLE
middle-end uses should be removed (after auditing).

Richard.

> Richard.
>
>> Honza
>>>
>>>
>>>   * gnat.dg/discr44.adb: New test.
>>>
>>> --
>>> Eric Botcazou
>>
>>> -- { dg-do run }
>>> -- { dg-options "-gnatws" }
>>>
>>> 

Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 12:31 PM, Richard Biener
 wrote:
> On Thu, Oct 29, 2015 at 12:22 PM, Richard Biener
>  wrote:
>> On Thu, Oct 29, 2015 at 4:39 AM, Jan Hubicka  wrote:
 > Added and comitted now.

 Thanks.  Now on to the wrong code issues. :-)

 Up to the change, the useless_type_conversion_p predicate was relying on
 structural equivalence via the TYPE_CANONICAL check, now it only looks at 
 the
 outermost level (size, mode).  Now some back-ends, most notably x86-64, do 
 a
 deep structural scan to determine the calling conventions 
 (classify_argument)
 instead of just looking at the size and the mode, so consistency dictates 
 that
 the type of the argument and that of the parameter be structurally 
 equivalent
 and this sometimes can only be achieved by a VCE... which is now deleted. 
 :-(
 See the call to derivedIP in the attached testcase which now fails on 
 x86-64.

 How do we get away from here?
>>>
>>> Hmm, I noticed this in ipa-icf context and wrote checker that two functions 
>>> are ABI
>>> compatile (did not pushed it out yet), but of course this is nastier.
>>>
>>> I think the problem exists before my patch with LTO - it is just matter of
>>> doing two types which will be considered equivalent by
>>> gimple_canonical_types_compatible_p but have different type conversion.  An
>>> example of such type would be:
>>>
>>> struct a {
>>>   int a[4];
>>> };
>>> struct b {
>>>   int a[4];
>>> } __attribute__ ((__aligned__(16)));
>>>
>>> I tried to turn this into an testcase, the problem is that I don't know of 
>>> a way
>>> to obtain VIEW_CONVERT_EXPR between the two types out of C or C++ frontend 
>>> and we
>>> don't seem to synthetize these in middle end (even in cases it would make 
>>> sense).
>>> I will try to play with it more - would be nice to have a C reproducer.
>>>
>>> We may be safe before my patch from wrong code issues if there is no way to
>>> rpduce VIEW_CONVERT_EXPR between types like this in languages that support
>>> aligned attribute.
>>>
>>> I think the problem is generally similar to memory references - the gimple 
>>> type
>>> compatibility should not be tied to ABI details.  Probably most consistent
>>> solution would be to extend GIMPLE_CALL to also list types of parameters 
>>> and do
>>> not rely on whatever type the operand have
>>>
>>> Richard, any ideas?
>>
>> IMHO it was always wrong/fragile for backends to look at the actual 
>> arguments to
>> decide on the calling convention.  The backends should _solely_ rely on
>> gimple_call_fntype and its TYPE_ARG_TYPES here.
>>
>> Of course then there are varargs ... (not sure if we hit this here).
>>
>> But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't remember
>> what exactly we gain from it (when not done on registers).
>>
>> But I also don't see where we do the stripping mentioned on memory 
>> references.
>> The match.pd pattern doesn't apply to memory, only in the GENERIC path
>> which is guarded with exact type equality.  So I can't see where we end up
>> stripping the V_C_E.
>>
>> There is one bogus case still in fold-const.c:
>>
>> case VIEW_CONVERT_EXPR:
>>   if (TREE_CODE (op0) == MEM_REF)
>> /* ???  Bogus for aligned types.  */
>> return fold_build2_loc (loc, MEM_REF, type,
>> TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 
>> 1));
>>
>>   return NULL_TREE;
>>
>> that comment is only in my local tree ... (we lose alignment info that is
>> on the original MEM_REF type which may be a smaller one).
>
> Ah - tree_ssa_useless_type_conversion and callers, during gimplification.
> I'd like to get rid of it but maybe simply delete the VIEW_CONVERT_EXPR
> case from it for now (and return true unconditionally for NON_LVALUE_EXPR).
>
> Index: gcc/tree-ssa.c
> ===
> --- gcc/tree-ssa.c  (revision 229517)
> +++ gcc/tree-ssa.c  (working copy)
> @@ -1142,13 +1161,16 @@ delete_tree_ssa (struct function *fn)
>  bool
>  tree_ssa_useless_type_conversion (tree expr)
>  {
> +  /* Not strictly a conversion but this function is used to strip
> + useless stuff from trees returned from GENERIC folding.  */
> +  if (TREE_CODE (expr) == NON_LVALUE_EXPR)
> +return true;
> +
>/* If we have an assignment that merely uses a NOP_EXPR to change
>   the top of the RHS to the type of the LHS and the type conversion
>   is "safe", then strip away the type conversion so that we can
>   enter LHS = RHS into the const_and_copies table.  */
> -  if (CONVERT_EXPR_P (expr)
> -  || TREE_CODE (expr) == VIEW_CONVERT_EXPR
> -  || TREE_CODE (expr) == NON_LVALUE_EXPR)
> +  if (CONVERT_EXPR_P (expr))
>  return useless_type_conversion_p
>(TREE_TYPE (expr),
> TREE_TYPE (TREE_OPERAND (expr, 0)));
>
> IMHO the gimplifier use should be more explicit and most remaining GIMPLE
> middle-

Re: [gomp4 00/14] NVPTX: further porting

2015-10-29 Thread Alexander Monakov
On Thu, 29 Oct 2015, Jakub Jelinek wrote:
> > rather provide a dummy 'usleep' under #ifdef __nvptx__.  WDYT?
> 
> Such ifdefs aren't really easily possible in OpenMP right now, the
> preprocessing is done with the host compiler only, you'd need to arrange for
> usleep being defined only in the PTX path and nowhere else.

Right, I should have remembered that.  Providing it as a weak symbol doesn't
work either.  Providing it in a separate static library should work I think
(if the library goes after -lc in host linking, the linker should pick up the
dynamic symbol from libc).

What do you think about building tests for each offload target separately,
like OpenACC does?

> > I've updated my patches locally in response to reviews to the last series,
> > except where shared memory is involved.  Should I resend?
> 
> Are there any dependencies in your patch series against stuff still in
> gomp-4_0-branch rather than already on the trunk?

>From reviewing the diff I've found one dependency: code to place 'omp_data_o'
in shared memory needs pass_late_lower_omp.

Thanks.
Alexander


Re: [PATCH] Don't handle CAST_RESTRICT (PR tree-optimization/49279)

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 11:38 AM, Tom de Vries  wrote:
> [ quote-pasted from https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00464.html
> ]
>
>> CAST_RESTRICT based disambiguation unfortunately isn't reliable,
>> e.g. to store a non-restrict pointer into a restricted field,
>> we add a non-useless cast to restricted pointer in the gimplifier,
>> and while we don't consider that field to have a special restrict tag
>> because it is unsafe to do so, we unfortunately create it for the
>> CAST_RESTRICT before that and end up with different restrict tags
>> for the same thing.  See the PR for more details.
>>
>> This patch turns off CAST_RESTRICT handling for now, in the future
>> we might try to replace it by explicit CAST_RESTRICT stmts in some form,
>> but need to solve problems with multiple inlined copies of the same
>> function
>> with restrict arguments or restrict variables in it and intermixed code
>> from
>> them (or similarly code from different non-overlapping source blocks).
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>> 4.6 too?
>>
>> 2011-10-06  Jakub Jelinek  
>>
>> PR tree-optimization/49279
>> * tree-ssa-structalias.c (find_func_aliases): Don't handle
>> CAST_RESTRICT.
>> * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Allow
>> restrict propagation.
>> * tree-ssa.c (useless_type_conversion_p): Don't return false
>> if TYPE_RESTRICT differs.
>>
>> * gcc.dg/tree-ssa/restrict-4.c: XFAIL.
>> * gcc.c-torture/execute/pr49279.c: New test.
>
>
> Hi,
>
> In the patch adding support for CAST_RESTRICT (
> https://gcc.gnu.org/ml/gcc-patches/2011-10/msg00176.html ) there was also a
> bit:
> ...
> * fold-const.c (fold_unary_loc): Don't optimize
> POINTER_PLUS_EXPR casted to TYPE_RESTRICT pointer by
> casting the inner pointer if it isn't TYPE_RESTRICT.
> ...
> which is still around. I suppose we can remove this bit as well.
>
> OK for trunk if bootstrap and reg-test succeeds?

Ok.  I think the checks on TREE_OPERAND (arg0, 1) are bogus though
and either we should unconditionally sink the conversion or only
if a conversion on TREE_OPERAND (arg0, 0) vanishes (I prefer the
latter).

Richard.



> Thanks,
> - Tom
>
>


Re: PATCH: Fix cross build for CXX_FOR_BUILD != CXX

2015-10-29 Thread Richard Earnshaw
On 23/10/15 16:41, H.J. Lu wrote:
> On Fri, Oct 23, 2015 at 7:23 AM, Marcus Shawcroft
>  wrote:
>> On 23 October 2015 at 13:34, H.J. Lu  wrote:
>>> On Fri, Oct 23, 2015 at 4:54 AM, Marcus Shawcroft
>>>  wrote:
 Hi,

 This patch breaks the distinction between build and host. For example
 consider a configure along these lines:

 ./configure --host=aarch64-none-linux-gnu
 --target=aarch64-none-linux-gnu --build=x86_64-pc-linux-gnu
>>
 HJ, could you take a look into this issue?

>>>
>>> Try this.
>>
>> Thanks HJ. This solves the issue I was seeing.
>>
> 
> I am testing this patch.  OK for trunk if there is no regression?
> 
> 

Your ChangeLog should mention that configure has been regenerated.  And
this hunk:

@@ -7096,7 +7100,8 @@ if test x$ac_checking != x ; then

 $as_echo "#define ENABLE_CHECKING 1" >>confdefs.h

-  $as_echo "#define CHECKING_P 1" >>confdefs.h
+
+$as_echo "#define CHECKING_P 1" >>confdefs.h

   nocommon_flag=-fno-common
 else

appears to be spurious.

Otherwise OK.

R.

> 
> 0001-Add-BUILD_NO_PIE_CFLAGS-and-BUILD_NO_PIE_FLAG.patch
> 
> 
> From 7c308b52f847e10563509194964ff15b8dda8b70 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" 
> Date: Fri, 23 Oct 2015 08:14:53 -0700
> Subject: [PATCH] Add BUILD_NO_PIE_CFLAGS and BUILD_NO_PIE_FLAG
> 
> We shouldn't use NO_PIE_CFLAGS and NO_PIE_FLAG with CXX_FOR_BUILD
> when CXX_FOR_BUILD != CXX.  This patch adds BUILD_NO_PIE_CFLAGS
> and BUILD_NO_PIE_FLAG to use with CXX_FOR_BUILD.  They are set to
> NO_PIE_CFLAGS and NO_PIE_FLAG when build machine == host machine.
> Otherwise, they are set to NO_PIE_CFLAGS_FOR_BUILD and
> NO_PIE_FLAG_FOR_BUILD.
> 
>   * Makefile.in (NO_PIE_CFLAGS): New.
>   (NO_PIE_FLAG): Likewise.
>   (NO_PIE_CFLAGS_FOR_BUILD): Likewise.
>   (NO_PIE_FLAG_FOR_BUILD): Likewise.
>   (BUILD_NO_PIE_CFLAGS): Likewise.
>   (BUILD_NO_PIE_FLAG): Likewise.
>   (COMPILER): Replace @NO_PIE_CFLAGS@ with $(NO_PIE_CFLAGS).
>   (LINKER): Replace @NO_PIE_FLAG@ with $(NO_PIE_FLAG).
>   (BUILD_CFLAGS): Replace @NO_PIE_CFLAGS@ with
>   $(BUILD_NO_PIE_CFLAGS).
>   (BUILD_CXXFLAGS): Likewise.
>   (BUILD_LDFLAGS ): Replace @NO_PIE_FLAG@ with
>   $(BUILD_NO_PIE_FLAG).
>   * configure.ac (BUILD_NO_PIE_CFLAGS): New.  AC_SUBST.
>   (BUILD_NO_PIE_FLAG): Likewise.
>   (NO_PIE_CFLAGS_FOR_BUILD): Likewise.
>   (NO_PIE_FLAG_FOR_BUILD): Likewise.
> ---
>  gcc/Makefile.in  | 17 -
>  gcc/configure| 21 ++---
>  gcc/configure.ac | 10 ++
>  3 files changed, 40 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 2685b38..f2aa644 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -258,11 +258,14 @@ LINKER = $(CC)
>  LINKER_FLAGS = $(CFLAGS)
>  endif
>  
> +NO_PIE_CFLAGS = @NO_PIE_CFLAGS@
> +NO_PIE_FLAG = @NO_PIE_FLAG@
> +
>  # We don't want to compile the compilers with -fPIE, it make PCH fail.
> -COMPILER += @NO_PIE_CFLAGS@
> +COMPILER += $(NO_PIE_CFLAGS)
>  
>  # Link with -no-pie since we compile the compiler with -fno-PIE.
> -LINKER += @NO_PIE_FLAG@
> +LINKER += $(NO_PIE_FLAG)
>  
>  # Like LINKER, but use a mutex for serializing front end links.
>  ifeq (@DO_LINK_MUTEX@,true)
> @@ -755,10 +758,13 @@ DIR = ../gcc
>  # Native compiler for the build machine and its switches.
>  CC_FOR_BUILD = @CC_FOR_BUILD@
>  CXX_FOR_BUILD = @CXX_FOR_BUILD@
> +NO_PIE_CFLAGS_FOR_BUILD = @NO_PIE_CFLAGS_FOR_BUILD@
> +NO_PIE_FLAG_FOR_BUILD = @NO_PIE_FLAG_FOR_BUILD@
>  BUILD_CFLAGS= @BUILD_CFLAGS@ -DGENERATOR_FILE
>  BUILD_CXXFLAGS = @BUILD_CXXFLAGS@ -DGENERATOR_FILE
> -BUILD_CFLAGS += @NO_PIE_CFLAGS@
> -BUILD_CXXFLAGS += @NO_PIE_CFLAGS@
> +BUILD_NO_PIE_CFLAGS = @BUILD_NO_PIE_CFLAGS@
> +BUILD_CFLAGS += $(BUILD_NO_PIE_CFLAGS)
> +BUILD_CXXFLAGS += $(BUILD_NO_PIE_CFLAGS)
>  
>  # Native compiler that we use.  This may be C++ some day.
>  COMPILER_FOR_BUILD = $(CXX_FOR_BUILD)
> @@ -770,7 +776,8 @@ BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)
>  
>  # Native linker and preprocessor flags.  For x-fragment overrides.
>  BUILD_LDFLAGS=@BUILD_LDFLAGS@
> -BUILD_LDFLAGS += @NO_PIE_FLAG@
> +BUILD_NO_PIE_FLAG = @BUILD_NO_PIE_FLAG@
> +BUILD_LDFLAGS += $(BUILD_NO_PIE_FLAG)
>  BUILD_CPPFLAGS= -I. -I$(@D) -I$(srcdir) -I$(srcdir)/$(@D) \
>   -I$(srcdir)/../include @INCINTL@ $(CPPINC) $(CPPFLAGS)
>  
> diff --git a/gcc/configure b/gcc/configure
> index 3122499..92bda6c 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -707,6 +707,10 @@ FGREP
>  SED
>  LIBTOOL
>  collect2
> +NO_PIE_FLAG_FOR_BUILD
> +NO_PIE_CFLAGS_FOR_BUILD
> +BUILD_NO_PIE_FLAG
> +BUILD_NO_PIE_CFLAGS
>  STMP_FIXINC
>  BUILD_LDFLAGS
>  BUILD_CXXFLAGS
> @@ -7096,7 +7100,8 @@ if test x$ac_checking != x ; then
>  
>  $as_echo "#define ENABLE_CHECKING 1" >>confdefs.h
>  
> -  $as_echo "#define CHECKING_P 1" >>confdefs.h
> +
> +$as_echo "#define CHECKING_P 1" >>confdefs.h
>  
>nocommon_flag=-fno-common
>  else
> @@ -12253,14 +12258,24 @@ BUILD_CXXFL

Re: [PATCH, 1/6] Simplify constraint handling

2015-10-29 Thread Tom de Vries

On 29/10/15 12:13, Richard Biener wrote:

On Wed, 28 Oct 2015, Tom de Vries wrote:


>On 28/10/15 16:35, Richard Biener wrote:

> >On Tue, 27 Oct 2015, Tom de Vries wrote:
> >

> > >On 27/10/15 13:24, Tom de Vries wrote:

> > > >Thinking it over a bit more, I realized the constraint handling started
> > > >to be messy. I've reworked the patch series to simplify that first.
> > > >
> > > >1Simplify constraint handling
> > > >2Rename make_restrict_var_constraints to
> > > >make_param_constraints
> > > >3Add recursion to make_param_constraints
> > > >4Add handle_param parameter to create_variable_info_for_1
> > > >5Handle recursive restrict pointer in
> > > >create_variable_info_for_1
> > > >6Handle restrict struct fields recursively
> > > >
> > > >Currently doing bootstrap and regtest on x86_64.
> > > >
> > > >I'll repost the patch series in reply to this message.

> > >
> > >This patch gets rid of this bit of code in intra_create_variable_infos:
> > >...
> > >if (restrict_pointer_p)
> > >  make_constraint_from_global_restrict (p, "PARM_RESTRICT");
> > >else
> > >..
> > >
> > >I already proposed to remove it here (
> > >https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02426.html  ) but there is a
> > >problem with that approach: It can happen that restrict_pointer_p is true,
> > >but
> > >p->only_restrict_pointers is false. This happens with fipa-pta, when
> > >create_function_info_for created a varinfo for the parameter before
> > >intra_create_variable_infos was called.
> > >
> > >The patch handles that case now by setting p->only_restrict_pointers.

> >
> >Hmm, but ... restrict only has an effect in non-IPA mode.

>
>Indeed, I also realized that by now.
>
>I wrote attached patch to make that explicit and simplify fipa-pta.
>
>OK for trunk if bootstrap and reg-test succeeds?


First, there was an error in the patch, it tested for flag_ipa_pta (so 
it also affected ealias), but it was supposed to test for in_ipa mode. 
That is fixed in attached version.


> I don't see the patch simplifies anything but only removes spurious
> settings by adding IMHO redundant checks.

Consider testcase:
...
int __attribute__((noinline, noclone))
foo (int *__restrict__ a, int *__restrict__ b)
{
  *a = 1;
  *b = 2;
}

int __attribute__((noinline, noclone))
bar (int *a, int *b)
{
  foo (a, b);
}
...

The impact of this patch in the pta dump (focusing on the constraints 
bit) is:

...
 Generating constraints for foo (foo)

-foo.arg0 = &PARM_NOALIAS(20)
-PARM_NOALIAS(20) = NONLOCAL
-foo.arg1 = &PARM_NOALIAS(21)
-PARM_NOALIAS(21) = NONLOCAL
+foo.arg0 = &NONLOCAL
+foo.arg1 = &NONLOCAL
...

That's the kind of simplifications I'm trying to achieve.

Thanks,
- Tom

Don't interpret restrict in ipa_pta_execute

2015-10-28  Tom de Vries  

	* tree-ssa-structalias.c (push_fields_onto_fieldstack)
	(create_variable_info_for_1, create_variable_info_for)
	(intra_create_variable_infos): Ignore restrict if in_ipa_mode.
---
 gcc/tree-ssa-structalias.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 6f72dd3..66ff8cb 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -5384,7 +5384,8 @@ push_fields_onto_fieldstack (tree type, vec *fieldstack,
 		e.must_have_pointers = must_have_pointers_p;
 		e.may_have_pointers = true;
 		e.only_restrict_pointers
-		  = (!has_unknown_size
+		  = (!in_ipa_mode
+		 && !has_unknown_size
 		 && POINTER_TYPE_P (field_type)
 		 && TYPE_RESTRICT (field_type));
 		fieldstack->safe_push (e);
@@ -5697,7 +5698,8 @@ create_variable_info_for_1 (tree decl, const char *name, bool add_id)
   vi->fullsize = tree_to_uhwi (declsize);
   vi->size = vi->fullsize;
   vi->is_full_var = true;
-  if (POINTER_TYPE_P (TREE_TYPE (decl))
+  if (!in_ipa_mode
+	  && POINTER_TYPE_P (TREE_TYPE (decl))
 	  && TYPE_RESTRICT (TREE_TYPE (decl)))
 	vi->only_restrict_pointers = 1;
   fieldstack.release ();
@@ -5767,9 +5769,10 @@ create_variable_info_for (tree decl, const char *name, bool add_id)
 	continue;
 
   /* Mark global restrict qualified pointers.  */
-  if ((POINTER_TYPE_P (TREE_TYPE (decl))
-	   && TYPE_RESTRICT (TREE_TYPE (decl)))
-	  || vi->only_restrict_pointers)
+  if (!in_ipa_mode
+	  && ((POINTER_TYPE_P (TREE_TYPE (decl))
+	   && TYPE_RESTRICT (TREE_TYPE (decl)))
+	  || vi->only_restrict_pointers))
 	{
 	  varinfo_t rvi
 	= make_constraint_from_global_restrict (vi, "GLOBAL_RESTRICT",
@@ -5886,7 +5889,8 @@ intra_create_variable_infos (struct function *fn)
  passed-by-reference argument.  */
   for (t = DECL_ARGUMENTS (fn->decl); t; t = DECL_CHAIN (t))
 {
-  bool restrict_pointer_p = (POINTER_TYPE_P (TREE_TYPE (t))
+  bool restrict_pointer_p = (!in_ipa_mode
+ && POINTER_TYPE_P (TREE_TYPE (t))
  && TYPE_RESTRICT (TREE_TYPE (t)));
   bool recursi

Re: PATCH: Fix cross build for CXX_FOR_BUILD != CXX

2015-10-29 Thread H.J. Lu
On Thu, Oct 29, 2015 at 5:05 AM, Richard Earnshaw
 wrote:
> On 23/10/15 16:41, H.J. Lu wrote:
>> On Fri, Oct 23, 2015 at 7:23 AM, Marcus Shawcroft
>>  wrote:
>>> On 23 October 2015 at 13:34, H.J. Lu  wrote:
 On Fri, Oct 23, 2015 at 4:54 AM, Marcus Shawcroft
  wrote:
> Hi,
>
> This patch breaks the distinction between build and host. For example
> consider a configure along these lines:
>
> ./configure --host=aarch64-none-linux-gnu
> --target=aarch64-none-linux-gnu --build=x86_64-pc-linux-gnu
>>>
> HJ, could you take a look into this issue?
>

 Try this.
>>>
>>> Thanks HJ. This solves the issue I was seeing.
>>>
>>
>> I am testing this patch.  OK for trunk if there is no regression?
>>
>>
>
> Your ChangeLog should mention that configure has been regenerated.  And
> this hunk:

I checked it in with updated ChangeLog.

> @@ -7096,7 +7100,8 @@ if test x$ac_checking != x ; then
>
>  $as_echo "#define ENABLE_CHECKING 1" >>confdefs.h
>
> -  $as_echo "#define CHECKING_P 1" >>confdefs.h
> +
> +$as_echo "#define CHECKING_P 1" >>confdefs.h
>
>nocommon_flag=-fno-common
>  else
>
> appears to be spurious.

It came from autoconf.

> Otherwise OK.
>

Thanks.

-- 
H.J.


Re: [C++ Patch] PR 67845

2015-10-29 Thread Jason Merrill

OK.

Jason


Re: [Patch, committed] PR67933

2015-10-29 Thread Bill Schmidt
On Thu, 2015-10-29 at 12:18 +0100, Dominique d'Humières wrote:
> > > The recent patches on trunk to gfc_trans_allocate have fixed PR67933.
> > > I have added a testcase, which has been committed as revision 229452.
> >
> > This test case fails at runtime on powerpc64le-linux-gnu.  …
> This should be fixed after revision r229503.
> Dominique
> 

It is indeed.  Thanks!

Bill



Re: [Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-29 Thread Ilya Enkovich
On 28 Oct 22:37, Ilya Enkovich wrote:
> Seems the problem occurs in this check in expand_vector_operations_1:
> 
>   /* A scalar operation pretending to be a vector one.  */
>   if (VECTOR_BOOLEAN_TYPE_P (type)
>   && !VECTOR_MODE_P (TYPE_MODE (type))
>   && TYPE_MODE (type) != BLKmode)
> return;
> 
> This is to filter out scalar operations on boolean vectors.
> The problem here is that TYPE_MODE (type) doesn't return
> V4SImode assigned to the type but calls vector_type_mode
> instead which tries to find an integer mode for it and returns
> TImode. This causes function exit and we don't expand vector
> comparison.
> 
> Suppose simple option to fix it is to change default get_mask_mode
> hook to return BLKmode in case chosen integer vector mode is not
> vector_mode_supported_p.
> 
> Thanks,
> Ilya
> 

Here is a patch which fixes the problem on ARM (and on i386 with -mno-sse 
also).  I checked it fixes the problem on ARM and also bootstrapped and checked 
it on x86_64-unknown-linux-gnu.  Is it OK?

Thanks,
Ilya
--
gcc/

2015-10-29  Ilya Enkovich  

* targhooks.c (default_get_mask_mode): Use BLKmode in
case target doesn't support required vector mode.
* stor-layout.c (layout_type): Check for BLKmode.


diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 58ecd7b..ae7d6fb 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2185,7 +2185,8 @@ layout_type (tree type)
TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
 TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
/* Several boolean vector elements may fit in a single unit.  */
-   if (VECTOR_BOOLEAN_TYPE_P (type))
+   if (VECTOR_BOOLEAN_TYPE_P (type)
+   && type->type_common.mode != BLKmode)
  TYPE_SIZE_UNIT (type)
= size_int (GET_MODE_SIZE (type->type_common.mode));
else
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index c39f266..d378864 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1095,10 +1095,16 @@ default_get_mask_mode (unsigned nunits, unsigned 
vector_size)
   unsigned elem_size = vector_size / nunits;
   machine_mode elem_mode
 = smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);
+  machine_mode vector_mode;
 
   gcc_assert (elem_size * nunits == vector_size);
 
-  return mode_for_vector (elem_mode, nunits);
+  vector_mode = mode_for_vector (elem_mode, nunits);
+  if (VECTOR_MODE_P (vector_mode)
+  && !targetm.vector_mode_supported_p (vector_mode))
+vector_mode = BLKmode;
+
+  return vector_mode;
 }
 
 /* By default, the cost model accumulates three separate costs (prologue,


Re: [PATCH, 1/6] Simplify constraint handling

2015-10-29 Thread Richard Biener
On Thu, 29 Oct 2015, Tom de Vries wrote:

> On 29/10/15 12:13, Richard Biener wrote:
> > On Wed, 28 Oct 2015, Tom de Vries wrote:
> > 
> > > >On 28/10/15 16:35, Richard Biener wrote:
> > > > > >On Tue, 27 Oct 2015, Tom de Vries wrote:
> > > > > >
> > > > > > > >On 27/10/15 13:24, Tom de Vries wrote:
> > > > > > > > > >Thinking it over a bit more, I realized the constraint
> > > > > > handling started
> > > > > > > > > >to be messy. I've reworked the patch series to simplify that
> > > > > > first.
> > > > > > > > > >
> > > > > > > > > >1Simplify constraint handling
> > > > > > > > > >2Rename make_restrict_var_constraints to
> > > > > > > > > >make_param_constraints
> > > > > > > > > >3Add recursion to make_param_constraints
> > > > > > > > > >4Add handle_param parameter to
> > > > > > create_variable_info_for_1
> > > > > > > > > >5Handle recursive restrict pointer in
> > > > > > > > > >create_variable_info_for_1
> > > > > > > > > >6Handle restrict struct fields recursively
> > > > > > > > > >
> > > > > > > > > >Currently doing bootstrap and regtest on x86_64.
> > > > > > > > > >
> > > > > > > > > >I'll repost the patch series in reply to this message.
> > > > > > > >
> > > > > > > >This patch gets rid of this bit of code in
> > > > > intra_create_variable_infos:
> > > > > > > >...
> > > > > > > >if (restrict_pointer_p)
> > > > > > > > make_constraint_from_global_restrict (p, "PARM_RESTRICT");
> > > > > > > >else
> > > > > > > >..
> > > > > > > >
> > > > > > > >I already proposed to remove it here (
> > > > > > > >https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02426.html  ) but
> > > > > there is a
> > > > > > > >problem with that approach: It can happen that restrict_pointer_p
> > > > > is true,
> > > > > > > >but
> > > > > > > >p->only_restrict_pointers is false. This happens with fipa-pta,
> > > > > when
> > > > > > > >create_function_info_for created a varinfo for the parameter
> > > > > before
> > > > > > > >intra_create_variable_infos was called.
> > > > > > > >
> > > > > > > >The patch handles that case now by setting
> > > > > p->only_restrict_pointers.
> > > > > >
> > > > > >Hmm, but ... restrict only has an effect in non-IPA mode.
> > > >
> > > >Indeed, I also realized that by now.
> > > >
> > > >I wrote attached patch to make that explicit and simplify fipa-pta.
> > > >
> > > >OK for trunk if bootstrap and reg-test succeeds?
> 
> First, there was an error in the patch, it tested for flag_ipa_pta (so it also
> affected ealias), but it was supposed to test for in_ipa mode. That is fixed
> in attached version.
> 
> > I don't see the patch simplifies anything but only removes spurious
> > settings by adding IMHO redundant checks.
> 
> Consider testcase:
> ...
> int __attribute__((noinline, noclone))
> foo (int *__restrict__ a, int *__restrict__ b)
> {
>   *a = 1;
>   *b = 2;
> }
> 
> int __attribute__((noinline, noclone))
> bar (int *a, int *b)
> {
>   foo (a, b);
> }
> ...
> 
> The impact of this patch in the pta dump (focusing on the constraints bit) is:
> ...
>  Generating constraints for foo (foo)
> 
> -foo.arg0 = &PARM_NOALIAS(20)
> -PARM_NOALIAS(20) = NONLOCAL
> -foo.arg1 = &PARM_NOALIAS(21)
> -PARM_NOALIAS(21) = NONLOCAL
> +foo.arg0 = &NONLOCAL
> +foo.arg1 = &NONLOCAL
> ...
> 
> That's the kind of simplifications I'm trying to achieve.

Yes, but as I said we should refactor things to avoid calling
the intra constraints generation from the IPA path.

Richard.

> Thanks,
> - Tom
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 10:49 AM, Martin Liška  wrote:
> On 10/28/2015 04:23 PM, Richard Biener wrote:
>> On Tue, Oct 27, 2015 at 4:30 PM, Martin Liška  wrote:
>>> On 10/27/2015 03:49 PM, Richard Biener wrote:
 On Tue, Oct 27, 2015 at 1:36 PM, Martin Liška  wrote:
> On 10/26/2015 02:48 PM, Richard Biener wrote:
>> On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
>>> On 10/21/2015 04:06 PM, Richard Biener wrote:
 On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  wrote:
> On 10/21/2015 11:59 AM, Richard Biener wrote:
>> On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  
>> wrote:
>>> On 10/20/2015 03:39 PM, Richard Biener wrote:
 On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  
 wrote:
> Hello.
>
> As part of upcoming merge of HSA branch, we would like to have 
> possibility to terminate
> pass manager after execution of the HSA generation pass. The HSA 
> back-end is implemented
> as a tree pass that directly emits HSAIL from gimple tree 
> representation. The pass operates
> on clones created by HSA IPA pass and the pass manager should 
> stop execution of further
> RTL passes.
>
> Suggested patch survives bootstrap and regression tests on 
> x86_64-linux-pc.
>
> What do you think about it?

 Are you sure it works this way?

 Btw, you will miss executing of all the cleanup passes that will
 eventually free memory
 associated with the function.  So I'd rather support a
 TODO_discard_function which
 should basically release the body from the cgraph.
>>>
>>> Hi.
>>>
>>> Agree with you that I should execute all TODOs, which can be easily 
>>> done.
>>> However, if I just try to introduce the suggested TODO and handle 
>>> it properly
>>> by calling cgraph_node::release_body, then for instance 
>>> fn->gimple_df, fn->cfg are
>>> released and I hit ICEs on many places.
>>>
>>> Stopping the pass manager looks necessary, or do I miss something?
>>
>> "Stopping the pass manager" is necessary after 
>> TODO_discard_function, yes.
>> But that may be simply done via a has_body () check then?
>
> Thanks, there's second version of the patch. I'm going to start 
> regression tests.

 As release_body () will free cfun you should pop_cfun () before
 calling it (and thus
>>>
>>> Well, release_function_body calls both push & pop, so I think calling 
>>> pop
>>> before cgraph_node::release_body is not necessary.
>>
>> (ugh).
>>
>>> If tried to call pop_cfun before cgraph_node::release_body, I have cfun 
>>> still
>>> pointing to the same (function *) (which is gcc_freed, but cfun != 
>>> NULL).
>>
>> Hmm, I meant to call pop_cfun then after it (unless you want to fix the 
>> above,
>> none of the freeing functions should techincally need 'cfun', just add
>> 'fn' parameters ...).
>>
>> I expected pop_cfun to eventually set cfun to NULL if it popped the
>> "last" cfun.  Why
>> doesn't it do that?
>>
 drop its modification).  Also TODO_discard_functiuon should be only 
 set for
 local passes thus you should probably add a gcc_assert (cfun).
 I'd move its handling earlier, definitely before the ggc_collect, 
 eventually
 before the pass_fini_dump_file () (do we want a last dump of the
 function or not?).
>>>
>>> Fully agree, moved here.
>>>

 @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
  {
gcc_assert (pass->type == GIMPLE_PASS
   || pass->type == RTL_PASS);
 +
 +
 +  if (!gimple_has_body_p (current_function_decl))
 +   return;

 too much vertical space.  With popping cfun before releasing the body 
 the check
 might just become if (!cfun) and
>>>
>>> As mentioned above, as release body is symmetric (calling push & pop), 
>>> the suggested
>>> guard will not work.
>>
>> I suggest to fix it.  If it calls push/pop it should leave with the
>> original cfun, popping again
>> should make it NULL?
>>

 @@ -2409,7 +2426,7 @@ execute_pass_list (function *fn, opt_pass *pass)
  {
push_cfun (fn);
execute_pass_list_1 (pass);
 -  if (fn->cfg)
 +  if (gimple_has_body_p (current_function_decl) && fn->cfg)
  {
free_dominance_info (CDI_DOMINATORS);
free_dominance_info (CD

Re: [PATCH v2] [PR debug/67192] Fix C loops' back-jump location

2015-10-29 Thread Andreas Arnez
Ping:

  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02393.html

On Fri, Oct 23 2015, Andreas Arnez wrote:

> After parsing an unconditional "while"- or "for"-loop, the C front-end
> generates a backward-goto statement and implicitly sets its location to
> the current input_location.  But in some cases the parser peeks ahead
> first, such that input_location already points to the line after the
> loop and the generated backward-goto gets the wrong line number.
>
> One way this can occur is with a loop body consisting of an "if"
> statement, because then the parser peeks for an optional "else" before
> finishing the loop.
>
> Another way occurs since r223098 ("Implement -Wmisleading-indentation"),
> even with a loop body enclosed in braces.  This is because the check for
> misleading indentation always peeks ahead one token as well.
>
> This patch adds a new parameter to c_finish_loop that expclitly
> specifies the location to be used for the loop iteration.  All calls to
> c_finish_loop are adjusted accordingly.
>
> gcc/c/ChangeLog:
>
>   PR debug/67192
>   * c-typeck.c (c_finish_loop): Replace implicit use of
>   input_location by new parameter iter_locus.
>   * c-tree.h (c_finish_loop): Adjust prototype.
>   * c-array-notation.c (fix_builtin_array_notation_fn): Explicitly
>   pass input_location as the new parameter to c_finish_loop.
>   (build_array_notation_expr): Likewise.
>   (fix_conditional_array_notations_1): Likewise.
>   (fix_array_notation_expr): Likewise.
>   (fix_array_notation_call_expr): Likewise.
>   * c-parser.c (c_parser_while_statement): Choose iter_locus
>   depending on whether the loop body is enclosed in braces, and pass
>   it to c_finish_loop.
>   (c_parser_for_statement): Likewise.
>   (c_parser_do_statement): Use the final semicolon's location for
>   iter_locus and pass it to c_finish_loop.
>
> gcc/testsuite/ChangeLog:
>
>   PR debug/67192
>   * gcc.dg/guality/pr67192.c: New test.



[Patch] [x86_64] libgcc changes to add znver1

2015-10-29 Thread Kumar, Venkataramanan
Hi Uros,

As per your comments in 
https://gcc.gnu.org/ml/gcc-patches/2015-09/msg02326.html  please find the patch 
that also adds changes to libgcc.

It was bootstrapped and regressed tested on x86_64. 

Ok for trunk?

Change logs 
gcc/ChangeLog

2015-10-29  Venkataramanan Kumar  

   * config/i386/i386.c (get_builtin_code_for_version): Set priority for
   PROCESSOR_ZNVER1.
   (enum processor_model): Add M_AMDFAM17H_znver1.
   (struct arch_names_table): Likewise.
   * doc/extend.texi: ADD znver1.

libgcc/ChangeLog
2015-10-12  Venkataramanan kumar  

   * config/i386/cpuinfo.c (enum processor_types): Add AMDFAM17H.
   (processor_subtypes): Add znver1.
   (get_amd_cpu): Detect znver1.


Regards,
Venkat.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 82fd054..a52b313 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -35898,6 +35898,10 @@ get_builtin_code_for_version (tree decl, tree 
*predicate_list)
  arg_str = "bdver4";
  priority = P_PROC_AVX2;
  break;
+   case PROCESSOR_ZNVER1:
+ arg_str = "znver1";
+ priority = P_PROC_AVX2;
+ break;
}
}
 
@@ -36808,6 +36812,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
 M_AMDFAM15H_BDVER2,
 M_AMDFAM15H_BDVER3,
 M_AMDFAM15H_BDVER4,
+M_AMDFAM17H_ZNVER1,
 M_INTEL_COREI7_IVYBRIDGE,
 M_INTEL_COREI7_HASWELL,
 M_INTEL_COREI7_BROADWELL,
@@ -36850,6 +36855,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
   {"bdver3", M_AMDFAM15H_BDVER3},
   {"bdver4", M_AMDFAM15H_BDVER4},
   {"btver2", M_AMD_BTVER2},
+  {"znver1", M_AMDFAM17H_ZNVER1},
 };
 
   static struct _isa_names_table
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index fdb1547..7a05c27 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -16995,6 +16995,9 @@ AMD Family 15h Bulldozer version 4.
 
 @item btver2
 AMD Family 16h CPU.
+
+@item znver1
+AMD Family 17h CPU.
 @end table
 
 Here is an example:
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 1313ca3..b0ebfcf 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -59,6 +59,7 @@ enum processor_types
   INTEL_KNL,
   AMD_BTVER1,
   AMD_BTVER2,  
+  AMDFAM17H,
   CPU_TYPE_MAX
 };
 
@@ -74,6 +75,7 @@ enum processor_subtypes
   AMDFAM15H_BDVER2,
   AMDFAM15H_BDVER3,
   AMDFAM15H_BDVER4,
+  AMDFAM17H_ZNVER1,
   INTEL_COREI7_IVYBRIDGE,
   INTEL_COREI7_HASWELL,
   INTEL_COREI7_BROADWELL,
@@ -177,6 +179,12 @@ get_amd_cpu (unsigned int family, unsigned int model)
 case 0x16:
   __cpu_model.__cpu_type = AMD_BTVER2;
   break;
+case 0x17:
+  __cpu_model.__cpu_type = AMDFAM17H;
+  /* AMD family 17h version 1.  */
+  if (model <= 0x1f)
+   __cpu_model.__cpu_subtype = AMDFAM17H_ZNVER1;
+  break;
 default:
   break;
 }


Re: [PATCH] New attribute to create target clones

2015-10-29 Thread Evgeny Stupachenko
On Mon, Oct 26, 2015 at 6:50 PM, Jeff Law  wrote:
> On 10/12/2015 05:35 PM, Evgeny Stupachenko wrote:
>>
>> Hi All,
>>
>> Here is a new version of patch (attached).
>> Bootstrap and make check are in progress (all new tests passed).
>>
>> New test case g++.dg/ext/mvc4.C fails with ICE, when options lower
>> than "-mavx" are passed.
>> However it has the same behavior if "target_clones" attribute is
>> replaced by 2 corresponding "target" attributes.
>> I've filed PR67946 on this:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67946
>>
>> Thanks,
>> Evgeny
>>
>> ChangeLog:
>>
>> 2015-10-13  Evgeny Stupachenko
>> gcc/
>>  * Makefile.in (OBJS): Add multiple_target.o.
>>  * attrib.c (make_attribute): Moved from config/i386/i386.c
>>  * config/i386/i386.c (make_attribute): Deleted.
>>  * multiple_target.c (make_attribute): New.
>>  (create_dispatcher_calls): Ditto.
>>  (get_attr_len): Ditto.
>>  (get_attr_str): Ditto.
>>  (is_valid_asm_symbol): Ditto.
>>  (create_new_asm_name): Ditto.
>>  (create_target_clone): Ditto.
>>  (expand_target_clones): Ditto.
>>  (ipa_target_clone): Ditto.
>>  (ipa_dispatcher_calls): Ditto.
>>  * passes.def (pass_target_clone): Two new ipa passes.
>>  * tree-pass.h (make_pass_target_clone): Ditto.
>>
>> gcc/c-family
>>  * c-common.c (handle_target_clones_attribute): New.
>>  * (c_common_attribute_table): Add handle_target_clones_attribute.
>>  * (handle_always_inline_attribute): Add check on target_clones
>>  attribute.
>>  * (handle_target_attribute): Ditto.
>>
>> gcc/testsuite
>>  * gcc.dg/mvc1.c: New test for multiple targets cloning.
>>  * gcc.dg/mvc2.c: Ditto.
>>  * gcc.dg/mvc3.c: Ditto.
>>  * gcc.dg/mvc4.c: Ditto.
>>  * gcc.dg/mvc5.c: Ditto.
>>  * gcc.dg/mvc6.c: Ditto.
>>  * gcc.dg/mvc7.c: Ditto.
>>  * g++.dg/ext/mvc1.C: Ditto.
>>  * g++.dg/ext/mvc2.C: Ditto.
>>  * g++.dg/ext/mvc3.C: Ditto.
>>  * g++.dg/ext/mvc4.C: Ditto.
>>
>> gcc/doc
>>  * doc/extend.texi (target_clones): New attribute description.
>>
>
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 23e6a76..f9d28d1 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -3066,6 +3066,19 @@ This function attribute make a stack protection of
>> the function if
>>   flags @option{fstack-protector} or @option{fstack-protector-strong}
>>   or @option{fstack-protector-explicit} are set.
>>
>> +@item target_clones (@var{options})
>> +@cindex @code{target_clones} function attribute
>> +The @code{target_clones} attribute is used to specify that a function is
>> to
>> +be cloned into multiple versions compiled with different target options
>> +than specified on the command line.  The supported options and
>> restrictions
>> +are the same as for @code{target}.
>
> "as for @code{target}" -> "as for the @code{target} attribute."
>
> I think that makes it a tiny bit clearer.
>
>
>
>
>> +
>> +/*  Creates target clone of NODE.  */
>> +
>> +static cgraph_node *
>> +create_target_clone (cgraph_node *node, bool definition)
>> +{
>> +  cgraph_node *new_node;
>> +  if (definition)
>> +{
>> +  new_node = node->create_version_clone_with_body (vNULL, NULL,
>> +  NULL, false,
>> +  NULL, NULL,
>> +  "target_clone");
>> +  new_node->externally_visible = node->externally_visible;
>> +  new_node->address_taken = node->address_taken;
>> +  new_node->thunk = node->thunk;
>> +  new_node->alias = node->alias;
>> +  new_node->weakref = node->weakref;
>> +  new_node->cpp_implicit_alias = node->cpp_implicit_alias;
>> +  new_node->local.local = node->local.local;
>
> So do we need to explicitly clear TREE_PUBLIC here?  It also seems like
> copying externally_visible, address_taken and possibly some of those other
> fields is wrong.  The clone is going to be local to the CU, it doesn't
> inherit those properties from the original -- only the dispatcher needs to
> inherit those properties, right?
Right. That was just the way to keep the node, that I didn't clean up.
Replaced with "new_node->force_output = true;"

>
>> +
>> +
>> +  for (i = 0; i < attrnum; i++)
>> +{
>> +  char *attr = attrs[i];
>> +  cgraph_node *new_node = create_target_clone (node, defenition);
>> +  char *new_asm_name =
>> + XNEWVEC (char, strlen (old_asm_name) + strlen (attr) + 2);
>> +  create_new_asm_name (old_asm_name, attr, new_asm_name);
>
> I thought after discussions with Jan that this wasn't going to be necessary
> as cloning should create a suitable name for us?
Yes. This is not necessary. However that way we'll have the following
code in dispatcher:
cmpl$6, __cpu_model

[PATCH] Fix c-c++-common/torture/vector-compare-1.c on powerpc

2015-10-29 Thread Ilya Enkovich
Hi,

This patches powerpc fails for c-c++-common/torture/vector-compare-1.c test.  
The problem is that vector comparison lowering produces vector of 0s and 1s 
instead of 0s and -1s.  It doesn't matter if it usage is also lowered (like 
happens on ARM and i386 with -mno-sse) byt on powerpc we may have comparison of 
doubles be lowered but following VEC_COND_EXPR not lowered.  It causes wrong 
VEC_COND_EXPR result.  i checked this patch fixes the test.  Full regression 
testing on powerpc64le-unknown-linux-gnu is in progress.  OK if no regression?

Thanks,
Ilya
--
gcc/

2015-10-29  Ilya Enkovich  

* tree-vect-generic.c (do_compare): Use -1 for true
result instead of 1.


diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index d0a4e0f..0b60b15 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -161,10 +161,27 @@ static tree
 do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
tree bitpos, tree bitsize, enum tree_code code, tree type)
 {
+  tree stype = TREE_TYPE (type);
+
   a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
   b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
 
-  return gimplify_build2 (gsi, code, TREE_TYPE (type), a, b);
+  if (TYPE_PRECISION (stype) > 1)
+{
+  tree cst_false = build_zero_cst (stype);
+  tree cst_true;
+  tree cmp;
+
+  if (TYPE_UNSIGNED (stype))
+   cst_true = TYPE_MAXVAL (stype);
+  else
+   cst_true = build_minus_one_cst (stype);
+
+  cmp = build2 (code, boolean_type_node, a, b);
+  return gimplify_build3 (gsi, COND_EXPR, stype, cmp, cst_true, cst_false);
+}
+
+  return gimplify_build2 (gsi, code, stype, a, b);
 }
 
 /* Expand vector addition to scalars.  This does bit twiddling


Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

2015-10-29 Thread Marcus Shawcroft
On 28 October 2015 at 10:07, Kyrill Tkachov  wrote:
> Hi all,
>
> This RTL checking error occurs on aarch64 in aarch_accumulator_forwarding
> when processing an msubsi insn
> with subregs:
> (insn 15 14 16 3 (set (reg/v:SI 78 [ i ])
> (minus:SI (subreg:SI (reg/v:DI 76 [ aul ]) 0)
> (mult:SI (subreg:SI (reg:DI 83) 0)
> (subreg:SI (reg:DI 75 [ _20 ]) 0 schedice.c:10 357
> {*msubsi}
>
> The register_operand predicate for that pattern allows subregs (I think
> correctly).
> The code in aarch_accumulator_forwarding doesn't take that into account and
> ends up
> taking a REGNO of a SUBREG, causing a checking error.
>
> This patch fixes that by stripping the subregs off the accumulator rtx
> before
> checking that the inner expression is a REG and taking its REGNO.
>
> The testcase now works fine with an aarch64-none-elf toolchain configure for
> RTL checking.
>
> The testcase is taken verbatim from the BZ entry for PR 68088.
> Since this function is shared between arm and aarch64 I've bootstrapped and
> tested it on both
> and I'll need ok's for both ports.
>
> Ok for trunk?

rtl.h exposes reg_or_subregno() already doesn't that do what we need here?

The test case is not aarch64 specific therefore I think convention is
that it should go into a generic directory.

Cheers
/Marcus


Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

2015-10-29 Thread Kyrill Tkachov

Hi Marcus,

On 29/10/15 13:46, Marcus Shawcroft wrote:

On 28 October 2015 at 10:07, Kyrill Tkachov  wrote:

Hi all,

This RTL checking error occurs on aarch64 in aarch_accumulator_forwarding
when processing an msubsi insn
with subregs:
(insn 15 14 16 3 (set (reg/v:SI 78 [ i ])
 (minus:SI (subreg:SI (reg/v:DI 76 [ aul ]) 0)
 (mult:SI (subreg:SI (reg:DI 83) 0)
 (subreg:SI (reg:DI 75 [ _20 ]) 0 schedice.c:10 357
{*msubsi}

The register_operand predicate for that pattern allows subregs (I think
correctly).
The code in aarch_accumulator_forwarding doesn't take that into account and
ends up
taking a REGNO of a SUBREG, causing a checking error.

This patch fixes that by stripping the subregs off the accumulator rtx
before
checking that the inner expression is a REG and taking its REGNO.

The testcase now works fine with an aarch64-none-elf toolchain configure for
RTL checking.

The testcase is taken verbatim from the BZ entry for PR 68088.
Since this function is shared between arm and aarch64 I've bootstrapped and
tested it on both
and I'll need ok's for both ports.

Ok for trunk?

rtl.h exposes reg_or_subregno() already doesn't that do what we need here?


reg_or_subregno assumes that what it's passed is REG or a SUBREG.
It will ICE on any other rtx. Here I want to strip the subreg if it is
a subreg, but leave it as it is otherwise.



The test case is not aarch64 specific therefore I think convention is
that it should go into a generic directory.


Ok, I'll put it in gcc.dg/

Thanks,
Kyrill



Cheers
/Marcus





Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

2015-10-29 Thread Marcus Shawcroft
On 29 October 2015 at 13:50, Kyrill Tkachov  wrote:

>>> Ok for trunk?
>>
>> rtl.h exposes reg_or_subregno() already doesn't that do what we need here?
>
>
> reg_or_subregno assumes that what it's passed is REG or a SUBREG.
> It will ICE on any other rtx. Here I want to strip the subreg if it is
> a subreg, but leave it as it is otherwise.

OK, I follow.

>> The test case is not aarch64 specific therefore I think convention is
>> that it should go into a generic directory.
>
>
> Ok, I'll put it in gcc.dg/


OK with the test case moved. Thanks /Marcus


Re: [PATCH], PowerPC IEEE 128-bit patch #9 (enable __float128 by default on VSX systems)

2015-10-29 Thread Segher Boessenkool
Hi!

On Tue, Oct 27, 2015 at 06:31:41PM -0400, Michael Meissner wrote:
> Index: gcc/testsuite/gcc.target/powerpc/float128-1.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/float128-1.c (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/float128-1.c (revision 0)
> @@ -0,0 +1,149 @@
> +/* { dg-do run { target { powerpc*-*-linux* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
> +/* { dg-skip-if "" { powerpc*-*-*spe* } { "*" } { "" } } */

Some of this is redundant.  Should this really be linux-only?
The second line doesn't do anything then.  { "*" } { "" } isn't
needed either.

In general, please only add dg-skip-if if you know it is needed.

There probably should be an effective-target for float128.


Segher


[gomp4] revert num_gangs, num_workers, vector_length and num_threads parser changes in c/c++

2015-10-29 Thread Cesar Philippidis
In gomp-4_0-branch, we've tried to consolidate the parsing all of the
clauses of the form

  foo (int-expression)

into a single c*_parser_omp_positive_int_clause function. At the time,
such clauses included num_gangs, num_workers, vector_length and
num_threads. Looking at OpenMP 4.5, there are additional candidates for
this function, specifically num_tasks, grainsize, priority and hint.
With that in mind, parser support for all of the aforementioned clauses
is already present in trunk, so I'll revert these change in
gomp-4_0-branch since they add no functionality. We might revisit a
similar patch if OpenACC adds new clauses of this form in the future.

I've applied this patch to gomp-4_0-branch.

Cesar


[PATCH] Fix PR68142

2015-10-29 Thread Richard Biener

This avoids simplifying 2 * (a * (__INT_MAX__/2 + 1))
to a * INT_MIN which introduces undefined overflow for a == -1.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-10-29  Richard Biener  

PR middle-end/68142
* fold-const.c (extract_muldiv_1): Avoid introducing undefined
overflow.

* c-c++-common/ubsan/pr68142.c: New testcase.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 229517)
--- gcc/fold-const.c(working copy)
*** extract_muldiv_1 (tree t, tree c, enum t
*** 6008,6015 
 or (for divide and modulus) if it is a multiple of our constant.  */
if (code == MULT_EXPR
  || wi::multiple_of_p (t, c, TYPE_SIGN (type)))
!   return const_binop (code, fold_convert (ctype, t),
!   fold_convert (ctype, c));
break;
  
  CASE_CONVERT: case NON_LVALUE_EXPR:
--- 6015,6031 
 or (for divide and modulus) if it is a multiple of our constant.  */
if (code == MULT_EXPR
  || wi::multiple_of_p (t, c, TYPE_SIGN (type)))
!   {
! tree tem = const_binop (code, fold_convert (ctype, t),
! fold_convert (ctype, c));
! /* If the multiplication overflowed to INT_MIN then we lost sign
!information on it and a subsequent multiplication might
!spuriously overflow.  See PR68142.  */
! if (TREE_OVERFLOW (tem)
! && wi::eq_p (tem, wi::min_value (TYPE_PRECISION (ctype), SIGNED)))
!   return NULL_TREE;
! return tem;
!   }
break;
  
  CASE_CONVERT: case NON_LVALUE_EXPR:
Index: gcc/testsuite/c-c++-common/ubsan/pr68142.c
===
*** gcc/testsuite/c-c++-common/ubsan/pr68142.c  (revision 0)
--- gcc/testsuite/c-c++-common/ubsan/pr68142.c  (working copy)
***
*** 0 
--- 1,31 
+ /* { dg-do run } */
+ /* { dg-options "-fsanitize=undefined -fsanitize-undefined-trap-on-error" } */
+ 
+ int __attribute__((noinline,noclone))
+ h(int a)
+ {
+   return 2 * (a * (__INT_MAX__/2 + 1));
+ }
+ int __attribute__((noinline,noclone))
+ i(int a)
+ {
+   return (2 * a) * (__INT_MAX__/2 + 1);
+ }
+ int __attribute__((noinline,noclone))
+ j(int a, int b)
+ {
+   return (b * a) * (__INT_MAX__/2 + 1);
+ }
+ int __attribute__((noinline,noclone))
+ k(int a, int b)
+ {
+   return (2 * a) * b;
+ }
+ int main()
+ {
+   volatile int tem = h(-1);
+   tem = i(-1);
+   tem = j(-1, 2);
+   tem = k(-1, __INT_MAX__/2 + 1);
+   return 0;
+ }


Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

2015-10-29 Thread Kyrill Tkachov


On 29/10/15 14:00, Marcus Shawcroft wrote:

On 29 October 2015 at 13:50, Kyrill Tkachov  wrote:


Ok for trunk?

rtl.h exposes reg_or_subregno() already doesn't that do what we need here?


reg_or_subregno assumes that what it's passed is REG or a SUBREG.
It will ICE on any other rtx. Here I want to strip the subreg if it is
a subreg, but leave it as it is otherwise.

OK, I follow.


The test case is not aarch64 specific therefore I think convention is
that it should go into a generic directory.


Ok, I'll put it in gcc.dg/


OK with the test case moved. Thanks /Marcus


Thanks, but I'd like to do a slight respin.
The testcase is moved to gcc.dg but I also avoid creating the new
helper function and just do the SUBREG extraction once at the very end.
This makes the patch smaller.

Since you're ok with the approach and this revision is logically equivalent,
I just need an ok from an arm perspective.

Thanks,
Kyrill

2015-10-29  Kyrylo Tkachov  

PR target/68088
* config/arm/aarch-common.c (aarch_accumulator_forwarding): Strip
subregs from accumulator and make sure it's a register.

2015-10-29  Kyrylo Tkachov  

PR target/68088
* gcc.dg/pr68088_1.c: New test.

commit aa4df340968b330544edbbb8ea706f2d56011381
Author: Kyrylo Tkachov 
Date:   Tue Oct 27 11:42:19 2015 +

[ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index a940a02..e6668d5 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -460,6 +460,12 @@ aarch_accumulator_forwarding (rtx_insn *producer, rtx_insn *consumer)
 	return 0;
 }
 
+  if (GET_CODE (accumulator) == SUBREG)
+accumulator = SUBREG_REG (accumulator);
+
+  if (!REG_P (accumulator))
+return 0;
+
   return (REGNO (dest) == REGNO (accumulator));
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr68088_1.c b/gcc/testsuite/gcc.dg/pr68088_1.c
new file mode 100644
index 000..49c6aa1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68088_1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void bar (unsigned long);
+
+void
+foo (unsigned long aul, unsigned m, unsigned i)
+{
+  while (1)
+{
+  aul += i;
+  i = aul % m;
+  bar (aul);
+}
+}


Re: [ARM] libgcc: include crtfastmath.o

2015-10-29 Thread Christophe Lyon
On 29 October 2015 at 11:07, Ramana Radhakrishnan
 wrote:
> On Thu, Oct 29, 2015 at 9:25 AM, Christophe Lyon
>  wrote:
>> Hi,
>>
>> On arm*-*-eabi*, arm*-*-symbianelf*, arm*-*-rtems* targets,
>> crtfastmath.o is not part of $extra_parts, which means that it is not
>> built, even though the make build rule is present via the inclusion of
>> t-crtfm.
>>
>> The impact is that the link fails when using -ffast-math.
>>
>> The attached patch adds crtfastmath.o to $extra_parts, for
>> arm*-*-eabi*, arm*-*-symbianelf*, arm*-*-rtems* targets.
>>
>> I tested on arm-none-eabi, and I am not sure if this is correct for
>> symbian/rtems.
>>
>> OK? Or should I restrict the change to arm*-*-eabi* ?
>
>
> I don't see how they are inappropriate on those targets as
> crtfastmath.c only set flush to zero in with -ffast-math when the
> target architecture has floating point hardware and are only brought
> in if the link spec asks for it (which it does only with -ffast-math)
>
> Ok if no regressions.
Now committed.
For the record, I see +2500 PASS on arm-none-eabi.

Christophe.


>
> regards
> Ramana
>
>>
>> Thanks,
>>
>> Christophe.


Re: [PATCH v2] [PR debug/67192] Fix C loops' back-jump location

2015-10-29 Thread Bernd Schmidt

On 10/23/2015 11:14 AM, Andreas Arnez wrote:

This patch adds a new parameter to c_finish_loop that expclitly
specifies the location to be used for the loop iteration.  All calls to
c_finish_loop are adjusted accordingly.


I think the general principle of this is probably ok.


+  bool is_braced = c_parser_next_token_is (parser, CPP_OPEN_BRACE);
body = c_parser_c99_block_statement (parser);
+  location_t iter_loc = is_braced ? input_location : loc;

token_indent_info next_tinfo
  = get_token_indent_info (c_parser_peek_token (parser));
warn_for_misleading_indentation (while_tinfo, body_tinfo, next_tinfo);

-  c_finish_loop (loc, cond, NULL, body, c_break_label, c_cont_label, true);
+  c_finish_loop (loc, iter_loc, cond, NULL, body, c_break_label,
+c_cont_label, true);


I'm not entirely sure I understand what the is_braced logic is trying to 
achieve. I tried the patch out with the debugger on the testcase you 
provided, and I get slightly strange behaviour in f2:


Starting program: /local/src/egcs/bwork-git/gcc/a.out

Breakpoint 7, f2 () at pr67192.c:32
32  if (last ())
(gdb) cont
Continuing.

Breakpoint 6, f2 () at pr67192.c:31
31while (1)
(gdb)

i.e. the breakpoint on the code inside the loop is reached before the 
while statement itself. This may be the expected behaviour with your 
patch, but I'm not sure it's really desirable for debugging.


In f4 placing a breakpoint on the while (1) just places it on the if 
(last ()) line. The same behaviour appears to occur for both while loops 
with the system gcc-4.8.5. So I think there are some strange 
inconsistencies going on here, and for debugging the old behaviour may 
well be better.


I'd suggest you commit your original patch to fix the misleading-indent 
problem while we sort this out.



Bernd


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #01

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:26 PM, Michael Meissner
 wrote:
> This patch is the rs6000.h changes. It makes the IEEE 128-bit floating point
> type that can go in vector registers a 'vector' type, so that the address code
> in rs6000.c that determines whether to use VSX addressing works. In addition, 
> I
> made IEEE 128-bit floating point tie with vectors and not with other scalar
> floats.
>
> I have built the compiler with this patch applied, and run bootstrap and
> regression tests with all of the patches installed.  Is it ok to install on 
> the
> trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.h (ALTIVEC_VECTOR_MODE): Add IEEE 128-bit
> floating point modes that can go in vector registers.
> (MODES_TIEABLE_P): Move tests for vector modes before tests for
> scalar floating point, so that IEEE 128-bit floating point that
> can go in vector registers bind with vectors and not FP.
> (struct rs6000_args): Add libcall field.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #04

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:36 PM, Michael Meissner
 wrote:
> This patch allows SUBREG's for the reg_or_indexed_operand, which is used when
> you have an integral value in a float/vector register, and you want to move 
> the
> value (either via direct move on power8, or via store).
>
> I have built the compiler with this patch and the previous subpatches (1-3).  
> I
> have bootstrapped the compiler with all 16 subpatches installed, and there 
> were
> no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/predicates.md (reg_or_indexed_operand): Allow
> SUBREGs.

Okay.

Thanks, David


Re: [PATCH] Fix c-c++-common/torture/vector-compare-1.c on powerpc

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 2:42 PM, Ilya Enkovich  wrote:
> Hi,
>
> This patches powerpc fails for c-c++-common/torture/vector-compare-1.c test.  
> The problem is that vector comparison lowering produces vector of 0s and 1s 
> instead of 0s and -1s.  It doesn't matter if it usage is also lowered (like 
> happens on ARM and i386 with -mno-sse) byt on powerpc we may have comparison 
> of doubles be lowered but following VEC_COND_EXPR not lowered.  It causes 
> wrong VEC_COND_EXPR result.  i checked this patch fixes the test.  Full 
> regression testing on powerpc64le-unknown-linux-gnu is in progress.  OK if no 
> regression?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-10-29  Ilya Enkovich  
>
> * tree-vect-generic.c (do_compare): Use -1 for true
> result instead of 1.
>
>
> diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
> index d0a4e0f..0b60b15 100644
> --- a/gcc/tree-vect-generic.c
> +++ b/gcc/tree-vect-generic.c
> @@ -161,10 +161,27 @@ static tree
>  do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
> tree bitpos, tree bitsize, enum tree_code code, tree type)
>  {
> +  tree stype = TREE_TYPE (type);
> +
>a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
>b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
>
> -  return gimplify_build2 (gsi, code, TREE_TYPE (type), a, b);
> +  if (TYPE_PRECISION (stype) > 1)
> +{
> +  tree cst_false = build_zero_cst (stype);
> +  tree cst_true;
> +  tree cmp;
> +
> +  if (TYPE_UNSIGNED (stype))
> +   cst_true = TYPE_MAXVAL (stype);
> +  else
> +   cst_true = build_minus_one_cst (stype);
> +
> +  cmp = build2 (code, boolean_type_node, a, b);
> +  return gimplify_build3 (gsi, COND_EXPR, stype, cmp, cst_true, 
> cst_false);
> +}

I think this should unconditionally produce the COND_EXPR and
build cst_true using build_all_ones_cst (stype).

Ok with that change.

Thanks,
Richard.

> +
> +  return gimplify_build2 (gsi, code, stype, a, b);
>  }
>
>  /* Expand vector addition to scalars.  This does bit twiddling


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #02

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:30 PM, Michael Meissner
 wrote:
> This patch changes the switch from -mfloat128-software and -mfloat128-none to
> be a binary switch, -mfloat128 and -mno-float128.  It also provides some of 
> the
> basic setup for IEEE types.  It also removes code I had put in a previous 
> patch
> that doesn't allow IFmode/KFmode to go in any register if -mno-float128 (this
> causes some reload problems).
>
> I have built the compiler with this patch and subpatch #1 installed.  I have
> built the compiler with all 16 subpatches and had no regressions.  Is this
> patch ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.opt (-mfloat128-*): Delete -mfloat128-none
> and -mfloat128-software switches.  Replace them with a binary
> -mfloat128 switch.
> (-mfloat128): Likewise.
>
> * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Allow
> 128-bit floating point types in GPRs, even if the appropriate
> option enabling the type was not used.
> (rs6000_debug_reg_global): Remove -mfloat128-{software,none}
> debugging.
> (rs6000_setup_reg_addr_masks): Do not allow pre-increment and
> pre-decrement on IEEE 128-bit floating point values.
> (rs6000_init_hard_regno_mode_ok): Change test for whether TFmode
> is IEEE 128-bit floating point.
> (rs6000_init_hard_regno_mode_ok): Add reload handlers for IEEE
> 128-bit floating point types that can go in vector registers.
> (rs6000_option_override_internal): Change -mfloat128-none and
> -mfloat128-software to -mfloat128, and move code to be near other
> VSX option handling.
> (rs6000_option_override_internal): Disable -mfloat128 if we don't
> have the Altivec ABI.
> (rs6000_init_builtins): Don't make TFmode use either IFmode or
> KFmode floating point nodes. Instead, have three separate nodes.
> (rs6000_scalar_mode_supported_p): Add support for IFmode to allow
> eventually moving the long double default to IEEE 128-bit floating
> point.
> (rs6000_opt_masks): Add -mfloat128.
> (struct rs6000_opt_var): Fix typo in comment.
>
> * config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add -mfloat128 as
> an option that can be turned on via -mcpu=.
>
> * config/rs6000/rs6000-opts.h (enum float128_type_t): Delete, no
> longer used.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #05

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:39 PM, Michael Meissner
 wrote:
> This patch prevents the compiler from calling the IEEE 128-bit emulation
> functions with the vector value in both GPRs and vector registers due to the
> fact that the library function did not have a prototype.
>
> I have built the compiler with this patch and the previous subpatches (1-4).  
> I
> have bootstrapped the compiler with all 16 subpatches installed, and there 
> were
> no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.c (init_cumulative_args): Initialize
> libcall field in CUMULATIVE_ARGS.
> (rs6000_function_arg): Treat library functions as if they had
> prototypes to prevent IEEE 128-bit support functions from passing
> arguments in both GPRs and vector registers.
> (rs6000_arg_partial_bytes): Likewise.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #03

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:33 PM, Michael Meissner
 wrote:
> This patch defines 3 macros to tell the user whether -mfloat128 is enabled or
> not, and whether long double is IBM extended double or IEEE 128-bit floating
> point.
>
> I have built the compiler with this patch and the previous subpatches (1 and
> 2).  I have bootstrapped the compiler with all 16 subpatches installed, and
> there were no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define
> __FLOAT128__ if -mfloat128. Define __LONG_DOUBLE_IEEE128__ if long
> double is IEEE 128-bit. Define __LONG_DOUBLE_IBM128__ if long
> double is IBM extended double.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #06

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:40 PM, Michael Meissner
 wrote:
> This patch sets up all of the emulation functions.
>
> I have built the compiler with this patch and the previous subpatches (1-4).  
> I
> have bootstrapped the compiler with all 16 subpatches installed, and there 
> were
> no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.c (rs6000_init_libfuncs): Split libfunc
> setup into 3 functions: init_float128_ibm, init_float128_ieee, and
> rs6000_init_libfuncs. If -mfloat128, add IFmode functions for all
> of the traditional names that TFmode uses for handling IEEE
> extended double. If -mfloat128, add KFmode functions for all of
> the emulation functions. If -mabi=ieeelongdouble and -mfloat128,
> make TFmode use the same emulation functions as KFmode.
> (init_float128_ibm): Likewise.
> (init_float128_ieee): Likewise.

This seems to change the logic for initializing Darwin and AIX
libfuncs.  It may not be wrong, but I am a little concerned.  I
believe that they were defined unconditionally before and now they
only are defined if TARGET_LONG_DOUBLE_128.

Am I misunderstanding something?

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #08

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:45 PM, Michael Meissner
 wrote:
> This patch adds support for using 'q' or 'Q' as a suffix for __float128
> constants, which is compatible with the existing x86_64 implmenetation of 
> their
> __float128 support.
>
> I have built the compiler with this patch and the previous subpatches (1-7).  
> I
> have bootstrapped the compiler with all 16 subpatches installed, and there 
> were
> no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.c (TARGET_C_MODE_FOR_SUFFIX): Define 'q'
> and 'Q' as the suffix to use for IEEE 128-bit floating point
> constants with -mfloat128.
> (rs6000_c_mode_for_suffix): Likewise.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #10

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:49 PM, Michael Meissner
 wrote:
> This patch is part of the support needed to properly swap IEEE 128-bit 
> floating
> point on little endian systems.  Note, you will need the rs6000.md changes for
> this to become effective.
>
> I have built the compiler with this patch and the previous subpatches (1-9).  
> I
> have bootstrapped the compiler with all 16 subpatches installed, and there 
> were
> no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): On little
> endian systems generate a ROTATE insn instead of VEC_SELECT for
> IEEE 128-bit floating point types that can go in vector
> registers.
> (chain_contains_only_swaps): Properly swap IEEE 128-bit floating
> point types that can go in vector registers on little endian
> PowerPC systems.
> (mark_swaps_for_removal): Likewise.
> (rs6000_analyze_swaps): Likewise.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #09

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:47 PM, Michael Meissner
 wrote:
> This patch is the new patch from the last submission. It sets up a hook so 
> that
> the compiler will not allow IBM extended double and IEEE 128-bit floating 
> point
> to intermix in a binary expression without using an explicit conversion.
>
> I have built the compiler with this patch and the previous subpatches (1-8).  
> I
> have bootstrapped the compiler with all 16 subpatches installed, and there 
> were
> no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.c (TARGET_INVALID_BINARY_OP): Do not allow
> inter-mixing of IEEE 128-bit floating point with IBM extended
> double floating point.
> (rs6000_invalid_binary_op): Likewise.

Okay.  Assuming that Joseph's concerns are addressed in subsequent patches.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #07

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:43 PM, Michael Meissner
 wrote:
> This patch updates to use the unordered comparison function for IEEE 128-bit
> floating point to mimic the behaviour of SFmode/DFmode using the fcmpu
> instruction.
>
> It also restructures the code to allow a future change to drop in easier.
>
> I have built the compiler with this patch and the previous subpatches (1-6).  
> I
> have bootstrapped the compiler with all 16 subpatches installed, and there 
> were
> no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.c (rs6000_generate_compare): For IEEE
> 128-bit floating point comparisons, call the unordered comparison
> function instead of the ordered comparison function.
> (rs6000_expand_float128_convert): Deal with operands that are
> memory operands. Restructure the code to use a switch statement on
> the mode. Add support for TFmode defaulting to either IBM extended
> double or IEEE 128-bit floating point. If the underlying types are
> the same, use a move instead of a conversion function.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #11

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:52 PM, Michael Meissner
 wrote:
> This patch changes the mangling for __float128.  I came to the conclusion that
> the current code is so tangled, that it would be better to use U10__float128
> rather than "e".  However, if it is felt that we should go with "e", I can go
> that way as well.
>
> I have built the compiler with this patch and the previous subpatches (1-10).
> I have bootstrapped the compiler with all 16 subpatches installed, and there
> were no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.c (rs6000_mangle_type): Use U10__float128
> for IEEE 128-bit floating point.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #12

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:57 PM, Michael Meissner
 wrote:
> This patch is the first of two rs6000.md patches to straighten out the IFmode,
> KFmode, and TFmode support.  Part of the change is to change the iterator 
> names
> to be easier to understand, using IEEE128, IBM128, and FLOAT128 as the
> iterators.  This change, and the next change go through and have parallel 
> insns
> for handling IFmode and TFmode (when -mabi=ibmlongdouble) for IBM extended
> double, and for handling KFmode and TFmode (when -mabi=ieeelongdouble).  The
> idea is to prepare the way in GCC 7.0 to change the default for long double.
>
> I have built the compiler with this patch and the previous subpatches (1-11).
> I have bootstrapped the compiler with all 16 subpatches installed, and there
> were no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.md (FLOAT128_SFDFTF): Delete iterator,
> rework IEEE 128-bit floating point insns to deal with TFmode being
> either IBM extended double or IEEE 128-bit floating point.
> (IFKF): Likewise.
> (IBM128): Update iterator to add condition that the mode is IBM
> extended double.
> (IEEE128): New iterator for IEEE 128-bit floating point.
> (TFIFKF): Rename TFIFKF iterator to FLOAT128.
> (FLOAT128): Likewise.
> (signbit2): FLOAT128_IBM_P condition test moved into IBM128
> iterator.
> (neg2): Replace TFIFKF iterator with FLOAT128. Add support
> for TFmode being IEEE 128-bit floating point. Use IEEE128 iterator
> instead of hard coding TFmode or KFmode.
> (negtf2_internal): Likewise.
> (neg2_internal): Likewise.
> (abs2): Likewise.
> (abstf2_internal): Likewise.
> (abs2_internal): Likewise.
> (ieee_128bit_neg2): Likewise.
> (ieee_128bit_neg2_internal): Likewise.
> (ieee_128bit_abs2): Likewise.
> (ieee_128bit_abs2_internal): Likewise.
> (ieee_128bit_nabs2): Likewise.
> (ieee_128bit_nabs2_internal): Likewise.
> (extendiftf2): Add explicit conversions between 128-bit floating
> point types. Drop the old conversions that had become unwieldy.
> (extend2): Likewise.
> (extendifkf2): Likewise.
> (trunc2): Likewise.
> (extendtfkf2): Likewise.
> (fix_trunc2): Likewise.
> (trunciftf2): Likewise.
> (fixuns_trunc2): Likewise.
> (truncifkf2): Likewise.
> (float2): Likewise.
> (trunckftf2): Likewise.
> (floatuns2): Likewise.
> (trunctfif2): Likewise.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #13

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 1:58 PM, Michael Meissner
 wrote:
> This patch is the second part to allow TFmode to be IBM extended double or 
> IEEE
> 128-bit floating point depending on switches.
>
> I have built the compiler with this patch and the previous subpatches (1-12).
> I have bootstrapped the compiler with all 16 subpatches installed, and there
> were no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.md (FP iterator): Allow TFmode to be IEEE
> 128-bit floating point.
> (extenddftf2): Rework 128-bit floating point conversions to
> properly handle -mabi=ieeelongdouble. Merge IFmode, TFmode, and
> KFmode expanders into one function.
> (extenddf2): Likewise.
> (extenddftf2_fprs): Likewise.
> (extenddf2_fprs): Likewise.
> (extenddftf2_vsx): Likewise.
> (extenddf2_vsx): Likewise.
> (extendsftf2): Likewise.
> (extendsf2): Likewise.
> (trunctfdf2): Likewise.
> (truncdf2): Likewise.
> (trunctfdf2_internal1): Likewise.
> (truncdf2_internal1): Likewise.
> (trunctfdf2_internal2): Likewise.
> (truncdf2_internal2): Likewise.
> (trunctfsf2): Likewise.
> (truncsf2): Likewise.
> (trunctfsf2_fprs): Likewise.
> (truncsf2_fprs): Likewise.
> (floatsit2f): Likewise.
> (floatsi2): Likewise.
> (fix_trunc_helper): Likewise.
> (fix_trunc_helper): Likewise.
> (fix_trunctfsi2): Likewise.
> (fix_truncsi2): Likewise.
> (fix_trunctfsi2_fprs): Likewise.
> (fix_truncsi2_fprs): Likewise.
> (fix_trunctfsi2_internal): Likewise.
> (fix_truncsi2_internal): Likewise.
> (fix_trunctfdi2): Likewise.
> (fix_truncdi2): Likewise.
> (fixuns_trunctf2): Likewise.
> (fixuns_trunc2): Likewise.
> (floatditf2): Likewise.
> (floatdi2): Likewise.
> (floatunstf2): Likewise.
> (floatuns): Likewise.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #14

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 2:00 PM, Michael Meissner
 wrote:
> This patch makes TFmode be fully switchable for comparisons.
>
> I have built the compiler with this patch and the previous subpatches (1-13).
> I have bootstrapped the compiler with all 16 subpatches installed, and there
> were no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * config/rs6000/rs6000.md (cmptf_internal1): Use a mode iterator
> to add support for both types (IFmode, TFmode) that support IBM
> extended double.
> (cmp_internal1): Likewise.
> (cmptf_internal2): Likewise.
> (cmp_internal2): Likewise.

Okay.

Thanks, David


Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #15

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 2:01 PM, Michael Meissner
 wrote:
> This patch adds the documentation.
>
> I have built the compiler with this patch and the previous subpatches (1-14).
> I have bootstrapped the compiler with all 16 subpatches installed, and there
> were no regressions.  Is it ok to install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * doc/extend.texi (Floating Types): Document __ibm128 and
> __float128 on PowerPC.
>
> * doc/invoke.texi (RS/6000 and PowerPC Options): Document
> -mfloat128 and -mno-float128.

Okay.

Thanks, David


Re: [PATCH] Pass manager: add support for termination of pass list

2015-10-29 Thread Martin Liška
On 10/29/2015 02:15 PM, Richard Biener wrote:
> On Thu, Oct 29, 2015 at 10:49 AM, Martin Liška  wrote:
>> On 10/28/2015 04:23 PM, Richard Biener wrote:
>>> On Tue, Oct 27, 2015 at 4:30 PM, Martin Liška  wrote:
 On 10/27/2015 03:49 PM, Richard Biener wrote:
> On Tue, Oct 27, 2015 at 1:36 PM, Martin Liška  wrote:
>> On 10/26/2015 02:48 PM, Richard Biener wrote:
>>> On Thu, Oct 22, 2015 at 1:02 PM, Martin Liška  wrote:
 On 10/21/2015 04:06 PM, Richard Biener wrote:
> On Wed, Oct 21, 2015 at 1:24 PM, Martin Liška  wrote:
>> On 10/21/2015 11:59 AM, Richard Biener wrote:
>>> On Wed, Oct 21, 2015 at 11:19 AM, Martin Liška  
>>> wrote:
 On 10/20/2015 03:39 PM, Richard Biener wrote:
> On Tue, Oct 20, 2015 at 3:00 PM, Martin Liška  
> wrote:
>> Hello.
>>
>> As part of upcoming merge of HSA branch, we would like to have 
>> possibility to terminate
>> pass manager after execution of the HSA generation pass. The HSA 
>> back-end is implemented
>> as a tree pass that directly emits HSAIL from gimple tree 
>> representation. The pass operates
>> on clones created by HSA IPA pass and the pass manager should 
>> stop execution of further
>> RTL passes.
>>
>> Suggested patch survives bootstrap and regression tests on 
>> x86_64-linux-pc.
>>
>> What do you think about it?
>
> Are you sure it works this way?
>
> Btw, you will miss executing of all the cleanup passes that will
> eventually free memory
> associated with the function.  So I'd rather support a
> TODO_discard_function which
> should basically release the body from the cgraph.

 Hi.

 Agree with you that I should execute all TODOs, which can be 
 easily done.
 However, if I just try to introduce the suggested TODO and handle 
 it properly
 by calling cgraph_node::release_body, then for instance 
 fn->gimple_df, fn->cfg are
 released and I hit ICEs on many places.

 Stopping the pass manager looks necessary, or do I miss something?
>>>
>>> "Stopping the pass manager" is necessary after 
>>> TODO_discard_function, yes.
>>> But that may be simply done via a has_body () check then?
>>
>> Thanks, there's second version of the patch. I'm going to start 
>> regression tests.
>
> As release_body () will free cfun you should pop_cfun () before
> calling it (and thus

 Well, release_function_body calls both push & pop, so I think calling 
 pop
 before cgraph_node::release_body is not necessary.
>>>
>>> (ugh).
>>>
 If tried to call pop_cfun before cgraph_node::release_body, I have 
 cfun still
 pointing to the same (function *) (which is gcc_freed, but cfun != 
 NULL).
>>>
>>> Hmm, I meant to call pop_cfun then after it (unless you want to fix the 
>>> above,
>>> none of the freeing functions should techincally need 'cfun', just add
>>> 'fn' parameters ...).
>>>
>>> I expected pop_cfun to eventually set cfun to NULL if it popped the
>>> "last" cfun.  Why
>>> doesn't it do that?
>>>
> drop its modification).  Also TODO_discard_functiuon should be only 
> set for
> local passes thus you should probably add a gcc_assert (cfun).
> I'd move its handling earlier, definitely before the ggc_collect, 
> eventually
> before the pass_fini_dump_file () (do we want a last dump of the
> function or not?).

 Fully agree, moved here.

>
> @@ -2397,6 +2410,10 @@ execute_pass_list_1 (opt_pass *pass)
>  {
>gcc_assert (pass->type == GIMPLE_PASS
>   || pass->type == RTL_PASS);
> +
> +
> +  if (!gimple_has_body_p (current_function_decl))
> +   return;
>
> too much vertical space.  With popping cfun before releasing the body 
> the check
> might just become if (!cfun) and

 As mentioned above, as release body is symmetric (calling push & pop), 
 the suggested
 guard will not work.
>>>
>>> I suggest to fix it.  If it calls push/pop it should leave with the
>>> original cfun, popping again
>>> should make it NULL?
>>>
>
> @@ -2409,7 +2426,7 @@ execute_pass_list (function *fn, opt_pass *pass)
>  {
>push_cfun (fn);
>execute_pass_list_1 (pass);
> -  if (fn->cfg)
> + 

[PATCH] Remove fold () dispatch from fold_gimple_assign

2015-10-29 Thread Richard Biener

After

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 229518)
+++ gcc/gimple-fold.c   (working copy)
@@ -398,7 +398,10 @@ fold_gimple_assign (gimple_stmt_iterator
 /* If we couldn't fold the RHS, hand over to the generic
fold routines.  */
 if (result == NULL_TREE)
-  result = fold (rhs);
+ {
+   result = fold (rhs);
+   gcc_assert (result == rhs);
+ }
 
 /* Strip away useless type conversions.  Both the NON_LVALUE_EXPR
that may have been added by fold, and "useless" type

passed bootstrap and regtest on x86_64-unknown-linux-gnu I am now
testing the following.

There is still GIMPLE_TERNARY_RHS handling going via fold_ternary,
mostly for [VEC_]COND_EXPR handling not moved to match.pd
(and some others).

Richard.

2015-10-29  Richard Biener  

* gimple-fold.c (fold_gimple_assign): Do not dispatch to
fold () on single RHSs.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 229520)
+++ gcc/gimple-fold.c   (working copy)
@@ -355,8 +355,8 @@ fold_gimple_assign (gimple_stmt_iterator
return val;
  }
  }
-
  }
+
else if (TREE_CODE (rhs) == ADDR_EXPR)
  {
tree ref = TREE_OPERAND (rhs, 0);
@@ -371,6 +371,18 @@ fold_gimple_assign (gimple_stmt_iterator
else if (TREE_CODE (ref) == MEM_REF
 && integer_zerop (TREE_OPERAND (ref, 1)))
  result = fold_convert (TREE_TYPE (rhs), TREE_OPERAND (ref, 0));
+
+   if (result)
+ {
+   /* Strip away useless type conversions.  Both the
+  NON_LVALUE_EXPR that may have been added by fold, and
+  "useless" type conversions that might now be apparent
+  due to propagation.  */
+   STRIP_USELESS_TYPE_CONVERSION (result);
+
+   if (result != rhs && valid_gimple_rhs_p (result))
+ return result;
+ }
  }
 
else if (TREE_CODE (rhs) == CONSTRUCTOR
@@ -394,21 +406,6 @@ fold_gimple_assign (gimple_stmt_iterator
 
else if (DECL_P (rhs))
  return get_symbol_constant_value (rhs);
-
-/* If we couldn't fold the RHS, hand over to the generic
-   fold routines.  */
-if (result == NULL_TREE)
-  result = fold (rhs);
-
-/* Strip away useless type conversions.  Both the NON_LVALUE_EXPR
-   that may have been added by fold, and "useless" type
-   conversions that might now be apparent due to propagation.  */
-STRIP_USELESS_TYPE_CONVERSION (result);
-
-if (result != rhs && valid_gimple_rhs_p (result))
- return result;
-
-   return NULL_TREE;
   }
   break;
 



Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2), Subpatch #16

2015-10-29 Thread David Edelsohn
On Fri, Oct 23, 2015 at 2:03 PM, Michael Meissner
 wrote:
> This patch adds a test to make sure __float128 are passed and returned like
> vector objects, and not as IBM extended double did.
>
> This is the last subpatch of patch #7.  I have bootstrapped the compiler with
> all 16 subpatches installed, and there were no regressions.  Is it ok to
> install in the trunk?
>
> 2015-10-22  Michael Meissner  
>
> * gcc.target/powerpc/float128-call.c: New test for -mfloat128 on
> PowerPC.

Okay.

Thanks, David


Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-29 Thread Jan Hubicka
> 
> IMHO it was always wrong/fragile for backends to look at the actual arguments 
> to
> decide on the calling convention.  The backends should _solely_ rely on
> gimple_call_fntype and its TYPE_ARG_TYPES here.
> 
> Of course then there are varargs ... (not sure if we hit this here).

Yep, you have varargs and K&R prototypes, so it can't work this way.
> 
> But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't remember
> what exactly we gain from it (when not done on registers).

I guess gain is really limited to Ada - there are very few cases we do VCE 
otherwise.
(I think we could do more of them).  We can make useless_type_conversion 
NOP/CONVERT
only. That in fact makes quite a sense because those are types with gimple 
operations
on it.  Perhaps also VCE on vectors, but not VCE in general.

Honza
> 
> But I also don't see where we do the stripping mentioned on memory references.
> The match.pd pattern doesn't apply to memory, only in the GENERIC path
> which is guarded with exact type equality.  So I can't see where we end up
> stripping the V_C_E.
> 
> There is one bogus case still in fold-const.c:
> 
> case VIEW_CONVERT_EXPR:
>   if (TREE_CODE (op0) == MEM_REF)
> /* ???  Bogus for aligned types.  */
> return fold_build2_loc (loc, MEM_REF, type,
> TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1));
> 
>   return NULL_TREE;
> 
> that comment is only in my local tree ... (we lose alignment info that is
> on the original MEM_REF type which may be a smaller one).
> 
> Richard.
> 
> > Honza
> >>
> >>
> >>   * gnat.dg/discr44.adb: New test.
> >>
> >> --
> >> Eric Botcazou
> >
> >> -- { dg-do run }
> >> -- { dg-options "-gnatws" }
> >>
> >> procedure Discr44 is
> >>
> >>   function Ident (I : Integer) return Integer is
> >>   begin
> >> return I;
> >>   end;
> >>
> >>   type Int is range 1 .. 10;
> >>
> >>   type Str is array (Int range <>) of Character;
> >>
> >>   type Parent (D1, D2 : Int; B : Boolean) is record
> >> S : Str (D1 .. D2);
> >>   end record;
> >>
> >>   type Derived (D : Int) is new Parent (D1 => D, D2 => D, B => False);
> >>
> >>   X1 : Derived (D => Int (Ident (7)));
> >>
> >> begin
> >>   if X1.D /= 7 then
> >> raise Program_Error;
> >>   end if;
> >> end;
> >


Fix X - (X / Y) * Y in match.pd

2015-10-29 Thread Marc Glisse

Hello,

before we completely forget about it, this fixes the pattern as discussed 
around https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01459.html .


Note that, as seen in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68008#c1 , we would sometimes 
like to allow a conversion between trunc_div and mult, but that's a 
separate generalization and I did not want to think of the proper 
conditions for its validity.


Bootstrap+regtest on ppc64le-redhat-linux.

2015-10-30  Marc Glisse  

* match.pd (X-(X/Y)*Y): Properly handle conversions and commutativity.

--
Marc GlisseIndex: gcc/match.pd
===
--- gcc/match.pd(revision 229478)
+++ gcc/match.pd(working copy)
@@ -311,24 +311,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* X % -Y is the same as X % Y.  */
 (simplify
  (trunc_mod @0 (convert? (negate @1)))
  (if (!TYPE_UNSIGNED (type)
   && !TYPE_OVERFLOW_TRAPS (type)
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (trunc_mod @0 (convert @1
 
 /* X - (X / Y) * Y is the same as X % Y.  */
 (simplify
- (minus (convert1? @0) (convert2? (mult (trunc_div @0 @1) @1)))
- (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
-  && TYPE_UNSIGNED (TREE_TYPE (@0)) == TYPE_UNSIGNED (type))
-  (trunc_mod (convert @0) (convert @1
+ (minus (convert1? @2) (convert2? (mult:c (trunc_div @0 @1) @1)))
+ (if (operand_equal_p (@0, @2, 0)
+  && (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)))
+  (convert (trunc_mod @0 @1
 
 /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR,
i.e. "X % C" into "X & (C - 1)", if X and C are positive.
Also optimize A % (C << N)  where C is a power of 2,
to A & ((C << N) - 1).  */
 (match (power_of_two_cand @1)
  INTEGER_CST@1)
 (match (power_of_two_cand @1)
  (lshift INTEGER_CST@1 @2))
 (for mod (trunc_mod floor_mod)


Re: [PATCH] Fix PR67600 testcase

2015-10-29 Thread Jan Hubicka
> 
> Remove the 600k preprocessed headers.
> 
> tested on x86_64 with -m32 and -m64, committed.

Ah, thank you! I had shorter testcase but apparently managed to swap it for 
original one :(

Honza
> 
> Richard.
> 
> 2015-10-28  Richard Biener  
> 
>   * g++.dg/torture/pr67600.C: Remove preprocessed headers in favor
>   of #includes.


Require c99_runtime for builtin-convert-1.c

2015-10-29 Thread Richard Sandiford
...which was failing on *-elf.  This test is making sure that we narrow
maths calls, e.g. sinh -> sinhf, but we can only do that if we know that
the target has a full c99 libm or if we have other proof that the function
we want to generate is available.

Tested on aarch64-none-elf (where it's skipped) and aarch64-linux-gnu
(where it isn't).  OK to install?

Thanks,
Richard


gcc/testsuite/
* gcc.dg/torture/builtin-convert-1.c: Require c99_runtime.

diff --git a/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c 
b/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c
index f13d29e..eecd2b3 100644
--- a/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c
+++ b/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c
@@ -8,6 +8,7 @@
 /* { dg-do link } */
 /* { dg-options "-ffast-math" } */
 /* { dg-add-options c99_runtime } */
+/* { dg-require-effective-target c99_runtime } */
 
 #include "../builtins-config.h"
 



[committed] Remove unnecessary label from tree-call-cdce.c

2015-10-29 Thread Richard Sandiford
Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
Committed as preapproved by Richard B.

Thanks,
Richard


gcc/
* tree-call-cdce.c (shrink_wrap_one_built_in_call): Remove
unnecessary label.

diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
index 112a325..e68d03d 100644
--- a/gcc/tree-call-cdce.c
+++ b/gcc/tree-call-cdce.c
@@ -724,8 +724,6 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
   unsigned ci;
   gimple *cond_expr = NULL;
   gimple *cond_expr_start;
-  tree bi_call_label_decl;
-  gimple *bi_call_label;
 
   auto_vec conds;
   gen_shrink_wrap_conditions (bi_call, conds, &nconds);
@@ -774,11 +772,6 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
   ci++;
   gcc_assert (cond_expr && gimple_code (cond_expr) == GIMPLE_COND);
 
-  /* Now the label.  */
-  bi_call_label_decl = create_artificial_label (gimple_location (bi_call));
-  bi_call_label = gimple_build_label (bi_call_label_decl);
-  gsi_insert_before (&bi_call_bsi, bi_call_label, GSI_SAME_STMT);
-
   bi_call_in_edge0 = split_block (bi_call_bb, cond_expr);
   bi_call_in_edge0->flags &= ~EDGE_FALLTHRU;
   bi_call_in_edge0->flags |= EDGE_TRUE_VALUE;



[committed] Remove redundant variable from tree-call-cdce.c

2015-10-29 Thread Richard Sandiford
shrink_wrap_one_built_in_call had both guard_bb and guard_bb0.
It looks like an earlier version of the pass may have updated
one of the variables in the while loop, but now they're just
two names for the same thing.

Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
Committed as obvious.

Thanks,
Richard


gcc/
* tree-call-cdce.c (shrink_wrap_one_built_in_call): Remove
guard_bb0 and use guard_bb throughout.

diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
index e68d03d..ff5cd9b 100644
--- a/gcc/tree-call-cdce.c
+++ b/gcc/tree-call-cdce.c
@@ -717,7 +717,7 @@ static bool
 shrink_wrap_one_built_in_call (gcall *bi_call)
 {
   gimple_stmt_iterator bi_call_bsi;
-  basic_block bi_call_bb, join_tgt_bb, guard_bb, guard_bb0;
+  basic_block bi_call_bb, join_tgt_bb, guard_bb;
   edge join_tgt_in_edge_from_call, join_tgt_in_edge_fall_thru;
   edge bi_call_in_edge0, guard_bb_in_edge;
   unsigned tn_cond_stmts, nconds;
@@ -775,22 +775,21 @@ shrink_wrap_one_built_in_call (gcall *bi_call)
   bi_call_in_edge0 = split_block (bi_call_bb, cond_expr);
   bi_call_in_edge0->flags &= ~EDGE_FALLTHRU;
   bi_call_in_edge0->flags |= EDGE_TRUE_VALUE;
-  guard_bb0 = bi_call_bb;
+  guard_bb = bi_call_bb;
   bi_call_bb = bi_call_in_edge0->dest;
-  join_tgt_in_edge_fall_thru = make_edge (guard_bb0, join_tgt_bb,
+  join_tgt_in_edge_fall_thru = make_edge (guard_bb, join_tgt_bb,
   EDGE_FALSE_VALUE);
 
   bi_call_in_edge0->probability = REG_BR_PROB_BASE * ERR_PROB;
   bi_call_in_edge0->count =
-  apply_probability (guard_bb0->count,
+  apply_probability (guard_bb->count,
 bi_call_in_edge0->probability);
   join_tgt_in_edge_fall_thru->probability =
   inverse_probability (bi_call_in_edge0->probability);
   join_tgt_in_edge_fall_thru->count =
-  guard_bb0->count - bi_call_in_edge0->count;
+  guard_bb->count - bi_call_in_edge0->count;
 
   /* Code generation for the rest of the conditions  */
-  guard_bb = guard_bb0;
   while (nconds > 0)
 {
   unsigned ci0;



Re: Fix X - (X / Y) * Y in match.pd

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 4:06 PM, Marc Glisse  wrote:
> Hello,
>
> before we completely forget about it, this fixes the pattern as discussed
> around https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01459.html .
>
> Note that, as seen in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68008#c1
> , we would sometimes like to allow a conversion between trunc_div and mult,
> but that's a separate generalization and I did not want to think of the
> proper conditions for its validity.
>
> Bootstrap+regtest on ppc64le-redhat-linux.

Ok, can you add a comment why we can't use matching captures before
the operand_equal_p?  Just in case somebody stumbles across this...

Richard.

> 2015-10-30  Marc Glisse  
>
> * match.pd (X-(X/Y)*Y): Properly handle conversions and
> commutativity.
>
> --
> Marc Glisse
> Index: gcc/match.pd
> ===
> --- gcc/match.pd(revision 229478)
> +++ gcc/match.pd(working copy)
> @@ -311,24 +311,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* X % -Y is the same as X % Y.  */
>  (simplify
>   (trunc_mod @0 (convert? (negate @1)))
>   (if (!TYPE_UNSIGNED (type)
>&& !TYPE_OVERFLOW_TRAPS (type)
>&& tree_nop_conversion_p (type, TREE_TYPE (@1)))
>(trunc_mod @0 (convert @1
>
>  /* X - (X / Y) * Y is the same as X % Y.  */
>  (simplify
> - (minus (convert1? @0) (convert2? (mult (trunc_div @0 @1) @1)))
> - (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
> -  && TYPE_UNSIGNED (TREE_TYPE (@0)) == TYPE_UNSIGNED (type))
> -  (trunc_mod (convert @0) (convert @1
> + (minus (convert1? @2) (convert2? (mult:c (trunc_div @0 @1) @1)))
> + (if (operand_equal_p (@0, @2, 0)
> +  && (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)))
> +  (convert (trunc_mod @0 @1
>
>  /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR,
> i.e. "X % C" into "X & (C - 1)", if X and C are positive.
> Also optimize A % (C << N)  where C is a power of 2,
> to A & ((C << N) - 1).  */
>  (match (power_of_two_cand @1)
>   INTEGER_CST@1)
>  (match (power_of_two_cand @1)
>   (lshift INTEGER_CST@1 @2))
>  (for mod (trunc_mod floor_mod)
>


Re: Require c99_runtime for builtin-convert-1.c

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 4:12 PM, Richard Sandiford
 wrote:
> ...which was failing on *-elf.  This test is making sure that we narrow
> maths calls, e.g. sinh -> sinhf, but we can only do that if we know that
> the target has a full c99 libm or if we have other proof that the function
> we want to generate is available.
>
> Tested on aarch64-none-elf (where it's skipped) and aarch64-linux-gnu
> (where it isn't).  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/testsuite/
> * gcc.dg/torture/builtin-convert-1.c: Require c99_runtime.
>
> diff --git a/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c 
> b/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c
> index f13d29e..eecd2b3 100644
> --- a/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c
> +++ b/gcc/testsuite/gcc.dg/torture/builtin-convert-1.c
> @@ -8,6 +8,7 @@
>  /* { dg-do link } */
>  /* { dg-options "-ffast-math" } */
>  /* { dg-add-options c99_runtime } */
> +/* { dg-require-effective-target c99_runtime } */
>
>  #include "../builtins-config.h"
>
>


Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-29 Thread Richard Biener
On Thu, Oct 29, 2015 at 4:02 PM, Jan Hubicka  wrote:
>>
>> IMHO it was always wrong/fragile for backends to look at the actual 
>> arguments to
>> decide on the calling convention.  The backends should _solely_ rely on
>> gimple_call_fntype and its TYPE_ARG_TYPES here.
>>
>> Of course then there are varargs ... (not sure if we hit this here).
>
> Yep, you have varargs and K&R prototypes, so it can't work this way.

Well, then I suppose we need to compute the ABI upfront when we gimplify
from the orginal args (like we preserve fntype).  Having a separate fntype
was really meant to make us preserve the ABI throughout the GIMPLE phase...

>>
>> But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't remember
>> what exactly we gain from it (when not done on registers).
>
> I guess gain is really limited to Ada - there are very few cases we do VCE 
> otherwise.
> (I think we could do more of them).  We can make useless_type_conversion 
> NOP/CONVERT
> only. That in fact makes quite a sense because those are types with gimple 
> operations
> on it.  Perhaps also VCE on vectors, but not VCE in general.
>
> Honza
>>
>> But I also don't see where we do the stripping mentioned on memory 
>> references.
>> The match.pd pattern doesn't apply to memory, only in the GENERIC path
>> which is guarded with exact type equality.  So I can't see where we end up
>> stripping the V_C_E.
>>
>> There is one bogus case still in fold-const.c:
>>
>> case VIEW_CONVERT_EXPR:
>>   if (TREE_CODE (op0) == MEM_REF)
>> /* ???  Bogus for aligned types.  */
>> return fold_build2_loc (loc, MEM_REF, type,
>> TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 
>> 1));
>>
>>   return NULL_TREE;
>>
>> that comment is only in my local tree ... (we lose alignment info that is
>> on the original MEM_REF type which may be a smaller one).
>>
>> Richard.
>>
>> > Honza
>> >>
>> >>
>> >>   * gnat.dg/discr44.adb: New test.
>> >>
>> >> --
>> >> Eric Botcazou
>> >
>> >> -- { dg-do run }
>> >> -- { dg-options "-gnatws" }
>> >>
>> >> procedure Discr44 is
>> >>
>> >>   function Ident (I : Integer) return Integer is
>> >>   begin
>> >> return I;
>> >>   end;
>> >>
>> >>   type Int is range 1 .. 10;
>> >>
>> >>   type Str is array (Int range <>) of Character;
>> >>
>> >>   type Parent (D1, D2 : Int; B : Boolean) is record
>> >> S : Str (D1 .. D2);
>> >>   end record;
>> >>
>> >>   type Derived (D : Int) is new Parent (D1 => D, D2 => D, B => False);
>> >>
>> >>   X1 : Derived (D => Int (Ident (7)));
>> >>
>> >> begin
>> >>   if X1.D /= 7 then
>> >> raise Program_Error;
>> >>   end if;
>> >> end;
>> >


Re: Robustify REAL_MODE_FORMAT

2015-10-29 Thread Bernd Schmidt

On 10/29/2015 04:30 PM, Richard Sandiford wrote:

Make sure that REAL_MODE_FORMAT aborts if it is passed an invalid mode,
rather than stepping beyond the bounds of an array.  It turned out that
some code was passing non-float modes to the real.h routines.



gcc/
* real.h (REAL_MODE_FORMAT): Abort if the mode isn't a
SCALAR_FLOAT_MODE_P.


I'm assuming that the code you mention has already been fixed so that we 
don't trigger the abort. Ok.



Bernd


Fix real_2expN mode arguments in fixed-value.c

2015-10-29 Thread Richard Sandiford
fixed-value.c was passing a fixed-point mode to the floating-point
real_2expN routine.  That didn't cause a problem in practice because
all real_2expN did was check for decimal float modes, but it triggered
a failure with an upcoming patch.

Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
OK to install?

Thanks,
Richard


gcc/
* fixed-value.c (check_real_for_fixed_mode, fixed_from_string)
(fixed_to_decimal, fixed_convert_from_real)
(real_convert_from_fixed): Fix mode arguments to real_2expN.

diff --git a/gcc/fixed-value.c b/gcc/fixed-value.c
index 39bdebe..a38045f 100644
--- a/gcc/fixed-value.c
+++ b/gcc/fixed-value.c
@@ -64,8 +64,8 @@ check_real_for_fixed_mode (REAL_VALUE_TYPE *real_value, 
machine_mode mode)
 {
   REAL_VALUE_TYPE max_value, min_value, epsilon_value;
 
-  real_2expN (&max_value, GET_MODE_IBIT (mode), mode);
-  real_2expN (&epsilon_value, -GET_MODE_FBIT (mode), mode);
+  real_2expN (&max_value, GET_MODE_IBIT (mode), VOIDmode);
+  real_2expN (&epsilon_value, -GET_MODE_FBIT (mode), VOIDmode);
 
   if (SIGNED_FIXED_POINT_MODE_P (mode))
 min_value = real_value_negate (&max_value);
@@ -127,7 +127,7 @@ fixed_from_string (FIXED_VALUE_TYPE *f, const char *str, 
machine_mode mode)
   || (temp == FIXED_MAX_EPS && ALL_ACCUM_MODE_P (f->mode)))
 warning (OPT_Woverflow,
 "large fixed-point constant implicitly truncated to fixed-point 
type");
-  real_2expN (&base_value, fbit, mode);
+  real_2expN (&base_value, fbit, VOIDmode);
   real_arithmetic (&fixed_value, MULT_EXPR, &real_value, &base_value);
   wide_int w = real_to_integer (&fixed_value, &fail,
GET_MODE_PRECISION (mode));
@@ -158,7 +158,7 @@ fixed_to_decimal (char *str, const FIXED_VALUE_TYPE *f_orig,
   REAL_VALUE_TYPE real_value, base_value, fixed_value;
 
   signop sgn = UNSIGNED_FIXED_POINT_MODE_P (f_orig->mode) ? UNSIGNED : SIGNED;
-  real_2expN (&base_value, GET_MODE_FBIT (f_orig->mode), f_orig->mode);
+  real_2expN (&base_value, GET_MODE_FBIT (f_orig->mode), VOIDmode);
   real_from_integer (&real_value, VOIDmode,
 wide_int::from (f_orig->data,
 GET_MODE_PRECISION (f_orig->mode), sgn),
@@ -1052,7 +1052,7 @@ fixed_convert_from_real (FIXED_VALUE_TYPE *f, 
machine_mode mode,
 
   real_value = *a;
   f->mode = mode;
-  real_2expN (&base_value, fbit, mode);
+  real_2expN (&base_value, fbit, VOIDmode);
   real_arithmetic (&fixed_value, MULT_EXPR, &real_value, &base_value);
 
   wide_int w = real_to_integer (&fixed_value, &fail,
@@ -1104,7 +1104,7 @@ real_convert_from_fixed (REAL_VALUE_TYPE *r, machine_mode 
mode,
   REAL_VALUE_TYPE base_value, fixed_value, real_value;
 
   signop sgn = UNSIGNED_FIXED_POINT_MODE_P (f->mode) ? UNSIGNED : SIGNED;
-  real_2expN (&base_value, GET_MODE_FBIT (f->mode), f->mode);
+  real_2expN (&base_value, GET_MODE_FBIT (f->mode), VOIDmode);
   real_from_integer (&fixed_value, VOIDmode,
 wide_int::from (f->data, GET_MODE_PRECISION (f->mode),
 sgn), sgn);



Robustify REAL_MODE_FORMAT

2015-10-29 Thread Richard Sandiford
Make sure that REAL_MODE_FORMAT aborts if it is passed an invalid mode,
rather than stepping beyond the bounds of an array.  It turned out that
some code was passing non-float modes to the real.h routines.

Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
OK to install?

Thanks,
Richard


gcc/
* real.h (REAL_MODE_FORMAT): Abort if the mode isn't a
SCALAR_FLOAT_MODE_P.

diff --git a/gcc/real.h b/gcc/real.h
index e65b526..d3b14e5 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -167,7 +167,9 @@ extern const struct real_format *
   (real_format_for_mode[DECIMAL_FLOAT_MODE_P (MODE)\
? (((MODE) - MIN_MODE_DECIMAL_FLOAT)\
   + (MAX_MODE_FLOAT - MIN_MODE_FLOAT + 1)) \
-   : ((MODE) - MIN_MODE_FLOAT)])
+   : GET_MODE_CLASS (MODE) == MODE_FLOAT   \
+   ? ((MODE) - MIN_MODE_FLOAT) \
+   : (gcc_unreachable (), 0)])
 
 #define FLOAT_MODE_FORMAT(MODE) \
   (REAL_MODE_FORMAT (SCALAR_FLOAT_MODE_P (MODE)? (MODE) \



Re: [PING][PATCH][4.9]Backport "Fix register corruption bug in ree"

2015-10-29 Thread Ramana Radhakrishnan
On Thu, Jun 4, 2015 at 2:16 PM, Renlin Li  wrote:
> Ping ~
>
> Can somebody review it? Thank you!
>
> Regards,
> Renlin Li
>
>
> On 16/04/15 10:32, Renlin Li wrote:
>>
>> Ping~
>>
>> Regards,
>> Renlin Li
>>
>> On 16/04/15 10:09, wrote:
>>>
>>> Ping~
>>> Anybody has time to review it?
>>>
>>>
>>> Regards,
>>> Renlin Li
>>>
>>> On 06/02/15 17:48, Renlin Li wrote:

 Hi all,

 This is a backport patch for branch 4.9. You can find the original=20
 patch here:https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00356.html
 And it has been commit on the trunk as r215205.

 This fixes a few libstdc++-v3 test suite failures.
 x86_64 bootstraps Okay, aarch64_be-none-elf libstdc++-v3 tested Okay.

 Okay to commit on branch 4.9?

 Regards,
 Renlin Li

 2015-02-06  Renlin Li

  Backport from mainline
  2014-09-12  Wilco Dijkstra

  * gcc/ree.c (combine_reaching_defs): Ensure inserted copy don't=20
 change
  the number of hard registers.

>

richi - an RM question -

Is this something that can be pulled back to GCC 4.9 branch assuming
testing still shows no regressions  - it breaks aarch64 be on GCC 4.9
...


regards
Ramana


Re: Fix real_2expN mode arguments in fixed-value.c

2015-10-29 Thread Bernd Schmidt

On 10/29/2015 04:28 PM, Richard Sandiford wrote:

* fixed-value.c (check_real_for_fixed_mode, fixed_from_string)
(fixed_to_decimal, fixed_convert_from_real)
(real_convert_from_fixed): Fix mode arguments to real_2expN.


Passing VOIDmode rather than the real mode is a bit of a strange fix. 
Why can't the called function deal with the proper mode? (real_2expN 
also doesn't document the mode argument which should be fixed).



Bernd


[libgomp] task scheduler rewrite and task priorities implementation

2015-10-29 Thread Aldy Hernandez

Yo!

As promised, here is the work implementing a new API for the task 
scheduler, rewriting the scheduler to fit into this new API, and 
implementing the task priorities that are in OpenMP > 4.1.  There are 
also lots of cleanups and documentation.


For the record, task priorities allow a priority clause for tasks such 
that higher priority tasks are preferred.


#pragma omp task priority(999)

First, this patchset is pretty invasive.  My original idea was to just 
insert the tasks in order into the double linked lists, but as you and 
rth pointed out, this wouldn't scale well.  So, instead of a 20 line 
patch, you get 2900 :-).  But, you get a clean interface and lots and 
lots of cleanups.


There were a million ways of doing this, each with its own trade-off, 
but ultimately I had to pick one and go with it.  I've tried to keep the 
common case inlined and fast.  This is the case of only 0-priority items 
in the queues.  It should behave exactly like before, albeit with a 
comparison to see if we're the common case or not.


There are many optimizations we could do:
- Move the allocation of priority nodes outside of the lock.
- Create a cache of priority nodes instead of allocating each
  time.
- Thread the splay tree with additional links to make
  accessing parent, or max, or whatever threads even quicker.
- Move everything into task.c so we could inline everything.
  (I'd prefer not to, and keep things in priority_queue.[ch]).
- Move part of the next/prev priority node links into gomp_task
  (Not sure if this would work without making a mess of things).
- etc etc.

The list is endless.  We could micro optimize this to death.  If at all 
possible, could we concentrate on agreeing on the general 
implementation, making the common case fast, and perhaps 
micro-optimizing as followups?


The only FIXME I have is your suggested use of offsetof() for the 
gomp_task pointer (data) in priority_node.  I really don't see anyway of 
doing this.  Suggestions welcome.


Everything is working.  Tested with no regressions on a 56-core x86-64 
Linux machine with:


OMP_NUM_THREADS=[2 4 5 16 56]

Committed as obvious.  I'm going on vacation.

Aldy

p.s. Just kidding ;-).
commit 5c71901726caa78940864c0a678c41e43fb9fa79
Author: Aldy Hernandez 
Date:   Fri Oct 23 09:04:09 2015 -0700

Priority queues implementation for libgomp tasks.

* Makefile.am (libgomp_la_SOURCES): Add priority_queue.c.
* Makefile.in: Regenerate.
* libgomp.h: Shuffle prototypes and forward definitions around so
priority queues can be defined.
(struct gomp_task): Remove children, next_child, prev_child,
next_queue, prev_queue, next_taskgroup, prev_taskgroup.
Rename num_dependees to num_dependencies.
Add children_queue, task_queue_entry, children_queue_entry,
taskgroup_queue_entry.
(struct gomp_taskgroup): Remove children.
Add taskgroup_queue.
(struct gomp_team): Change task_queue type to a priority queue.
(splay_compare): Define inline.
* oacc-mem.c: Do not include splay-tree.h.
* priority_queue.c: New file.
* priority_queue.h: New file.
* splay-tree.c: Do not include splay-tree.h.
(splay_tree_foreach_internal): New.
(splay_tree_foreach): New.
* splay-tree.h: Become re-entrant if splay_tree_prefix is defined.
(splay_tree_callback): Define typedef.
* target.c (splay_compare): Move to libgomp.h.
* task.c (gomp_init_task): Use memset.
(gomp_clear_parent): Rewrite to work as a callback.
(gomp_task_handle_depend): Rename num_dependees to
num_dependencies.
(GOMP_task): Handle priorities.
(gomp_create_target_task): Use priority queues.
(verify_children_queue): Remove.
(priority_list_upgrade_task): New.
(priority_queue_upgrade_task): New.
(verify_task_queue): Remove.
(priority_list_downgrade_task): New.
(priority_queue_downgrade_task): New.
(gomp_task_run_pre): Use priority queues.
Abstract code out to priority_queue_downgrade_task.
(gomp_task_run_post_handle_dependers): Use priority queues.
(gomp_task_run_post_remove_parent): Same.
(gomp_task_run_post_remove_taskgroup): Same.
(gomp_barrier_handle_tasks): Same.
(GOMP_taskwait): Same.
(gomp_task_maybe_wait_for_dependencies): Same.  Abstract code to
priority-queue_upgrade_task.
(GOMP_taskgroup_start): Use priority queues.
(GOMP_taskgroup_end): Same.
* taskloop.c (GOMP_taskloop): Handle priorities.
* team.c (gomp_new_team): Call priority_queue_init.
(free_team): Call priority_queue_free.
* testsuite/libgomp.c/priority.c: New test.

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 5411278..a3e1c2b 100644
--- a/

Re: Add VIEW_CONVERT_EXPR to operand_equal_p

2015-10-29 Thread Jan Hubicka
> On Thu, Oct 29, 2015 at 4:02 PM, Jan Hubicka  wrote:
> >>
> >> IMHO it was always wrong/fragile for backends to look at the actual 
> >> arguments to
> >> decide on the calling convention.  The backends should _solely_ rely on
> >> gimple_call_fntype and its TYPE_ARG_TYPES here.
> >>
> >> Of course then there are varargs ... (not sure if we hit this here).
> >
> > Yep, you have varargs and K&R prototypes, so it can't work this way.
> 
> Well, then I suppose we need to compute the ABI upfront when we gimplify
> from the orginal args (like we preserve fntype).  Having a separate fntype
> was really meant to make us preserve the ABI throughout the GIMPLE phase...

Hmm, the idea of doing some part of ABI explicitly is definitly nice (at least
the implicit promotions and pass by reference bits), but storing the full
lowlevel info on how to pass argument seems bit steep. You will need to
preserve the RTL containers for parameters that may get non-trivial (PARALLEL)
and precompute all the other information how to get data on stack. 

While playing with the ABi checker I was just looking into this after several
years (when i was cleaning up calls.c) and calls.c basically works by computing
arg_data that holds most of the info you would need (you need also return
argument passing and the hidden argument for structure returns).  You can check
it out - it is fairly non-trivial beast plus it really holds two parallel sets
of infos - tailcall and normal call (because these differ for targets with
register windows). The info also depends on flags used to compile function body
(such as -maccumulate-outgoing-args)

To make something like this a permanent part of GIMPLE would probably need quite
careful re-engineering of the APIs inventing more high-level intermediate
representation to get out of the machine description.  There is not realy 
immediate
benefit from knowing how parameters are housed on stack for gimple optimizers, 
so
perhaps just keeping the type information (after promotions) as the way to 
specify
call conventions is more practical way to go.

Honza

> >> But yes, the VIEW_CONVERT "stripping" is a bit fragile and I don't remember
> >> what exactly we gain from it (when not done on registers).
> >
> > I guess gain is really limited to Ada - there are very few cases we do VCE 
> > otherwise.
> > (I think we could do more of them).  We can make useless_type_conversion 
> > NOP/CONVERT
> > only. That in fact makes quite a sense because those are types with gimple 
> > operations
> > on it.  Perhaps also VCE on vectors, but not VCE in general.
> >
> > Honza
> >>
> >> But I also don't see where we do the stripping mentioned on memory 
> >> references.
> >> The match.pd pattern doesn't apply to memory, only in the GENERIC path
> >> which is guarded with exact type equality.  So I can't see where we end up
> >> stripping the V_C_E.
> >>
> >> There is one bogus case still in fold-const.c:
> >>
> >> case VIEW_CONVERT_EXPR:
> >>   if (TREE_CODE (op0) == MEM_REF)
> >> /* ???  Bogus for aligned types.  */
> >> return fold_build2_loc (loc, MEM_REF, type,
> >> TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 
> >> 1));
> >>
> >>   return NULL_TREE;
> >>
> >> that comment is only in my local tree ... (we lose alignment info that is
> >> on the original MEM_REF type which may be a smaller one).
> >>
> >> Richard.
> >>
> >> > Honza
> >> >>
> >> >>
> >> >>   * gnat.dg/discr44.adb: New test.
> >> >>
> >> >> --
> >> >> Eric Botcazou
> >> >
> >> >> -- { dg-do run }
> >> >> -- { dg-options "-gnatws" }
> >> >>
> >> >> procedure Discr44 is
> >> >>
> >> >>   function Ident (I : Integer) return Integer is
> >> >>   begin
> >> >> return I;
> >> >>   end;
> >> >>
> >> >>   type Int is range 1 .. 10;
> >> >>
> >> >>   type Str is array (Int range <>) of Character;
> >> >>
> >> >>   type Parent (D1, D2 : Int; B : Boolean) is record
> >> >> S : Str (D1 .. D2);
> >> >>   end record;
> >> >>
> >> >>   type Derived (D : Int) is new Parent (D1 => D, D2 => D, B => False);
> >> >>
> >> >>   X1 : Derived (D => Int (Ident (7)));
> >> >>
> >> >> begin
> >> >>   if X1.D /= 7 then
> >> >> raise Program_Error;
> >> >>   end if;
> >> >> end;
> >> >


Re: [PATCH][auto-inc-dec.c] Account for cost of move operation in FORM_PRE_ADD and FORM_POST_ADD cases

2015-10-29 Thread Kyrill Tkachov


On 28/10/15 17:45, Kyrill Tkachov wrote:


On 28/10/15 17:23, Christophe Lyon wrote:

Hi Kyrill,


Hi Christophe,



On 26 October 2015 at 12:52, Kyrill Tkachov  wrote:

On 26/10/15 11:28, Bernd Schmidt wrote:

On 10/26/2015 12:12 PM, Bernd Schmidt wrote:


But isn't that balanced by the fact that it doesn't seem to take into
account the gain of removing the inc_insn either? So I think this can't
be right.


Argh, misread the code. The patch is OK with the change I suggested.


Thanks!
Here's what I committed with r229344.


Since this commit, I've noticed:

FAIL: gcc.target/arm/lp1243022.c scan-rtl-dump subreg2 "REG_INC"
with --target arm-none-eabi --with-thumb --with-cpu=cortex-a9


I think this is a testcase issue, I'll look into updating it separately.


as well as ICEs:
   gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2  (internal compiler error)
   gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (internal compiler error)

with --target arm-none-linux-gnueabihf --with-thumb
--with-cpu=cortex-a15 --with-fpu=neon-vfpv4
and --target arm-none-linux-gnueabihf --with-thumb
--with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

See for a more synthetic view:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/229344/report-build-info.html

Can you have a look?


The ICE backtrace is:
0x7a4494 change_address_1
$SRC/gcc/emit-rtl.c:2132
0x7a7453 adjust_address_1(rtx_def*, machine_mode, long, int, int, int, long)
$SRC/gcc/emit-rtl.c:2270
0xab792d gen_lowpart_general(machine_mode, rtx_def*)
$SRC/gcc/rtlhooks.c:90
0xfd721d gen_split_47(rtx_insn*, rtx_def**)
$SRC/gcc/config/arm/arm.md:4336
0x12473f2 split_11
$SRC/gcc/config/arm/arm.md:4331
0x12473f2 split_insns(rtx_def*, rtx_insn*)
$SRC/gcc/config/arm/sync.md:361
0x7af30e try_split(rtx_def*, rtx_insn*, int)
$SRC/gcc/emit-rtl.c:3664
0xa53dc5 split_insn
$SRC/gcc/recog.c:2874
0xa5ccf5 split_all_insns()
$SRC/gcc/recog.c:2964
0xa5cde3 rest_of_handle_split_after_reload
$SRC/gcc/recog.c:3900
0xa5cde3 execute
$SRC/gcc/recog.c:3929


Looks like a latent issue exposed by my patch.
Could you please file a BZ entry if get the chance?



I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68149 for this one.

Kyrill


Thanks,
Kyrill



Thanks,

Christophe.


Kyrill

2015-10-26  Kyrylo Tkachov  

 * auto-inc-dec.c (insert_move_insn_before): Delete.
 (attempt_change): Remember to cost the simple move in the
 FORM_PRE_ADD and FORM_POST_ADD cases.


Bernd







[Docs] Reword the documentation for -fdump-rtl-

2015-10-29 Thread James Greenhalgh

Hi,

The text for -fdump-rtl- is a little misleading about how much you can
derive about pass ordering from the numbers on the dumps. This patch tries
to update the text to clarify that (most of the time) you can quite happily
derive execution order from these pass numbers.

While I'm there I've fixed some missing two-space-after-period issues.

Checked that documentation still builds and looks OK in info.

OK for trunk?

Thanks,
James

---
2015-10-29  James Greenhalgh  

* doc/invoke.texi (fdump-rtl-@var{pass}): Clarify relationship
between pass numbering and execution order.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 76fdc31..148f063 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6665,17 +6665,19 @@ Says to make debugging dumps during compilation at times specified by
 @var{letters}.  This is used for debugging the RTL-based passes of the
 compiler.  The file names for most of the dumps are made by appending
 a pass number and a word to the @var{dumpname}, and the files are
-created in the directory of the output file. In case of
+created in the directory of the output file.  In case of
 @option{=@var{filename}} option, the dump is output on the given file
-instead of the pass numbered dump files. Note that the pass number is
-computed statically as passes get registered into the pass manager.
-Thus the numbering is not related to the dynamic order of execution of
-passes.  In particular, a pass installed by a plugin could have a
-number over 200 even if it executed quite early.  @var{dumpname} is
-generated from the name of the output file, if explicitly specified
-and it is not an executable, otherwise it is the basename of the
-source file. These switches may have different effects when
-@option{-E} is used for preprocessing.
+instead of the pass numbered dump files.  Note that the pass number is
+assigned as passes are registered into the pass manager.  Most passes
+will be registered in the order that they will execute and so often the
+number will correspond to the pass execution order.  However, passes
+registered by plugins, which are specific to compilation targets, or
+which are otherwise registered after all the other passes, will be
+numbered higher than a pass named "final", even if they were executed
+earlier.  @var{dumpname} is generated from the name of the output
+file, if explicitly specified and it is not an executable, otherwise it
+is the basename of the source file.  These switches may have different
+effects when @option{-E} is used for preprocessing.
 
 Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
 @option{-d} option @var{letters}.  Here are the possible


Allow real_format to be passed to more real.h functions

2015-10-29 Thread Richard Sandiford
Most real.h routines used machine modes to specify the format of
an operation and converted that to a float_format * internally.
Some also had alternative versions that accepted a float_format *.

In an upcoming patch it seemed more convenient for the callers
I was adding to use float_format directly, since the callers need
to examine the format themselves for other reasons.  This patch
therefore replaces the machine_mode arguments with a new class that
allows both machine modes and float_format pointers to be used.

Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
OK to install?

Thanks,
Richard


gcc/
* real.h (format_helper): New.
(real_convert, exact_real_truncate, real_from_string3, real_to_target)
(real_from_target, real_nan, real_2expN, real_value_truncate)
(significand_size, real_from_string2, exact_real_inverse)
(exact_real_inverse, real_powi, real_trunc, real_floor, real_ceil)
(real_round, real_isinteger, real_from_integer): Replace
machine_mode arguments with format_helper arguments.
* real.c (exact_real_inverse, real_from_string2, real_from_string3)
(real_from_integer, real_nan, real_2expN, real_convert)
(real_value_truncate, exact_real_truncate, real_to_target)
(real_from_target, significand_size, real_powi, real_trunc)
(real_floor, real_ceil, real_round, real_isinteger): Replace
machine_mode arguments with format_helper arguments.
(real_to_target_fmt, real_from_target_fmt): Delete.
* dfp.h (decimal_real_convert): Replace mode argument with real_format.
* dfp.c (decimal_to_binary, decimal_real_convert): Replace mode
argument with real_format.
* builtins.c (do_real_to_int_conversion): Update type of fn argument.

gcc/java/
* jcf-parse.c (get_constant): Use real_from_target rather than
real_from_target_fmt.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 248c009..583a68e 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7275,7 +7275,7 @@ fold_builtin_strlen (location_t loc, tree type, tree arg)
 
 static tree
 do_real_to_int_conversion (tree itype, tree arg,
-  void (*fn) (REAL_VALUE_TYPE *, machine_mode,
+  void (*fn) (REAL_VALUE_TYPE *, format_helper,
   const REAL_VALUE_TYPE *))
 {
   if (TREE_CODE (arg) != REAL_CST || TREE_OVERFLOW (arg))
diff --git a/gcc/dfp.c b/gcc/dfp.c
index ceb43d1..f3a3b6f 100644
--- a/gcc/dfp.c
+++ b/gcc/dfp.c
@@ -343,13 +343,13 @@ decode_decimal128 (const struct real_format *fmt 
ATTRIBUTE_UNUSED,
 
 static void
 decimal_to_binary (REAL_VALUE_TYPE *to, const REAL_VALUE_TYPE *from,
-  machine_mode mode)
+  const real_format *fmt)
 {
   char string[256];
   const decimal128 *const d128 = (const decimal128 *) from->sig;
 
   decimal128ToString (d128, string);
-  real_from_string3 (to, string, mode);
+  real_from_string3 (to, string, fmt);
 }
 
 
@@ -459,15 +459,13 @@ decimal_round_for_format (const struct real_format *fmt, 
REAL_VALUE_TYPE *r)
binary and decimal types.  */
 
 void
-decimal_real_convert (REAL_VALUE_TYPE *r, machine_mode mode,
+decimal_real_convert (REAL_VALUE_TYPE *r, const real_format *fmt,
  const REAL_VALUE_TYPE *a)
 {
-  const struct real_format *fmt = REAL_MODE_FORMAT (mode);
-
   if (a->decimal && fmt->b == 10)
 return;
   if (a->decimal)
-  decimal_to_binary (r, a, mode);
+  decimal_to_binary (r, a, fmt);
   else
   decimal_from_binary (r, a);
 }
diff --git a/gcc/dfp.h b/gcc/dfp.h
index 013de8b..a3653c9 100644
--- a/gcc/dfp.h
+++ b/gcc/dfp.h
@@ -34,7 +34,8 @@ void encode_decimal128 (const struct real_format *fmt, long 
*, const REAL_VALUE_
 int  decimal_do_compare (const REAL_VALUE_TYPE *, const REAL_VALUE_TYPE *, 
int);
 void decimal_real_from_string (REAL_VALUE_TYPE *, const char *);
 void decimal_round_for_format (const struct real_format *, REAL_VALUE_TYPE *);
-void decimal_real_convert (REAL_VALUE_TYPE *, machine_mode, const 
REAL_VALUE_TYPE *);
+void decimal_real_convert (REAL_VALUE_TYPE *, const real_format *,
+  const REAL_VALUE_TYPE *);
 void decimal_real_to_decimal (char *, const REAL_VALUE_TYPE *, size_t, size_t, 
int);
 void decimal_do_fix_trunc (REAL_VALUE_TYPE *, const REAL_VALUE_TYPE *);
 void decimal_real_maxval (REAL_VALUE_TYPE *, int, machine_mode);
diff --git a/gcc/java/jcf-parse.c b/gcc/java/jcf-parse.c
index bb6e743..88f1a06 100644
--- a/gcc/java/jcf-parse.c
+++ b/gcc/java/jcf-parse.c
@@ -1061,7 +1061,7 @@ get_constant (JCF *jcf, int index)
long buf = num;
REAL_VALUE_TYPE d;
 
-   real_from_target_fmt (&d, &buf, &ieee_single_format);
+   real_from_target (&d, &buf, &ieee_single_format);
value = build_real (float_type_node, d);
break;
   }
@@ -1079,7 +1079,7 @@ get_constant (JCF *jcf, int index)
else
  buf[0]

[PATCH, trivial] Cleanup fipa-pta constraint dumping

2015-10-29 Thread Tom de Vries

Hi,

Consider testcase:
...
int __attribute__((noinline, noclone))
foo (int *__restrict__ a, int *__restrict__ b)
{
  *a = 1;
  *b = 2;
}

int __attribute__((noinline, noclone))
bar (int *a, int *b)
{
  foo (a, b);
}
...

Using this patch we have this diff in the pta dumpfile:
...
@@ -24,7 +24,7 @@
   Called by: bar/1 (1.00 per call)
   Calls:

-Generating constraints for global initializers
+Generating generic constraints

 ANYTHING = &ANYTHING
 ESCAPED = *ESCAPED
...

I'll commit to trunk if bootstrap and reg-test succeeds, unless there 
are objections.


Thanks,
- Tom
Cleanup fipa-pta constraint dumping

2015-10-29  Tom de Vries  

	* tree-ssa-structalias.c (ipa_pta_execute): Declare variable from as
	unsigned.  Add generic constraints dumping section.  Don't dump
	global initializers constraints dump section if empty.  Don't update
	if unused.
---
 gcc/tree-ssa-structalias.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 4f7a788..3bef607 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -7283,7 +7283,7 @@ ipa_pta_execute (void)
 {
   struct cgraph_node *node;
   varpool_node *var;
-  int from;
+  unsigned int from = 0;
 
   in_ipa_mode = 1;
 
@@ -7295,6 +7295,15 @@ ipa_pta_execute (void)
   fprintf (dump_file, "\n");
 }
 
+  if (dump_file)
+{
+  fprintf (dump_file, "Generating generic constraints\n\n");
+  dump_constraints (dump_file, from);
+  fprintf (dump_file, "\n");
+
+  from = constraints.length ();
+}
+
   /* Build the constraints.  */
   FOR_EACH_DEFINED_FUNCTION (node)
 {
@@ -7323,14 +7332,15 @@ ipa_pta_execute (void)
   get_vi_for_tree (var->decl);
 }
 
-  if (dump_file)
+  if (dump_file
+  && from != constraints.length ())
 {
   fprintf (dump_file,
 	   "Generating constraints for global initializers\n\n");
-  dump_constraints (dump_file, 0);
+  dump_constraints (dump_file, from);
   fprintf (dump_file, "\n");
+  from = constraints.length ();
 }
-  from = constraints.length ();
 
   FOR_EACH_DEFINED_FUNCTION (node)
 {
@@ -7415,8 +7425,8 @@ ipa_pta_execute (void)
 	  fprintf (dump_file, "\n");
 	  dump_constraints (dump_file, from);
 	  fprintf (dump_file, "\n");
+	  from = constraints.length ();
 	}
-  from = constraints.length ();
 }
 
   /* From the constraints compute the points-to sets.  */
-- 
1.9.1



Re: Fix real_2expN mode arguments in fixed-value.c

2015-10-29 Thread Richard Sandiford
Bernd Schmidt  writes:
> On 10/29/2015 04:28 PM, Richard Sandiford wrote:
>>  * fixed-value.c (check_real_for_fixed_mode, fixed_from_string)
>>  (fixed_to_decimal, fixed_convert_from_real)
>>  (real_convert_from_fixed): Fix mode arguments to real_2expN.
>
> Passing VOIDmode rather than the real mode is a bit of a strange fix. 
> Why can't the called function deal with the proper mode? (real_2expN 
> also doesn't document the mode argument which should be fixed).

VOIDmode is the usual real.[hc] way of getting maximum precision,
which I think is the natural choice for these temporary real_values.
The mode only becomes significant when you want to round the result.

Thanks,
Richard



[PATCH][ARM] Fix checking RTL error in cortex_a9_sched_adjust_cost

2015-10-29 Thread Kyrill Tkachov

Hi all,

An arm-none-eabi build with RTL checking and --with-cpu=cortex-a9 fails because
cortex_a9_sched_adjust_cost tries to access the SET_DEST of a PARALLEL.
The correct thing to do is to call single_set on dep, which will return a 
simple SET
that we can take the SET_DEST of or NULL if there's more than one SET.

This patch does that.
The arm-none-eabi build passes.
Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2015-10-29  Kyrylo Tkachov  

* config/arm/arm.c (cortex_a9_sched_adjust_cost): Use reg_set_p to
check for dependencies.
commit bef3ba20485085d702b9a6c8683683db49b547f0
Author: Kyrylo Tkachov 
Date:   Tue Oct 27 18:09:29 2015 +

[ARM] Fix checking RTL error in cortex_a9_sched_adjust_cost

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 08a852d..452b23d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -11649,9 +11649,7 @@ cortex_a9_sched_adjust_cost (rtx_insn *insn, rtx link, rtx_insn *dep, int * cost
 		   case. However this gets modeled as an true
 		   dependency and hence all these checks.  */
 		if (REG_P (SET_DEST (PATTERN (insn)))
-			&& REG_P (SET_DEST (PATTERN (dep)))
-			&& reg_overlap_mentioned_p (SET_DEST (PATTERN (insn)),
-		SET_DEST (PATTERN (dep
+			&& reg_set_p (SET_DEST (PATTERN (insn)), dep))
 		  {
 			/* FMACS is a special case where the dependent
 			   instruction can be issued 3 cycles before


Re: [PATCH, 1/6] Simplify constraint handling

2015-10-29 Thread Tom de Vries

On 29/10/15 14:12, Richard Biener wrote:

On Thu, 29 Oct 2015, Tom de Vries wrote:


>On 29/10/15 12:13, Richard Biener wrote:

> >On Wed, 28 Oct 2015, Tom de Vries wrote:
> >

> > > >On 28/10/15 16:35, Richard Biener wrote:

> > > > > >On Tue, 27 Oct 2015, Tom de Vries wrote:
> > > > > >

> > > > > > > >On 27/10/15 13:24, Tom de Vries wrote:

> > > > > > > > > >Thinking it over a bit more, I realized the constraint

> > > > > >handling started

> > > > > > > > > >to be messy. I've reworked the patch series to simplify that

> > > > > >first.

> > > > > > > > > >
> > > > > > > > > >1Simplify constraint handling
> > > > > > > > > >2Rename make_restrict_var_constraints to
> > > > > > > > > >make_param_constraints
> > > > > > > > > >3Add recursion to make_param_constraints
> > > > > > > > > >4Add handle_param parameter to

> > > > > >create_variable_info_for_1

> > > > > > > > > >5Handle recursive restrict pointer in
> > > > > > > > > >create_variable_info_for_1
> > > > > > > > > >6Handle restrict struct fields recursively
> > > > > > > > > >
> > > > > > > > > >Currently doing bootstrap and regtest on x86_64.
> > > > > > > > > >
> > > > > > > > > >I'll repost the patch series in reply to this message.

> > > > > > > >
> > > > > > > >This patch gets rid of this bit of code in

> > > > >intra_create_variable_infos:

> > > > > > > >...
> > > > > > > >if (restrict_pointer_p)
> > > > > > > > make_constraint_from_global_restrict (p, "PARM_RESTRICT");
> > > > > > > >else
> > > > > > > >..
> > > > > > > >
> > > > > > > >I already proposed to remove it here (
> > > > > > > >https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02426.html   ) but

> > > > >there is a

> > > > > > > >problem with that approach: It can happen that restrict_pointer_p

> > > > >is true,

> > > > > > > >but
> > > > > > > >p->only_restrict_pointers is false. This happens with fipa-pta,

> > > > >when

> > > > > > > >create_function_info_for created a varinfo for the parameter

> > > > >before

> > > > > > > >intra_create_variable_infos was called.
> > > > > > > >
> > > > > > > >The patch handles that case now by setting

> > > > >p->only_restrict_pointers.

> > > > > >
> > > > > >Hmm, but ... restrict only has an effect in non-IPA mode.

> > > >
> > > >Indeed, I also realized that by now.
> > > >
> > > >I wrote attached patch to make that explicit and simplify fipa-pta.
> > > >
> > > >OK for trunk if bootstrap and reg-test succeeds?

>
>First, there was an error in the patch, it tested for flag_ipa_pta (so it also
>affected ealias), but it was supposed to test for in_ipa mode. That is fixed
>in attached version.
>

> >I don't see the patch simplifies anything but only removes spurious
> >settings by adding IMHO redundant checks.

>
>Consider testcase:
>...
>int __attribute__((noinline, noclone))
>foo (int *__restrict__ a, int *__restrict__ b)
>{
>   *a = 1;
>   *b = 2;
>}
>
>int __attribute__((noinline, noclone))
>bar (int *a, int *b)
>{
>   foo (a, b);
>}
>...
>
>The impact of this patch in the pta dump (focusing on the constraints bit) is:
>...
>  Generating constraints for foo (foo)
>
>-foo.arg0 = &PARM_NOALIAS(20)
>-PARM_NOALIAS(20) = NONLOCAL
>-foo.arg1 = &PARM_NOALIAS(21)
>-PARM_NOALIAS(21) = NONLOCAL
>+foo.arg0 = &NONLOCAL
>+foo.arg1 = &NONLOCAL
>...
>
>That's the kind of simplifications I'm trying to achieve.

Yes, but as I said we should refactor things to avoid calling
the intra constraints generation from the IPA path.


Ah, I see.

So, like this? OK for trunk if bootstrap and reg-test succeeds?

Thanks,
- Tom

Add initial constraints in create_function_info_for

2015-10-29  Tom de Vries  

	* tree-ssa-structalias.c (ipa_pta_execute): Add extra arg to call to
	create_function_info_for.  Dump constraints generated during
	create_function_info_for. Move intra_create_variable_infos call and
	function-return-values-escape bit to ...
	(create_function_info_for): ... here, and merge
	intra_create_variable_infos call with argument loop.  Add and handle
	nonlocal_p parameter.
---
 gcc/tree-ssa-structalias.c | 94 --
 1 file changed, 58 insertions(+), 36 deletions(-)

diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 3bef607..3a0891c 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -5421,10 +5421,12 @@ count_num_arguments (tree decl, bool *is_varargs)
 }
 
 /* Creation function node for DECL, using NAME, and return the index
-   of the variable we've created for the function.  */
+   of the variable we've created for the function.  If NONLOCAL_p, create
+   initial constraints.  */
 
 static varinfo_t
-create_function_info_for (tree decl, const char *name, bool add_id)
+create_function_info_for (tree decl, const char *name, bool add_id,
+			  bool nonlocal_p)
 {
   struct function *fn = DECL_STRUCT_FUNCTION (decl);
   varinfo_t vi, prev_vi;
@@ -5536,6 +5538,28 @@ cre

Re: [PATCH 00/16] Unit tests framework (v3)

2015-10-29 Thread David Malcolm
On Wed, 2015-10-28 at 12:38 +0100, Bernd Schmidt wrote:
> On 10/27/2015 08:48 PM, David Malcolm wrote:
> > The following patch kit adds a unit tests framework for gcc,
> > as a new subdirectory below gcc/testsuite.
> 
> So, as a general comment I think this would be a very good thing to 
> have, and from a quick look through the tests they look pretty sensible.

Thanks.

> > Like previous versions of the patch kit it uses the Google Test
> > framework, since I've had good experiences with it:
> >http://code.google.com/p/googletest/
> > and like v2 of the kit it embeds a two-file copy of v1.7 of
> > the kit, to avoid adding extra dependencies (one .h file and one
> > .c file).
> 
> How much of this are you actually using? How much effort would be 
> involved in removing the extra prerequisite? Is it just the EXPECT_TRUE 
> etc. macros?

As well as the EXPECT_FOO macros, there's test registration, sometimes
done via TEST, sometimes via TEST_F to support fixtures.  I used
TYPED_TEST for type-parameterizing the wide-int tests (though, to be
fair, I didn't manage to get this properly working; see the comment in
test-wide-int.c).

There may be useful things in gtest for us that I'm not using yet.  For
example, the support for death tests may be useful for testing that e.g.
our checking macros kill the program sanely in the presence of malformed
internal data.

I wanted to reuse the gtest code on the grounds that:

  * the code exists

  * it's maintained (https://github.com/google/googletest
shows the last commit was 10 days ago)

  * it has documentation:
https://github.com/google/googletest/blob/master/googletest/docs/Primer.md
(there's much more that just that)

  * it will be familiar to some people from other projects


That said, if it's a blocker, I can attempt a minimal reimplementation
of the subset of the API that I'm using.


One wart in the v3 patchkit was that I used the default configuration
for gtest.  For example, this implicitly required threading support.
This could be suppressed by defining:

   #define GTEST_HAS_PTHREAD 0

before including gtest.h.

> > v1 of the kit was structured as a frontend, v2 of the kit as
> > a plugin that was built as if it were a frontend.  Both of
> > these approaches were problematic, so this version
> > of the patch kit simply builds as a test case within the
> > plugin.exp suites.
> 
> This is the part I'm least certain about, for two reasons:
>   * They are more like source files than tests, and I think I'd prefer
> to have them alongside the source, in gcc/ rather than in the
> testsuite. This way people are invariable going to fail to notice
> them when they grep for something.

So something like:

  gcc/test-vec.c

or even:

  gcc/vec-tests.c

so that it sorts next to gcc/vec.h in a listing?

i.e.   gcc/FOO-tests.c  for gcc/FOO.c

Maybe the unittests-plugin.c should live within the "gcc" directory?

>   * This uses a plugin into whatever compiler was built, but sometimes
> you can't really set up unit tests that way because what you want
> to test depends on target specifics. What I've often wanted is a
> special test target that gets built with a special machine
> description that has whatever patterns are needed to replicate
> tricky situations in reload or other optimization passes.
> 
> The tests you have so far are focused mostly on high-level gimple/tree 
> tests where this limitation is probably not showing up very much, but I 
> think it would be better to have something that allows us to have more 
> in-depth tests.

I think our long-term goal should be at least 5 styles of test:

(A) end-to-end tests of the compiler: running it on some source and
verifying properties of the diagnostics and/or generated code
(potentially running it).  This is what we have today.

(B) unit tests of individual subsystems: tests that exercise specific
internal APIs e.g. containers like vec<>, hash_map<> etc, also things
like gengtype.  This is a gap in our current test coverage, and what
this patch kit aims to start filling.

(C) and (D): tests of specific passes: tests that accept IR (either
gimple or RTL), and run just one pass, and verify properties of the IR
emitted by the pass, or messages emitted by the pass.  LLVM has a "-R"
remark option, so that a pass can issue remarks about what it's doing (a
new kind of diagnostic):

http://llvm.org/releases/3.5.0/tools/clang/docs/UsersManual.html#opt-rpass

We could implement this kind of testing by implementing gimple and RTL
frontends, and a -R option, which could integrate nicely with DejaGnu,
with
  /* { dg-remark "inlined here" } */
and the like.

This would require finishing the gimple FE, and writing an RTL FE
(independent goals).  Hopefully that would give us test coverage for
dealing with e.g. tricky reload situations.

(E) performance testing


The purpose of this patch kit is (B).  I'm interested in tackling (C)
and (D), but I think those are going to have to wait until ne

[PATCH][PR tree-optimization/67892] Use FSM threader to handle backedges

2015-10-29 Thread Jeff Law



As as been noted in the past, the old jump threader's support for 
threading across a loop backedge introduces significant complexity.  The 
most serious complexity is the need to handle processing statements a 
second time (as we come around the top of the loop).  This requires 
invalidation support, potentially introduces loops in the SSA_NAME_VALUE 
chains, etc.


I've speculated for a while that a backwards threader with a more 
sensible region copying scheme was the right long term direction. 
Sebastian got us started on that direction when he introduced the FSM 
threader.  Essentially it walks def-use chains backwards from a 
conditional to try and find paths where an SSA_NAME has a constant 
value.  Those paths represent potentially optimizable jump threads and 
he's using a SEME region copier to handle the SSA & CFG updates.


BZ67892 is an issue with the old jump threader's backedge handling. 
Essentially when we thread across the backedge we're not completely 
invaliding the SSA_NAME_VALUE chains -- and it's not entirely clear if 
we can properly invalidate them.


So rather than bang my head against the wall writing more code to handle 
cases in the old threader of marginal value I've taken this bug as an 
opportunity to start exploiting the FSM threader more heavily.


The patches to date have been improving the FSM threader to handle cases 
it missed, or where we really didn't want it to change the CFG.  This 
patch actually disables the old threader's backedge handling and 
exclusively relies on the FSM threader to pick up those opportunities.



Bootstrapped and regression tested on x86_64-linux-gnu.  Applied to the 
trunk.


Next step is to start removing the now dead code from the old threader :-)


Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a088c13..430d361 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,21 @@
+2015-10-29  Jeff Law  
+
+   PR tree-optimization/67892
+   * tree-ssa-threadedge.c (simplify_controL_stmt_condition): Fix typo
+   in comment.
+   (thread_through_normal_block): If we have seen a backedge, then
+   do nothing.  No longer call find_jump_threads_backwards here.
+   (thread_across_edge): Use find_jump_threads_backwards to find
+   jump threads if the old style threader was not successful.
+   * tree-ssa-threadbackward.c (get_gimple_control_stmt): Use
+   gsi_last_nondebug_bb.  Return NULL if the block does not end
+   with a control statement.
+   (find_jump_threads_backwards): Setup code moved here from
+   tree-ssa-threadedge.c::thread_through_normal_block.  Accept
+   single edge argument instead of name & block.
+   * tree-ssa-threadbackward.h (find_jump_threads_backwards): Update
+   prototype.
+
 2015-10-29  Richard Biener  
 
PR middle-end/68142
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9a7d545..223554a 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2015-10-29  Jeff Law  
+
+PR tree-optimization/67892
+   * gcc.dg/tree-ssa/pr21417: Update expected output.
+   * gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Likewise.
+
 2015-10-29  Richard Biener  
 
PR middle-end/68142
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
index b67dd18..fed6b31 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
@@ -49,5 +49,5 @@ L23:
 /* We should thread the backedge to the top of the loop; ie we only
execute the if (expr->common.code != 142) test once per loop
iteration.  */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump-times "FSM jump thread" 1 "dom2" } } */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c
index 2f17517..909009a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-2b.c
@@ -25,6 +25,5 @@ void thread_latch_through_header (void)
 /* Threading the latch to a later point in the loop is safe in this
case.  And we want to thread through the header as well.  These
are both caught by threading in DOM.  */
-/* { dg-final { scan-tree-dump-not "Jumps threaded" "vrp1"} } */
-/* { dg-final { scan-tree-dump-times "Jumps threaded: 1" 0 "dom1"} } */
-/* { dg-final { scan-tree-dump-times "Jumps threaded: 2" 1 "dom1"} } */
+/* { dg-final { scan-tree-dump-not "Jumps threaded" "dom1"} } */
+/* { dg-final { scan-tree-dump-times "Jumps threaded: 2" 1 "vrp1"} } */
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index cfb4ace..90e01df 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -37,19 +37,22 @@ along with GCC; see the file COPYING3.  If not see
 static int max_threaded_paths;
 
 /* Simple helper to get the last statement from BB, which is assumed
-   to be a contro

Re: Fix real_2expN mode arguments in fixed-value.c

2015-10-29 Thread Richard Biener
On October 29, 2015 4:28:05 PM GMT+01:00, Richard Sandiford 
 wrote:
>fixed-value.c was passing a fixed-point mode to the floating-point
>real_2expN routine.  That didn't cause a problem in practice because
>all real_2expN did was check for decimal float modes, but it triggered
>a failure with an upcoming patch.
>
>Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
>OK to install?

OK.

Richard.

>Thanks,
>Richard
>
>
>gcc/
>   * fixed-value.c (check_real_for_fixed_mode, fixed_from_string)
>   (fixed_to_decimal, fixed_convert_from_real)
>   (real_convert_from_fixed): Fix mode arguments to real_2expN.
>
>diff --git a/gcc/fixed-value.c b/gcc/fixed-value.c
>index 39bdebe..a38045f 100644
>--- a/gcc/fixed-value.c
>+++ b/gcc/fixed-value.c
>@@ -64,8 +64,8 @@ check_real_for_fixed_mode (REAL_VALUE_TYPE
>*real_value, machine_mode mode)
> {
>   REAL_VALUE_TYPE max_value, min_value, epsilon_value;
> 
>-  real_2expN (&max_value, GET_MODE_IBIT (mode), mode);
>-  real_2expN (&epsilon_value, -GET_MODE_FBIT (mode), mode);
>+  real_2expN (&max_value, GET_MODE_IBIT (mode), VOIDmode);
>+  real_2expN (&epsilon_value, -GET_MODE_FBIT (mode), VOIDmode);
> 
>   if (SIGNED_FIXED_POINT_MODE_P (mode))
> min_value = real_value_negate (&max_value);
>@@ -127,7 +127,7 @@ fixed_from_string (FIXED_VALUE_TYPE *f, const char
>*str, machine_mode mode)
>   || (temp == FIXED_MAX_EPS && ALL_ACCUM_MODE_P (f->mode)))
> warning (OPT_Woverflow,
>"large fixed-point constant implicitly truncated to fixed-point
>type");
>-  real_2expN (&base_value, fbit, mode);
>+  real_2expN (&base_value, fbit, VOIDmode);
>   real_arithmetic (&fixed_value, MULT_EXPR, &real_value, &base_value);
>   wide_int w = real_to_integer (&fixed_value, &fail,
>   GET_MODE_PRECISION (mode));
>@@ -158,7 +158,7 @@ fixed_to_decimal (char *str, const FIXED_VALUE_TYPE
>*f_orig,
>   REAL_VALUE_TYPE real_value, base_value, fixed_value;
> 
>signop sgn = UNSIGNED_FIXED_POINT_MODE_P (f_orig->mode) ? UNSIGNED :
>SIGNED;
>-  real_2expN (&base_value, GET_MODE_FBIT (f_orig->mode),
>f_orig->mode);
>+  real_2expN (&base_value, GET_MODE_FBIT (f_orig->mode), VOIDmode);
>   real_from_integer (&real_value, VOIDmode,
>wide_int::from (f_orig->data,
>GET_MODE_PRECISION (f_orig->mode), sgn),
>@@ -1052,7 +1052,7 @@ fixed_convert_from_real (FIXED_VALUE_TYPE *f,
>machine_mode mode,
> 
>   real_value = *a;
>   f->mode = mode;
>-  real_2expN (&base_value, fbit, mode);
>+  real_2expN (&base_value, fbit, VOIDmode);
>   real_arithmetic (&fixed_value, MULT_EXPR, &real_value, &base_value);
> 
>   wide_int w = real_to_integer (&fixed_value, &fail,
>@@ -1104,7 +1104,7 @@ real_convert_from_fixed (REAL_VALUE_TYPE *r,
>machine_mode mode,
>   REAL_VALUE_TYPE base_value, fixed_value, real_value;
> 
>signop sgn = UNSIGNED_FIXED_POINT_MODE_P (f->mode) ? UNSIGNED : SIGNED;
>-  real_2expN (&base_value, GET_MODE_FBIT (f->mode), f->mode);
>+  real_2expN (&base_value, GET_MODE_FBIT (f->mode), VOIDmode);
>   real_from_integer (&fixed_value, VOIDmode,
>wide_int::from (f->data, GET_MODE_PRECISION (f->mode),
>sgn), sgn);




Re: [Docs] Reword the documentation for -fdump-rtl-

2015-10-29 Thread Sandra Loosemore

On 10/29/2015 09:52 AM, James Greenhalgh wrote:

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 76fdc31..148f063 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6665,17 +6665,19 @@ Says to make debugging dumps during compilation at 
times specified by
 @var{letters}.  This is used for debugging the RTL-based passes of the
 compiler.  The file names for most of the dumps are made by appending
 a pass number and a word to the @var{dumpname}, and the files are
-created in the directory of the output file. In case of
+created in the directory of the output file.  In case of
 @option{=@var{filename}} option, the dump is output on the given file
-instead of the pass numbered dump files. Note that the pass number is
-computed statically as passes get registered into the pass manager.
-Thus the numbering is not related to the dynamic order of execution of
-passes.  In particular, a pass installed by a plugin could have a
-number over 200 even if it executed quite early.  @var{dumpname} is
-generated from the name of the output file, if explicitly specified
-and it is not an executable, otherwise it is the basename of the
-source file. These switches may have different effects when
-@option{-E} is used for preprocessing.
+instead of the pass numbered dump files.  Note that the pass number is
+assigned as passes are registered into the pass manager.  Most passes
+will be registered in the order that they will execute and so often the


s/will be registered/are registered/


+number will correspond to the pass execution order.  However, passes


s/will correspond/corresponds/


+registered by plugins, which are specific to compilation targets, or
+which are otherwise registered after all the other passes, will be


This is ambiguous.  Do you mean

"passes registered by plugins (which are specific to compilation targets 
or otherwise registered after all the other passes)"


or

"passes registered by plugins, passes specific to compilation targets, 
or passes that are otherwise registered after all the other passes"


??

In either case, also please s/, will be/ are/.


+numbered higher than a pass named "final", even if they were executed


s/were/are/


+earlier.  @var{dumpname} is generated from the name of the output
+file, if explicitly specified and it is not an executable, otherwise it


file if it is explicitly specified and not an executable, otherwise it


+is the basename of the source file.  These switches may have different
+effects when @option{-E} is used for preprocessing.

 Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
 @option{-d} option @var{letters}.  Here are the possible


-Sandra the nit-picky



Re: Allow real_format to be passed to more real.h functions

2015-10-29 Thread Richard Biener
On October 29, 2015 4:53:59 PM GMT+01:00, Richard Sandiford 
 wrote:
>Most real.h routines used machine modes to specify the format of
>an operation and converted that to a float_format * internally.
>Some also had alternative versions that accepted a float_format *.
>
>In an upcoming patch it seemed more convenient for the callers
>I was adding to use float_format directly, since the callers need
>to examine the format themselves for other reasons.  This patch
>therefore replaces the machine_mode arguments with a new class that
>allows both machine modes and float_format pointers to be used.
>
>Tested on x86_64-linux-gnu, arm-linux-gnueabi and aarch64-linux-gnu.
>OK to install?

OK.

Richard.

>Thanks,
>Richard
>
>
>gcc/
>   * real.h (format_helper): New.
>   (real_convert, exact_real_truncate, real_from_string3, real_to_target)
>   (real_from_target, real_nan, real_2expN, real_value_truncate)
>   (significand_size, real_from_string2, exact_real_inverse)
>   (exact_real_inverse, real_powi, real_trunc, real_floor, real_ceil)
>   (real_round, real_isinteger, real_from_integer): Replace
>   machine_mode arguments with format_helper arguments.
>   * real.c (exact_real_inverse, real_from_string2, real_from_string3)
>   (real_from_integer, real_nan, real_2expN, real_convert)
>   (real_value_truncate, exact_real_truncate, real_to_target)
>   (real_from_target, significand_size, real_powi, real_trunc)
>   (real_floor, real_ceil, real_round, real_isinteger): Replace
>   machine_mode arguments with format_helper arguments.
>   (real_to_target_fmt, real_from_target_fmt): Delete.
>   * dfp.h (decimal_real_convert): Replace mode argument with
>real_format.
>   * dfp.c (decimal_to_binary, decimal_real_convert): Replace mode
>   argument with real_format.
>   * builtins.c (do_real_to_int_conversion): Update type of fn argument.
>
>gcc/java/
>   * jcf-parse.c (get_constant): Use real_from_target rather than
>   real_from_target_fmt.
>
>diff --git a/gcc/builtins.c b/gcc/builtins.c
>index 248c009..583a68e 100644
>--- a/gcc/builtins.c
>+++ b/gcc/builtins.c
>@@ -7275,7 +7275,7 @@ fold_builtin_strlen (location_t loc, tree type,
>tree arg)
> 
> static tree
> do_real_to_int_conversion (tree itype, tree arg,
>- void (*fn) (REAL_VALUE_TYPE *, machine_mode,
>+ void (*fn) (REAL_VALUE_TYPE *, format_helper,
>  const REAL_VALUE_TYPE *))
> {
>   if (TREE_CODE (arg) != REAL_CST || TREE_OVERFLOW (arg))
>diff --git a/gcc/dfp.c b/gcc/dfp.c
>index ceb43d1..f3a3b6f 100644
>--- a/gcc/dfp.c
>+++ b/gcc/dfp.c
>@@ -343,13 +343,13 @@ decode_decimal128 (const struct real_format *fmt
>ATTRIBUTE_UNUSED,
> 
> static void
> decimal_to_binary (REAL_VALUE_TYPE *to, const REAL_VALUE_TYPE *from,
>- machine_mode mode)
>+ const real_format *fmt)
> {
>   char string[256];
>   const decimal128 *const d128 = (const decimal128 *) from->sig;
> 
>   decimal128ToString (d128, string);
>-  real_from_string3 (to, string, mode);
>+  real_from_string3 (to, string, fmt);
> }
> 
> 
>@@ -459,15 +459,13 @@ decimal_round_for_format (const struct
>real_format *fmt, REAL_VALUE_TYPE *r)
>binary and decimal types.  */
> 
> void
>-decimal_real_convert (REAL_VALUE_TYPE *r, machine_mode mode,
>+decimal_real_convert (REAL_VALUE_TYPE *r, const real_format *fmt,
> const REAL_VALUE_TYPE *a)
> {
>-  const struct real_format *fmt = REAL_MODE_FORMAT (mode);
>-
>   if (a->decimal && fmt->b == 10)
> return;
>   if (a->decimal)
>-  decimal_to_binary (r, a, mode);
>+  decimal_to_binary (r, a, fmt);
>   else
>   decimal_from_binary (r, a);
> }
>diff --git a/gcc/dfp.h b/gcc/dfp.h
>index 013de8b..a3653c9 100644
>--- a/gcc/dfp.h
>+++ b/gcc/dfp.h
>@@ -34,7 +34,8 @@ void encode_decimal128 (const struct real_format
>*fmt, long *, const REAL_VALUE_
>int  decimal_do_compare (const REAL_VALUE_TYPE *, const REAL_VALUE_TYPE
>*, int);
> void decimal_real_from_string (REAL_VALUE_TYPE *, const char *);
>void decimal_round_for_format (const struct real_format *,
>REAL_VALUE_TYPE *);
>-void decimal_real_convert (REAL_VALUE_TYPE *, machine_mode, const
>REAL_VALUE_TYPE *);
>+void decimal_real_convert (REAL_VALUE_TYPE *, const real_format *,
>+ const REAL_VALUE_TYPE *);
>void decimal_real_to_decimal (char *, const REAL_VALUE_TYPE *, size_t,
>size_t, int);
>void decimal_do_fix_trunc (REAL_VALUE_TYPE *, const REAL_VALUE_TYPE *);
> void decimal_real_maxval (REAL_VALUE_TYPE *, int, machine_mode);
>diff --git a/gcc/java/jcf-parse.c b/gcc/java/jcf-parse.c
>index bb6e743..88f1a06 100644
>--- a/gcc/java/jcf-parse.c
>+++ b/gcc/java/jcf-parse.c
>@@ -1061,7 +1061,7 @@ get_constant (JCF *jcf, int index)
>   long buf = num;
>   REAL_VALUE_TYPE d;
> 
>-  real_from_target_fmt (&d, &buf, &ieee_single_format);
>+  real_from_target (&d, &buf, &ieee_single_f

Re: Robustify REAL_MODE_FORMAT

2015-10-29 Thread Richard Biener
On October 29, 2015 4:33:17 PM GMT+01:00, Bernd Schmidt  
wrote:
>On 10/29/2015 04:30 PM, Richard Sandiford wrote:
>> Make sure that REAL_MODE_FORMAT aborts if it is passed an invalid
>mode,
>> rather than stepping beyond the bounds of an array.  It turned out
>that
>> some code was passing non-float modes to the real.h routines.
>
>> gcc/
>>  * real.h (REAL_MODE_FORMAT): Abort if the mode isn't a
>>  SCALAR_FLOAT_MODE_P.
>
>I'm assuming that the code you mention has already been fixed so that
>we 
>don't trigger the abort. Ok.

Rather than the weird macro can't we turn real_mode_format to an inline 
function?

Richard.

>
>Bernd




Re: [PATCH 6/9] ENABLE_CHECKING refactoring: generators

2015-10-29 Thread Jeff Law

On 10/18/2015 05:55 PM, Mikhail Maltsev wrote:

On 10/06/2015 03:56 PM, Richard Biener wrote:

The generators should simply unconditionally check (not in generated
files, of course).
And the generated code parts should use flag_checking.

Richard.


genautomata has some macros similar to tree checks, so I avoided changing them.
genconditions for some reason #undef-s ENABLE_CHECKING in the generated code. I
did not look at it in details, but decided to simply #define CHECKING_P to 0 for
consistency.

As for genextract and gengtype, I followed your recommendations, that is, used
flag_checking instead of CHECKING_P in genextract, and always enable debugging
functions in gengtype.
I committed this with the trivial fixes to the comments after the 
#else/#endif lines.


Jeff



[PATCH 1/4] -Wmisleading-indentation: don't warn in presence of entirely blank lines

2015-10-29 Thread David Malcolm
Attempting to bootstrap gcc with -Wmisleading-indentation enabled I ran
into a few failures where the indentation, although bad, was arguably
not misleading.

In regrename.c:scan_rtx_address:

  1308  case PRE_MODIFY:
  1309/* If the target doesn't claim to handle autoinc, this must be
  1310   something special, like a stack push.  Kill this chain.  */
  1311  if (!AUTO_INC_DEC)
  1312action = mark_all_read;
  1313
  1314break;
  ^ this is indented at the same level as the "action =" code,
  but clearly isn't guarded by the if () at line 1311.

In gcc/fortran/io.c:gfc_match_open:

  1997  {
  1998static const char *delim[] = { "APOSTROPHE", "QUOTE", "NONE", 
NULL };
  1999
  2000  if (!is_char_type ("DELIM", open->delim))
  2001goto cleanup;
  2002
  2003if (!compare_to_allowed_values ("DELIM", delim, NULL, NULL,
  2004
open->delim->value.character.string,
  2005"OPEN", warn))
  ^ this is indented with the "goto cleanup;" due to
lines 2000-2001 not being indented enough, but
line 2003 clearly isn't guarded by the
"if (!is_char_type" conditional.

In gcc/function.c:locate_and_pad_parm:

  4118locate->slot_offset.constant = -initial_offset_ptr->constant;
  4119if (initial_offset_ptr->var)
  4120  locate->slot_offset.var = size_binop (MINUS_EXPR, ssize_int (0),
  4121initial_offset_ptr->var);
  4122
  4123  {
  4124tree s2 = sizetree;
  4125if (where_pad != none
  4126&& (!tree_fits_uhwi_p (sizetree)
  4127|| (tree_to_uhwi (sizetree) * BITS_PER_UNIT) % 
round_boundary))
  4128  s2 = round_up (s2, round_boundary / BITS_PER_UNIT);
  4129SUB_PARM_SIZE (locate->slot_offset, s2);
  4130  }
^ this block is not guarded by the
  "if (initial_offset_ptr->var)"
  and the whitespace line (4122) is likely to make a
  human reader of the code read it that way also.

In each case, a blank line separated the guarded code from followup code
that wasn't guarded, and to my eyes, the blank line makes the meaning of
the badly-indented code sufficiently clear that it seems unjustified to
issue a -Wmisleading-indentation warning.

The attached patch suppresses the warning for such a case.

OK for trunk?

gcc/c-family/ChangeLog:
* c-indentation.c (line_is_blank_p): New function.
(separated_by_blank_lines_p): New function.
(should_warn_for_misleading_indentation): Don't warn if the
guarded and unguarded code are separated by one or more entirely
blank lines.

gcc/testsuite/ChangeLog:
* c-c++-common/Wmisleading-indentation.c (fn_40): New function.
---
 gcc/c-family/c-indentation.c   | 58 ++
 .../c-c++-common/Wmisleading-indentation.c | 32 
 2 files changed, 90 insertions(+)

diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index 5b119f7..d9d4380 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -77,6 +77,42 @@ get_visual_column (expanded_location exploc,
   return true;
 }
 
+/* Determine if the given source line of length LINE_LEN is entirely blank,
+   or contains one or more non-whitespace characters.  */
+
+static bool
+line_is_blank_p (const char *line, int line_len)
+{
+  for (int i = 0; i < line_len; i++)
+if (!ISSPACE (line[i]))
+  return false;
+
+  return true;
+}
+
+/* Helper function for should_warn_for_misleading_indentation.
+   Determines if there are one or more blank lines between the
+   given source lines.  */
+
+static bool
+separated_by_blank_lines_p (const char *file,
+   int start_line, int end_line)
+{
+  gcc_assert (start_line < end_line);
+  for (int line_num = start_line; line_num < end_line; line_num++)
+{
+  int line_len;
+  const char *line = location_get_source_line (file, line_num, &line_len);
+  if (!line)
+   return false;
+
+  if (line_is_blank_p (line, line_len))
+   return true;
+}
+
+  return false;
+}
+
 /* Does the given source line appear to contain a #if directive?
(or #ifdef/#ifndef).  Ignore the possibility of it being inside a
comment, for simplicity.
@@ -333,6 +369,12 @@ should_warn_for_misleading_indentation (const 
token_indent_info &guard_tinfo,
  ;
  foo ();
  ^ DON'T WARN HERE
+
+if (flag)
+   foo ();
+
+   bar ();
+   ^ DON'T WARN HERE: blank line between guarded and unguarded code
   */
   if (next_stmt_exploc.line > body_exploc.line)
 {
@@ -358,6 +400,22 @@ should_warn_for_misleading_i

[PATCH 3/4] Fix misleading indentation in tree-nested.c

2015-10-29 Thread David Malcolm
tree-nested.c has this code:

  2333  for (c = gimple_omp_taskreg_clauses (stmt);
  2334   c;
  2335   c = OMP_CLAUSE_CHAIN (c))
  2336if ((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE
  2337 || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_SHARED)
  2338&& OMP_CLAUSE_DECL (c) == decl)
  2339  break;
  2340if (c == NULL && gimple_code (stmt) != GIMPLE_OMP_TARGET)
  2341  {

which leads to this warning from -Wmisleading-indentation
(justifiably, in my opinion):

../../../src/gcc/tree-nested.c: In function ‘tree_node* 
convert_tramp_reference_stmt(gimple_stmt_iterator*, bool*, walk_stmt_info* ’:
../../../src/gcc/tree-nested.c:2340:8: error: statement is indented as if it 
were guarded by... [-Werror=misleading-indentation]
if (c == NULL && gimple_code (stmt) != GIMPLE_OMP_TARGET)
^
../../../src/gcc/tree-nested.c:2333:6: note: ...this ‘for’ clause, but it is not
  for (c = gimple_omp_taskreg_clauses (stmt);
  ^

This patch fixes the indentation of lines 2340-2360 to make it clear
that they're not part of the "for" loop.

gcc/ChangeLog:
* tree-nested.c (convert_tramp_reference_stmt): Fix indentation.
---
 gcc/tree-nested.c | 42 +-
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index c04c429..a5435f9 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -2337,27 +2337,27 @@ convert_tramp_reference_stmt (gimple_stmt_iterator 
*gsi, bool *handled_ops_p,
   || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_SHARED)
  && OMP_CLAUSE_DECL (c) == decl)
break;
- if (c == NULL && gimple_code (stmt) != GIMPLE_OMP_TARGET)
-   {
- c = build_omp_clause (gimple_location (stmt),
-   i ? OMP_CLAUSE_FIRSTPRIVATE
- : OMP_CLAUSE_SHARED);
- OMP_CLAUSE_DECL (c) = decl;
- OMP_CLAUSE_CHAIN (c) = gimple_omp_taskreg_clauses (stmt);
- gimple_omp_taskreg_set_clauses (stmt, c);
-   }
- else if (c == NULL)
-   {
- c = build_omp_clause (gimple_location (stmt),
-   OMP_CLAUSE_MAP);
- OMP_CLAUSE_DECL (c) = decl;
- OMP_CLAUSE_SET_MAP_KIND (c,
-  i ? GOMP_MAP_TO : GOMP_MAP_TOFROM);
- OMP_CLAUSE_SIZE (c) = DECL_SIZE_UNIT (decl);
- OMP_CLAUSE_CHAIN (c) = gimple_omp_target_clauses (stmt);
- gimple_omp_target_set_clauses (as_a  (stmt),
-c);
-   }
+   if (c == NULL && gimple_code (stmt) != GIMPLE_OMP_TARGET)
+ {
+   c = build_omp_clause (gimple_location (stmt),
+ i ? OMP_CLAUSE_FIRSTPRIVATE
+ : OMP_CLAUSE_SHARED);
+   OMP_CLAUSE_DECL (c) = decl;
+   OMP_CLAUSE_CHAIN (c) = gimple_omp_taskreg_clauses (stmt);
+   gimple_omp_taskreg_set_clauses (stmt, c);
+ }
+   else if (c == NULL)
+ {
+   c = build_omp_clause (gimple_location (stmt),
+ OMP_CLAUSE_MAP);
+   OMP_CLAUSE_DECL (c) = decl;
+   OMP_CLAUSE_SET_MAP_KIND (c,
+i ? GOMP_MAP_TO : GOMP_MAP_TOFROM);
+   OMP_CLAUSE_SIZE (c) = DECL_SIZE_UNIT (decl);
+   OMP_CLAUSE_CHAIN (c) = gimple_omp_target_clauses (stmt);
+   gimple_omp_target_set_clauses (as_a  (stmt),
+  c);
+ }
  }
info->new_local_var_chain = save_local_var_chain;
info->static_chain_added |= save_static_chain_added;
-- 
1.8.5.3



[PATCH 0/4] -Wmisleading-indentation

2015-10-29 Thread David Malcolm
The following patches:
  * tweak -Wmisleading-indentation to suppress a kind of false
positive that occurred in three places in gcc's source
tree, and 

  * fix the two remaining truly misleadingly-indented places
where the warning fires

  * adds -Wmisleading-indentation to -Wall

Bootstrapped®rtested the combination of patches with
x86_64-pc-linux-gnu; config-list.mk build using
gcc6 with -Wmisleading-indentation enabled is in-progress.

OK for trunk?

David Malcolm (4):
  -Wmisleading-indentation: don't warn in presence of entirely blank
lines
  Fix misleading indentation in tree-ssa-loop-unswitch.c
  Fix misleading indentation in tree-nested.c
  Add -Wmisleading-indentation to -Wall

 gcc/c-family/c-indentation.c   | 58 ++
 gcc/c-family/c.opt |  2 +-
 gcc/doc/invoke.texi|  5 +-
 .../c-c++-common/Wmisleading-indentation.c | 32 
 gcc/tree-nested.c  | 42 
 gcc/tree-ssa-loop-unswitch.c   |  2 +-
 6 files changed, 116 insertions(+), 25 deletions(-)

-- 
1.8.5.3


[PATCH 2/4] Fix misleading indentation in tree-ssa-loop-unswitch.c

2015-10-29 Thread David Malcolm
tree-ssa-loop-unswitch.c has this bad indentation at line 452:

   449if (dump_file && (dump_flags & TDF_DETAILS))
   450  fprintf (dump_file, ";; Not unswitching, loop is not expected"
   451   " to iterate\n");
   452  return false;

which leads to this warning from -Wmisleading-indentation (justifiably,
in my opinion):

../../../src/gcc/tree-ssa-loop-unswitch.c: In function ‘bool 
tree_unswitch_outer_loop(loop*)’:
../../../src/gcc/tree-ssa-loop-unswitch.c:452:2: error: statement is indented 
as if it were guarded by... [-Werror=misleading-indentation]
  return false;
  ^
../../../src/gcc/tree-ssa-loop-unswitch.c:449:7: note: ...this ‘if’ clause, but 
it is not
   if (dump_file && (dump_flags & TDF_DETAILS))
   ^

This patch fixes the indentation of the "return false;"

gcc/ChangeLog:
* tree-ssa-loop-unswitch.c (tree_unswitch_outer_loop): Fix
indentation.
---
 gcc/tree-ssa-loop-unswitch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 2edc000..7a287da 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -449,7 +449,7 @@ tree_unswitch_outer_loop (struct loop *loop)
   if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, ";; Not unswitching, loop is not expected"
 " to iterate\n");
-   return false;
+  return false;
 }
 
   guard = find_loop_guard (loop);
-- 
1.8.5.3



Re: Robustify REAL_MODE_FORMAT

2015-10-29 Thread Richard Sandiford
Richard Biener  writes:
> On October 29, 2015 4:33:17 PM GMT+01:00, Bernd Schmidt
>  wrote:
>>On 10/29/2015 04:30 PM, Richard Sandiford wrote:
>>> Make sure that REAL_MODE_FORMAT aborts if it is passed an invalid
>>mode,
>>> rather than stepping beyond the bounds of an array.  It turned out
>>that
>>> some code was passing non-float modes to the real.h routines.
>>
>>> gcc/
>>> * real.h (REAL_MODE_FORMAT): Abort if the mode isn't a
>>> SCALAR_FLOAT_MODE_P.
>>
>>I'm assuming that the code you mention has already been fixed so that
>>we 
>>don't trigger the abort. Ok.
>
> Rather than the weird macro can't we turn real_mode_format to an inline
> function?

It needs to be an lvalue for things like:

REAL_MODE_FORMAT (TFmode) = &ibm_extended_format;

I suppose we could return a non-const reference, but I'd rather stay
clear of returning those :-)

Thanks,
Richard



  1   2   >