Re: [PATCH] Rewrite NAN and sign handling in frange

2022-09-18 Thread Aldy Hernandez via Gcc-patches
Pushed.  We can address any further changes as follow-ups.

Thanks.
Aldy

On Fri, Sep 16, 2022 at 3:26 PM Aldy Hernandez  wrote:
>
> On Fri, Sep 16, 2022 at 10:33 AM Richard Sandiford
>  wrote:
> >
> > Aldy Hernandez via Gcc-patches  writes:
> > > On Thu, Sep 15, 2022 at 9:06 AM Richard Biener
> > >  wrote:
> > >>
> > >> On Thu, Sep 15, 2022 at 7:41 AM Aldy Hernandez  wrote:
> > >> >
> > >> > Hi Richard.  Hi all.
> > >> >
> > >> > The attatched patch rewrites the NAN and sign handling, dropping both
> > >> > tristates in favor of a pair of boolean flags for NANs, and nothing at
> > >> > all for signs.  The signs are tracked in the range itself, so now it's
> > >> > possible to describe things like [-0.0, +0.0] +NAN, [+0, +0], [-5, +0],
> > >> > [+0, 3] -NAN, etc.
> > >> >
> > >> > There are a lot of changes, as the tristate was quite pervasive.  I
> > >> > could use another pair of eyes.  The code IMO is cleaner and handles
> > >> > all the cases we discussed.
> > >> >
> > >> > Here is an example of the various ranges and how they are displayed:
> > >> >
> > >> > [frange] float VARYING NAN ;; Varying includes NAN
> > >> > [frange] UNDEFINED  ;; Empty set as always
> > >> > [frange] float [] NAN   ;; Unknown sign NAN
> > >> > [frange] float [] -NAN  ;; -NAN
> > >> > [frange] float [] +NAN  ;; +NAN
> > >> > [frange] float [-0.0, 0.0]  ;; All zeros.
> > >> > [frange] float [-0.0, -0.0] NAN ;; -0 or NAN.
> > >> > [frange] float [-5.0e+0, -1.0e+0] +NAN  ;; [-5, -1] or +NAN
> > >> > [frange] float [-5.0e+0, -0.0] NAN  ;; [-5, -0] or +-NAN
> > >> > [frange] float [-5.0e+0, -0.0]  ;; [-5, -0]
> > >> > [frange] float [5.0e+0, 1.0e+1] ;; [5, 10]
> > >> >
> > >> > We could represent an unknown sign with +NAN -NAN if preferred.
> > >>
> > >> maybe -+NAN or +-NAN?  I prefer to somehow show both signs for clarity
> > >
> > > Sure.
> > >
> > >>
> > >> >
> > >> > Notice the NAN signs are decoupled from the range, so we can represent
> > >> > a negative range with a positive NAN.  For this range,
> > >> > frange::known_bit() would return false, as only when the signs of the
> > >> > NANs and range agree can we be certain.
> > >> >
> > >> > There is no longer any pessimization of ranges for intersects
> > >> > involving NANs.  Also, union and intersect work with signed zeros:
> > >> >
> > >> > //   [-0,  x] U [+0,  x] => [-0,  x]
> > >> > //   [ x, -0] U [ x, +0] => [ x, +0]
> > >> > //   [-0,  x] ^ [+0,  x] => [+0,  x]
> > >> > //   [ x, -0] ^ [ x, +0] => [ x, -0]
> > >> >
> > >> > The special casing for signed zeros in the singleton code is gone in
> > >> > favor of just making sure the signs in the range agree, that is
> > >> > [-0, -0] for example.
> > >> >
> > >> > I have removed the idea that a known NAN is a "range", so a NAN is no
> > >> > longer in the endpoints itself.  Requesting the bound of a known NAN
> > >> > is a hard fail.  For that matter, we don't store the actual NAN in the
> > >> > range.  The only information we have are the set of boolean flags.
> > >> > This way we make sure nothing seeps into the frange.  This also means
> > >> > it's explicit that we don't track anything but the sign in NANs.  We
> > >> > can revisit this if we desire to track signalling or whatever
> > >> > concoction y'all can imagine.
> > >> >
> > >> > All in all, I'm quite happy with this.  It does look better, and we
> > >> > handle all the corner cases we couldn't before.  Thanks for the
> > >> > suggestion.
> > >> >
> > >> > Regstrapped with mpfr tests on x86-64 and ppc64le Linux.  Selftests
> > >> > were also run with -ffinite-math-only on x86-64.
> > >> >
> > >> > At Jakub's suggestion, I built lapack with associated tests.  They
> > >> > pass on x86-64 and ppc64le Linux with no regressions from mainline.
> > >> > As a sanity check, I also ran them for -ffinite-math-only on x86 which
> > >> > (as expected) returned:
> > >> >
> > >> > NaN arithmetic did not perform per the ieee spec
> > >> >
> > >> > Otherwise, all tests pass for -ffinite-math-only.
> > >> >
> > >> > How does this look?
> > >>
> > >> Overall it looks good.
> > >>
> > >> Reading ::intersect and ::union I find it less clear to spread out the 
> > >> _nan
> > >> cases into separate functions.
> > >
> > > OK, will inline them.
> > >
> > >>
> > >> Can you add a comment to frange that its representation is
> > >> a single value-range specified by m_type, m_min, m_max
> > >> unioned with the set of { -NaN, +NaN }?  Because somehow
> > >> the ::undefined_p vs. m_type == VR_UNDEFINED checks are
> > >> a bit confusing to the occasional reader can we instead use
> > >> ::nan_p to complement ::undefined_p?
> > >
> > > Wouldn't that just make nan_p the same as known_nan?  Speaking of
> > > which, I'm not a big fan of known_nan.  Perhaps we should rename all
> > > the known_foo variants to foo_p variants?  Or...ma

Re: [Patch] OpenMP, libgomp, gimple: omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices

2022-09-18 Thread Marcel Vollweiler

Hi Jakub,

The last version of this patch was slightly adapted to the latest changes of the
device-specific environment variable syntax
(https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601145.html), also
considering the latest related bug fixes (commits 994ea892bd02d and 
7d37c7f67c1bb).

The new patch version was bootstrapped and tested on x86_64-linux with nvptx and
amdgcn offloading without regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
This patch adds support for omp_get_max_teams, omp_set_num_teams, and
omp_{gs}et_teams_thread_limit on offload devices. That includes the usage of
device-specific ICV values (specified as environment variables or changed on a
device). In order to reuse device-specific ICV values, a copy back mechanism is
implemented that copies ICV values back from device to the host. 

gcc/ChangeLog:

* gimplify.cc (optimize_target_teams): Set initial num_teams_upper
to "-2" instead of "1" for non-existing num_teams clause in order to
disambiguate from the case of an existing num_teams clause with value 1.

libgomp/ChangeLog:

* config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to
allow processing of device-specific values.
(omp_set_teams_thread_limit): Likewise.
(ialias): Likewise.
* config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise.
(omp_set_teams_thread_limit): Likewise.
(ialias): Likewise.
* icv-device.c (omp_get_teams_thread_limit): Likewise.
(ialias): Likewise.
(omp_set_teams_thread_limit): Likewise.
* icv.c (omp_set_teams_thread_limit): Removed.
(omp_get_teams_thread_limit): Likewise.
(ialias): Likewise.
* target.c (get_gomp_offload_icvs): Added teams_thread_limit_var
handling.
(gomp_load_image_to_device): Added a size check for the ICVs struct
variable.
(gomp_copy_back_icvs): New function that is used in GOMP_target_ext to
copy back the ICV values from device to host.
(GOMP_target_ext): Update the number of teams and threads in the kernel
args also considering device-specific values.
* testsuite/libgomp.c-c++-common/icv-4.c: Bugfix.
* testsuite/libgomp.c-c++-common/icv-5.c: Extended.
* testsuite/libgomp.c-c++-common/icv-6.c: Extended.
* testsuite/libgomp.c-c++-common/icv-7.c: Extended.
* testsuite/libgomp.c-c++-common/icv-9.c: New test.
* testsuite/libgomp.fortran/icv-5.f90: New test.
* testsuite/libgomp.fortran/icv-6.f90: New test.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/target-teams-1.c: Adapt expected values for
num_teams from "1" to "-2" in cases without num_teams clause.
* g++.dg/gomp/target-teams-1.C: Likewise.
* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-6.f90: Likewise.

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index dcdc852..b393ed8 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -14153,7 +14153,7 @@ optimize_target_teams (tree target, gimple_seq *pre_p)
   struct gimplify_omp_ctx *target_ctx = gimplify_omp_ctxp;
 
   if (teams == NULL_TREE)
-num_teams_upper = integer_one_node;
+num_teams_upper = build_int_cst (integer_type_node, -2);
   else
 for (c = OMP_TEAMS_CLAUSES (teams); c; c = OMP_CLAUSE_CHAIN (c))
   {
diff --git a/gcc/testsuite/c-c++-common/gomp/target-teams-1.c 
b/gcc/testsuite/c-c++-common/gomp/target-teams-1.c
index 51b8d48..74d60e1 100644
--- a/gcc/testsuite/c-c++-common/gomp/target-teams-1.c
+++ b/gcc/testsuite/c-c++-common/gomp/target-teams-1.c
@@ -81,5 +81,5 @@ foo (int a, int b, long c, long d)
 /* { dg-final { scan-tree-dump-times "thread_limit\\(-1\\)" 3 "gimple" } } */
 /* { dg-final { scan-tree-dump-times "num_teams\\(0\\)" 4 "gimple" } } */
 /* { dg-final { scan-tree-dump-times "thread_limit\\(0\\)" 6 "gimple" } } */
-/* { dg-final { scan-tree-dump-times "num_teams\\(1\\)" 2 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "num_teams\\(-2\\)" 2 "gimple" } } */
 /* { dg-final { scan-tree-dump-times "thread_limit\\(1\\)" 0 "gimple" } } */
diff --git a/gcc/testsuite/g++.dg/gomp/target-teams-1.C 
b/gcc/testsuite/g++.dg/gomp/target-teams-1.C
index f78a608..29e5597 100644
--- a/gcc/testsuite/g++.dg/gomp/target-teams-1.C
+++ b/gcc/testsuite/g++.dg/gomp/target-teams-1.C
@@ -88,5 +88,5 @@ foo (int a, int b, long c, long d)
 /* { dg-final { scan-tree-dump-times "thread_limit\\(-1\\)" 3 "gimple" } } */
 /* { dg-final { scan-tree-dump-times "num_teams\\(0\\)" 4 "gimple" } } */
 /* { dg-final { scan-tree-dump-times "thread_limit\\(0\\)" 6 "gimple" } } */
-/* { dg-final { scan-tree-dump

Re: [PATCH] Document -fexcess-precision=16 in tm.texi

2022-09-18 Thread Palmer Dabbelt

On Fri, 09 Sep 2022 02:46:40 PDT (-0700), Palmer Dabbelt wrote:

I just happened to stuble on this one while trying to sort out the
RISC-V bits.

gcc/ChangeLog

* doc/tm.texi (TARGET_C_EXCESS_PRECISION): Add 16.
---
 gcc/doc/tm.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 858bfb80cec..7590924f2ca 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1009,7 +1009,7 @@ of the excess precision explicitly added.  For
 @code{EXCESS_PRECISION_TYPE_FLOAT16}, and
 @code{EXCESS_PRECISION_TYPE_FAST}, the target should return the
 explicit excess precision that should be added depending on the
-value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{]}}.
+value set for @option{-fexcess-precision=@r{[}standard@r{|}fast@r{|}16@r{]}}.
 Note that unpredictable explicit excess precision does not make sense,
 so a target should never return @code{FLT_EVAL_METHOD_UNPREDICTABLE}
 when @var{type} is @code{EXCESS_PRECISION_TYPE_STANDARD},


Just pinging this one as I'm not sure if it's OK to self-approve -- no 
rush on my end, I already figured it out so I don't need the 
documentation any more.


Re: [PATCH] RISC-V: Don't try to vectorize tree-ssa/gen-vect-34.c

2022-09-18 Thread Palmer Dabbelt

On Fri, 02 Sep 2022 18:28:10 PDT (-0700), Palmer Dabbelt wrote:

We don't yet support vectorization on RISC-V.

gcc/testsuite/ChangeLog

* gcc.dg/tree-ssa/gen-vect-34.c: Skip RISC-V targets.
---
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c 
b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
index 8d2d36401fe..41877e05efd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
@@ -13,4 +13,4 @@ float summul(int n, float *arg1, float *arg2)
 return res1;
 }

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
! { avr-*-* pru-*-* } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { 
! { avr-*-* pru-*-* riscv*-*-* } } } } } */


Committed.


Re: [PING^5] nvptx: Allow '--with-arch' to override the default '-misa' (was: nvptx multilib setup)

2022-09-18 Thread Tom de Vries via Gcc-patches

On 8/6/22 21:20, Thomas Schwinge wrote:

Hi Tom!



Hi Thomas,

thanks for doing this.

Series approved.

As I mentioned, I'm not completely happy with the multilib name, but I 
don't think it makes sense to post-pone approval for this.


Thanks,
- Tom


Ping.


Grüße
  Thomas


On 2022-07-27T17:48:58+0200, I wrote:

Hi Tom!

Ping.


Grüße
  Thomas


On 2022-07-20T14:46:03+0200, I wrote:

Hi Tom!

Ping.


Grüße
  Thomas


On 2022-07-13T10:42:44+0200, I wrote:

Hi Tom!

Ping.


Grüße
  Thomas


On 2022-07-05T16:59:23+0200, I wrote:

Hi Tom!

Ping.


Grüße
  Thomas


On 2022-06-15T23:18:10+0200, I wrote:

Hi Tom!

On 2022-05-13T16:20:14+0200, I wrote:

On 2022-02-04T13:09:29+0100, Tom de Vries via Gcc  wrote:

On 2/4/22 08:21, Thomas Schwinge wrote:

On 2022-02-03T13:35:55+, "vries at gcc dot gnu.org via Gcc-bugs" 
 wrote:

I've tested this using (recommended) driver 470.94 on boards:



while iterating over dimensions { -mptx=3.1 , -mptx=6.3 } x { GOMP_NVPTX_JIT=-O0, 
 }.


Do you use separate (nvptx-none offload target only?) builds for
different '-mptx' variants (likewise: '-misa'), or have you hacked up the
multilib configuration?


Neither, I'm using --target_board=unix/foffload= for that.


ACK, I see.  So these flags then only affect GCC/nvptx code generation
for the actual user code (here: GCC libgomp test cases), but for the
GCC/nvptx target libraries (such as: libc, libm, libgfortran, libgomp --
the latter especially relevant for OpenMP), it uses PTX code from one of
the two "pre-compiled" GCC/nvptx multilibs: default or '-mptx=3.1'.

Meaning, one can't just use such a flag for "completely building code"
for a specific configuration.  Random example,
'-foffload-options=nvptx-none=-march=sm_75': as GCC/nvptx target
libraries aren't being built for '-march=sm_75' multilib,
'-foffload-options=nvptx-none=-march=sm_75' uses the default multilib,
which isn't '-march=sm_75'.



   ('gcc/config/nvptx/t-nvptx:MULTILIB_OPTIONS'

etc., I suppose?)  Should we add a few representative configurations to
be built by default?  And/or, should we have a way to 'configure' per
user needs (I suppose: '--with-multilib-list=[...]', as supported for a
few other targets?)?  (I see there's also a new
'--with-multilib-generator=[...]', haven't looked in detail.)  No matter
which way: again, combinatorial explosion is a problem, of course...


As far as I know, the gcc build doesn't finish when switching default to
higher than sm_35, so there's little point to go to a multilib setup at
this point.  But once we fix that, we could reconsider, otherwise,
things are likely to regress again.


As far as I remember, several issues have been fixed.  Still waiting for
Roger's "middle-end: Support ABIs that pass FP values as wider integers"
or something similar, but that PR104489 issue is being worked around by
"Limit HFmode support to mexperimental", if I got that right.

Now I'm not suggesting we should now enable all or any random GCC/nvptx
multilibs, to get all these variants of GCC/nvptx target libraries built;
especially also given that GCC/nvptx code generation currently doesn't
make too much use of the new capabilities.

However, we do have a specific request that a customer would like to be
able to change at GCC 'configure' time the GCC/nvptx default multilib
(including that being used for building corresponding GCC/nvptx target
libraries).

Per 'gcc/doc/install.texi', I do see that some GCC targets allow for
GCC 'configure'-time '--with-multilib-list=[...]', or
'--with-multilib-generator=[...]', and I suppose we could be doing
something similar?  But before starting implementing, I'd like your
input, as you'll be the one to approve in the end.  And/or, maybe you've
already made up your own ideas about that?


So, instead of "random GCC/nvptx multilib configuration" (last
paragraph), I've come up with a way to implement our customer's request
(second last paragraph): 'configure' GCC/nvptx '--with-arch=sm_70'.

I think I've implemented this in a way so that "random GCC/nvptx multilib
configuration" may eventually be implemented on top of that.  For easy
review/testing I've split my changes into three commits, see attached
"nvptx: Make default '-misa=sm_30' explicit",
"nvptx: Introduce dummy multilib option for default '-misa=sm_30'",
"nvptx: Allow '--with-arch' to override the default '-misa'".

To the best of my knowledge, the first two patches do not change any
user-visible behavior (I generally 'diff'ed target libraries, and
compared a good number of 'gcc -print-multi-directory [flags]'), and
likewise with the third patch, given implicit (default) or explicit
'--with-arch=sm_30', and that with '--with-arch=sm_70', for example, the
'-misa=sm_70' multilib variants are used for implicit (default) or
explicit '-misa=sm_70' or higher, and the '-misa=sm_30' multilib variants
are used for explicit lower '-misa'.

What do you think, OK to push to master branch?


Grüße
  Thomas



-
Siemens Electronic Design A

Re: [PATCH v4] eliminate mutex in fast path of __register_frame

2022-09-18 Thread Dimitar Dimitrov
On Fri, Sep 16, 2022 at 12:19:36PM +0200, Thomas Neumann via Gcc-patches wrote:
> The __register_frame/__deregister_frame functions are used to register
> unwinding frames from JITed code in a sorted list. That list itself
> is protected by object_mutex, which leads to terrible performance
> in multi-threaded code and is somewhat expensive even if single-threaded.
> There was already a fast-path that avoided taking the mutex if no
> frame was registered at all.
> 
> This commit eliminates both the mutex and the sorted list from
> the atomic fast path, and replaces it with a btree that uses
> optimistic lock coupling during lookup. This allows for fully parallel
> unwinding and is essential to scale exception handling to large
> core counts.
> 
> Changes since v3:
> - Avoid code duplication by adding query mode to classify_object_over_fdes
> - Adjust all comments as requested
> 
> libgcc/ChangeLog:
> 
> * unwind-dw2-fde.c (release_registered_frames): Cleanup at shutdown.
> (__register_frame_info_table_bases): Use btree in atomic fast path.
> (__deregister_frame_info_bases): Likewise.
> (_Unwind_Find_FDE): Likewise.
> (base_from_object): Make parameter const.
> (classify_object_over_fdes): Add query-only mode.
> (get_pc_range): Compute PC range for lookup.
> * unwind-dw2-fde.h (last_fde): Make parameter const.
> * unwind-dw2-btree.h: New file.
> ---
>  libgcc/unwind-dw2-btree.h | 953 ++
>  libgcc/unwind-dw2-fde.c   | 195 ++--
>  libgcc/unwind-dw2-fde.h   |   2 +-
>  3 files changed, 1098 insertions(+), 52 deletions(-)
>  create mode 100644 libgcc/unwind-dw2-btree.h
> 

Hi Thomas,

This patch broke avr and pru-elf cross builds:
  gcc/libgcc/unwind-dw2-fde.c:680:28: error: unknown type name ‘uintptr_t’
  680 |uintptr_t *range)

Should uintptr_t be replaced with __UINTPTR_TYPE__? Such change fixes the
above broken builds for me. But I'm not sure how valid it is for that
part of libgcc.

Other embedded targets like arm-none-eabi are not broken because they
overwrite LIB2ADDEH, and consequently unwind-dw2-fde.c is not built for
them.

Regards,
Dimitar


[PATCH] Improve sorry message for -fzero-call-used-regs

2022-09-18 Thread Torbjörn SVENSSON via Gcc-patches
When the -fzero-call-used-regs command line option is used with an
unsupported value, indicate that it's a value problem instead of an
option problem.

Without the patch, the error is:
In file included from gcc/testsuite/c-c++-common/zero-scratch-regs-8.c:5:
gcc/testsuite/c-c++-common/zero-scratch-regs-1.c: In function 'foo':
gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1: sorry, unimplemented: 
'-fzero-call-used-regs' not supported on this target
   10 | }
  | ^

With the patch, the error would be like this:
 In file included from gcc/testsuite/c-c++-common/zero-scratch-regs-8.c:5:
gcc/testsuite/c-c++-common/zero-scratch-regs-1.c: In function 'foo':
gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1: sorry, unimplemented: 
Argument 'all-arg' is not supported for '-fzero-call-used-regs' on this target
   10 | }
  | ^

2022-09-18  Torbjörn SVENSSON  

gcc/ChangeLog:

* targhooks.cc (default_zero_call_used_regs): Improve sorry
message.

Signed-off-by: Torbjörn SVENSSON  
---
 gcc/targhooks.cc | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index b15ae19bcb6..8bfbc1d18f6 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -93,6 +93,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple.h"
 #include "cfgloop.h"
 #include "tree-vectorizer.h"
+#include "options.h"
 
 bool
 default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
@@ -1181,9 +1182,21 @@ default_zero_call_used_regs (HARD_REG_SET 
need_zeroed_hardregs)
   static bool issued_error;
   if (!issued_error)
{
+ const char *name = NULL;
+ for (unsigned int i = 0; zero_call_used_regs_opts[i].name != NULL;
+  ++i)
+   if (flag_zero_call_used_regs == zero_call_used_regs_opts[i].flag)
+ {
+   name = zero_call_used_regs_opts[i].name;
+   break;
+ }
+
+ if (!name)
+   name = "";
+
  issued_error = true;
- sorry ("%qs not supported on this target",
-"-fzero-call-used-regs");
+ sorry ("Argument %qs is not supported for %qs on this target",
+name, "-fzero-call-used-regs");
}
 }
 
-- 
2.25.1



Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Mikael Morin

Le 18/09/2022 à 08:12, Richard Biener a écrit :

On Sat, Sep 17, 2022 at 9:33 PM Mikael Morin  wrote:


Le 17/09/2022 à 19:03, Thomas Koenig via Fortran a écrit :


I have a concern about this part, though.  My understanding at the
time was that it is not possible to clobber an individual array
element, but that this clobbers anything off the pointer that this
is based on.


Well, we need the middle-end guys to give a definitive answer on this
topic, but I think it would be a very penalizing limitation if that was
the case.  I have assumed that the clobber spanned the value it was
applied on, neither more nor less, so just the array element in case of
array elements.


There is IL verification that the LHS of a CLOBBER is either
a declaration or a pointer dereference, no array or component
selection is allowed there.  Now, nothing should go wrong here,
but we might eventually just drop those CLOBBERs or ICE if
we frontend hands us an "invalid" one.



Obviously I have assumed too much here; it's probably best to drop this 
patch.
It is unfortunate as there is some desirable behavior within reach here. 
 The test shows that the patch permits the elimination of a useless 
store.  And IL verification doesn't seem that upset with it by the way.
Does *(&a[1]) count as a pointer dereference?  Even in the original dump 
it is already simplified to a straight a[1].


Re: [PATCH v4] eliminate mutex in fast path of __register_frame

2022-09-18 Thread Thomas Neumann via Gcc-patches

Hi Dimitar,


This patch broke avr and pru-elf cross builds:
   gcc/libgcc/unwind-dw2-fde.c:680:28: error: unknown type name ‘uintptr_t’
   680 |uintptr_t *range)

Should uintptr_t be replaced with __UINTPTR_TYPE__? Such change fixes the
above broken builds for me. But I'm not sure how valid it is for that
part of libgcc.


thanks for notifying me, I was not aware that uintptr_t is not available 
on all platforms. After looking at the existing code I think 
__UINTPTR_TYPE__ is the correct substitute. I will do some more testing 
and commit an s/uintptr_t/__UINTPTR_TYPE__/ patch soon. (I will probably 
use a typedef, as seen in generic-morestack.c, to avoid uglifying the code).


Best

Thomas


Re: [PATCH v4] eliminate mutex in fast path of __register_frame

2022-09-18 Thread Thomas Neumann via Gcc-patches

Hi Dimitar,


This patch broke avr and pru-elf cross builds:
   gcc/libgcc/unwind-dw2-fde.c:680:28: error: unknown type name ‘uintptr_t’
   680 |uintptr_t *range)

Should uintptr_t be replaced with __UINTPTR_TYPE__? Such change fixes the
above broken builds for me. But I'm not sure how valid it is for that
part of libgcc.


I have fixed that by using a typedef for __UINTPTR_TYPE__.

Best

Thomas


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Thomas Koenig via Gcc-patches



On 18.09.22 11:10, Mikael Morin wrote:

Le 18/09/2022 à 08:12, Richard Biener a écrit :
On Sat, Sep 17, 2022 at 9:33 PM Mikael Morin  
wrote:


Le 17/09/2022 à 19:03, Thomas Koenig via Fortran a écrit :


I have a concern about this part, though.  My understanding at the
time was that it is not possible to clobber an individual array
element, but that this clobbers anything off the pointer that this
is based on.


Well, we need the middle-end guys to give a definitive answer on this
topic, but I think it would be a very penalizing limitation if that was
the case.  I have assumed that the clobber spanned the value it was
applied on, neither more nor less, so just the array element in case of
array elements.


There is IL verification that the LHS of a CLOBBER is either
a declaration or a pointer dereference, no array or component
selection is allowed there.  Now, nothing should go wrong here,
but we might eventually just drop those CLOBBERs or ICE if
we frontend hands us an "invalid" one.



Obviously I have assumed too much here; it's probably best to drop this 
patch.


Probably, yes.

It is unfortunate as there is some desirable behavior within reach here. 


I think some of the desired behavior can still be salvaged.  For
example, for

  subroutine foo(a,n)
integer :: n
integer, dimension(n), intent(in) :: n

...

  subroutine bar(a)
integer, intent(out) :: a

...

  integer :: a(3)

  call foo(a,3)
  call foo(a(1),3)

clobbers for the whole array can still be generated, but not for

  call foo(a(2),2)

so one would have to look at the lower bound.

For this case, it would be helpful to clobber a range a(2:), but that
is a wishlist item for the future ;-)

What is unsafe, currently, is

  call bar(a(1))

Best regards

Thomas


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Richard Biener via Gcc-patches



> Am 18.09.2022 um 11:10 schrieb Mikael Morin :
> 
> Le 18/09/2022 à 08:12, Richard Biener a écrit :
>>> On Sat, Sep 17, 2022 at 9:33 PM Mikael Morin  wrote:
>>> 
>>> Le 17/09/2022 à 19:03, Thomas Koenig via Fortran a écrit :
 
 I have a concern about this part, though.  My understanding at the
 time was that it is not possible to clobber an individual array
 element, but that this clobbers anything off the pointer that this
 is based on.
 
>>> Well, we need the middle-end guys to give a definitive answer on this
>>> topic, but I think it would be a very penalizing limitation if that was
>>> the case.  I have assumed that the clobber spanned the value it was
>>> applied on, neither more nor less, so just the array element in case of
>>> array elements.
>> There is IL verification that the LHS of a CLOBBER is either
>> a declaration or a pointer dereference, no array or component
>> selection is allowed there.  Now, nothing should go wrong here,
>> but we might eventually just drop those CLOBBERs or ICE if
>> we frontend hands us an "invalid" one.
> 
> Obviously I have assumed too much here; it's probably best to drop this patch.
> It is unfortunate as there is some desirable behavior within reach here.  The 
> test shows that the patch permits the elimination of a useless store.  And IL 
> verification doesn't seem that upset with it by the way.
> Does *(&a[1]) count as a pointer dereference?

Yes, technically.

>  Even in the original dump it is already simplified to a straight a[1].

But this not anymore.  The check can probably be relaxed, it stems from the 
dual purpose of CLOBBER.

Richard 

[patch, avr] Fix PR target/99184: Wrong cast from double to 16-bit and 32-bit ints.

2022-09-18 Thread Georg Johann Lay

Hello,

this patch fixed PR target/99184 which incorrectly rounded during 64-bit 
(long) double to 16-bit and 32-bit integers.


The patch just removes the respective roundings from 
libf7-asm.sx::to_integer and ::to_unsigned.  Luckily, LibF7 does nowhere 
use respective functions internally, the only user is in libf7.c::f7_exp


which reads

  f7_round (qq, qq);
  int16_t q = f7_get_s16 (qq);

so that f7_get_s16() operates on an already rounded value, and therefore 
this code works unaltered with or without rounding in to_integer.


The patch applies to directory

./libgcc/config/avr/libf7/

and is the same for all GCC versions v10+.

Please someone with write permissions commit it to trunk and backport to 
v12, v11, and v10 as it is a wrong-code issue.


The patch will fit without problems (except for ChangeLog) because there 
is no traffic on that folder.



Thanks, Johann


libgcc/config/avr/libf7/
PR target/99184
Remove rounding from double to [u]int16 and [u]int32 casts.

* libf7-asm.sx (to_integer, to_unsigned): Don't round 16-bit
and 32-bit integers.

diff --git a/ChangeLog b/ChangeLog
index 7e06f52..3ec0082 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+
+	PR target/99184
+	Remove rounding from double to [u]int16 and [u]int32 casts.
+
+	* libf7-asm.sx (to_integer, to_unsigned): Don't round 16-bit
+	and 32-bit integers.
+
 2022-04-21  Release Manager
 
 	* GCC 11.3.0 released.
diff --git a/libf7-asm.sx b/libf7-asm.sx
index 7629e23..9d701f2 100644
--- a/libf7-asm.sx
+++ b/libf7-asm.sx
@@ -601,9 +601,6 @@ DEFUN to_integer
 tst C6
 brmi.Lsaturate.T;   > INTxx_MAX  =>  saturate
 
-rcall   .Lround
-brmi.Lsaturate.T;   > INTxx_MAX  =>  saturate
-
 brtc 9f ;   >= 0 =>  return
 sbrcMask,   5
 .global __negdi2
@@ -658,30 +655,6 @@ DEFUN to_integer
 .global __clr_8
 XJMP__clr_8
 
-.Lround:
-;; C6.7 is known to be 0 here.
-;; Return N = 1 iff we have to saturate.
-cpi Mask,   0xf
-breq .Lround16
-cpi Mask,   0x1f
-breq .Lround32
-
-;; For now, no rounding in the 64-bit case.  This rounding
-;; would have to be integrated into the right-shift.
-cln
-ret
-
-.Lround32:
-rol C2
-adc C3, ZERO
-adc C4, ZERO
-rjmp 2f
-
-.Lround16:
-rol C4
-2:  adc C5, ZERO
-adc C6, ZERO
-ret
 ENDF to_integer
 #endif /* F7MOD_to_integer_ */
 
@@ -725,29 +698,6 @@ DEFUN to_unsigned
 clr CA
 F7call  lshrdi3
 POP r16
-
-;; Rounding
-;; ??? C6.7 is known to be 0 here.
-cpi Mask,   0xf
-breq .Lround16
-cpi Mask,   0x1f
-breq .Lround32
-
-;; For now, no rounding in the 64-bit case.  This rounding
-;; would have to be integrated into the right-shift.
-ret
-
-.Lround32:
-rol C2
-adc C3, ZERO
-adc C4, ZERO
-rjmp 2f
-
-.Lround16:
-rol C4
-2:  adc C5, ZERO
-adc C6, ZERO
-brcs.Lset_0x; Rounding overflow  =>  saturate
 ret
 
 .Lset_0x:


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Harald Anlauf via Gcc-patches

On 18.09.22 12:23, Thomas Koenig via Gcc-patches wrote:


On 18.09.22 11:10, Mikael Morin wrote:

It is unfortunate as there is some desirable behavior within reach here.


I think some of the desired behavior can still be salvaged.  For
example, for

   subroutine foo(a,n)
     integer :: n
     integer, dimension(n), intent(in) :: n


integer, dimension(n), intent(out) :: a  ! ?


...

   subroutine bar(a)
     integer, intent(out) :: a

...

   integer :: a(3)

   call foo(a,3)
   call foo(a(1),3)

clobbers for the whole array can still be generated, but not for

   call foo(a(2),2)

so one would have to look at the lower bound.

For this case, it would be helpful to clobber a range a(2:), but that
is a wishlist item for the future ;-)

What is unsafe, currently, is

   call bar(a(1))


We'll need a good coverage by testcases for the different handling
required for assumed shape / assumed size / explicit size dummies
to avoid new regressions.  Assumed shape will be on the easy side,
while assumed size likely needs to be excluded for clobbering.

Harald



Re: [PATCH v3 05/11] OpenMP: push attaches to end of clause list in "target" regions

2022-09-18 Thread Julian Brown
On Wed, 14 Sep 2022 14:44:54 +0200
Jakub Jelinek  wrote:

> On Tue, Sep 13, 2022 at 02:03:15PM -0700, Julian Brown wrote:
> > This patch moves GOMP_MAP_ATTACH{_ZERO_LENGTH_ARRAY_SECTION} nodes
> > to the end of the clause list, for offload regions.  This ensures
> > that when we do the attach operation, both the "attachment point"
> > and the target region have both already been mapped on the target.
> > This avoids a pathological case that can otherwise happen with
> > struct sibling-list handling.
> > 
> > 2022-09-13  Julian Brown  
> > 
> > gcc/
> > * gimplify.cc (omp_segregate_mapping_groups): Update
> > comment. (omp_push_attaches_to_end): New function.
> > (gimplify_scan_omp_clauses): Use omp_push_attaches_to_end
> > for offloaded regions.  
> 
> Shouldn't this be done at the end of gimplify_adjust_omp_clauses?
> I mean, can't further attach clauses appear because of declare mapper
> for implicitly mapped variables?
> Other than that, it is yet another walk of the whole clause list, so
> would be nicer if it could be done in an existing walk over the
> clauses or at least have a flag whether there are any such clauses
> present and do it only in that case.
> If it could be done in the main gimplify_adjust_omp_clauses loop,
> nice, if it can appear also during
>   splay_tree_foreach (ctx->variables, gimplify_adjust_omp_clauses_1,
> &data); that isn't the case.

I don't think any ATTACH clauses can appear during
the gimplify_adjust_omp_clause_1 walk. So, how about this?

Thanks,

Julian
>From d583f3315ac8cda58bbfce8c8574d0adc5283b00 Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Wed, 7 Sep 2022 21:45:07 +
Subject: [PATCH 1/2] OpenMP: Push attaches to end of clause list in "target"
 regions

This patch moves GOMP_MAP_ATTACH{_ZERO_LENGTH_ARRAY_SECTION} nodes to
the end of the clause list, for offload regions.  This ensures that when
we do the attach operation, both the "attachment point" and the target
region have both already been mapped on the target.  This avoids a
pathological case that can otherwise happen with struct sibling-list
handling.

This version of the patch moves the attach-node motion to
gimplify_adjust_omp_clauses.

2022-09-15  Julian Brown  

gcc/
* gimplify.cc (omp_segregate_mapping_groups): Update comment.
(gimplify_adjust_omp_clauses): Move ATTACH and
	ATTACH_ZERO_LENGTH_ARRAY_SECTION nodes to the end of the clause list
	for offloaded OpenMP regions.

gcc/testsuite/
	* g++.dg/gomp/target-lambda-1.C: Adjust expected scan output.
---
 gcc/gimplify.cc | 37 -
 gcc/testsuite/g++.dg/gomp/target-lambda-1.C |  2 +-
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 2ae0c8cb250..4d032c6bf06 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -9639,7 +9639,9 @@ omp_tsort_mapping_groups (vec *groups,
 /* Split INLIST into two parts, moving groups corresponding to
ALLOC/RELEASE/DELETE mappings to one list, and other mappings to another.
The former list is then appended to the latter.  Each sub-list retains the
-   order of the original list.  */
+   order of the original list.
+   Note that ATTACH nodes are later moved to the end of the list in
+   gimplify_adjust_omp_clauses, for target regions.  */
 
 static omp_mapping_group *
 omp_segregate_mapping_groups (omp_mapping_group *inlist)
@@ -12411,10 +12413,15 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 	*list_p = c2;
 	  }
 }
+
+  tree attach_list = NULL_TREE;
+  tree *attach_tail = &attach_list;
+
   while ((c = *list_p) != NULL)
 {
   splay_tree_node n;
   bool remove = false;
+  bool move_attach = false;
 
   switch (OMP_CLAUSE_CODE (c))
 	{
@@ -12576,6 +12583,19 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 	  remove = true;
 	  break;
 	}
+	  /* If we have a target region, we can push all the attaches to the
+	 end of the list (we may have standalone "attach" operations
+	 synthesized for GOMP_MAP_STRUCT nodes that must be processed after
+	 the attachment point AND the pointed-to block have been mapped).
+	 If we have something else, e.g. "enter data", we need to keep
+	 "attach" nodes together with the previous node they attach to so
+	 that separate "exit data" operations work properly (see
+	 libgomp/target.c).  */
+	  if ((ctx->region_type & ORT_TARGET) != 0
+	  && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+		  || (OMP_CLAUSE_MAP_KIND (c)
+		  == GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION)))
+	move_attach = true;
 	  decl = OMP_CLAUSE_DECL (c);
 	  /* Data clauses associated with reductions must be
 	 compatible with present_or_copy.  Warn and adjust the clause
@@ -12890,10 +12910,25 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 
   if (remove)
 	*list_p = OMP_CLAUSE_CHAIN (c);
+  else if (mo

Re: [PATCH v3 05/11] OpenMP: push attaches to end of clause list in "target" regions

2022-09-18 Thread Jakub Jelinek via Gcc-patches
On Sun, Sep 18, 2022 at 08:10:00PM +0100, Julian Brown wrote:
> I don't think any ATTACH clauses can appear during
> the gimplify_adjust_omp_clause_1 walk. So, how about this?

LGTM, thanks.

Jakub



Re: [PATCH v3 06/11] OpenMP: Pointers and member mappings

2022-09-18 Thread Julian Brown
On Wed, 14 Sep 2022 14:53:54 +0200
Jakub Jelinek  wrote:

> On Tue, Sep 13, 2022 at 02:03:16PM -0700, Julian Brown wrote:
> > @@ -3440,6 +3437,50 @@ gfc_trans_omp_clauses (stmtblock_t *block,
> > gfc_omp_clauses *clauses, {
> >   if (pointer || (openacc && allocatable))
> > {
> > + gfc_omp_namelist *n2
> > +   = openacc ? NULL :
> > clauses->lists[OMP_LIST_MAP]; +
> > + /* If the last reference is a pointer to
> > a derived
> > +type ("foo%dt_ptr"), check if any
> > subcomponents
> > +of the same derived type member are
> > being mapped
> > +elsewhere in the clause list
> > ("foo%dt_ptr%x",
> > +etc.).  If we have such subcomponent
> > mappings,
> > +we only create an ALLOC node for the
> > pointer
> > +itself, and inhibit mapping the whole
> > derived
> > +type.  */
> > +
> > + for (; n2 != NULL; n2 = n2->next)
> > +   {
> > + if (n == n2 || !n2->expr)
> > +   continue;
> > +
> > + int dep
> > +   = gfc_dep_resolver (n->expr->ref,
> > n2->expr->ref,
> > +   NULL, true);
> > + if (dep == 0)
> > +   continue;  
> 
> Isn't this and the other loop quadratic compile time in number of
> clauses? Could it be done linearly through some perhaps lazily built
> hash table or something similar?

How about this -- a hash table is used to split up the list by the root
symbol the first time we try to resolve dependencies. This arguably
doesn't get rid of the potential for quadratic behaviour *completely*,
but I think the input where it would still be troublesome would be
rather pathological.

I'm not sure how to do better without reimplementing gfc_dep_resolver
or implementing general hashing for Fortran FE expression/symbol nodes,
which isn't there at the moment, so far as I can tell.

I noticed during testing that one of the new added tests for this patch
doesn't actually work until the next patch is in, i.e.:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601558.html

So possibly the patches should be committed squashed together. I
XFAILed the test for now.

Re-tested with offloading to NVPTX. OK?

Thanks,

Julian
>From f90db664b6591e4a6525de4eb9ec15ad18d99688 Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Tue, 31 May 2022 18:39:00 +
Subject: [PATCH 2/2] OpenMP: Pointers and member mappings

Implementing the "omp declare mapper" functionality, I noticed some
cases where handling of derived type members that are pointers doesn't
seem to be quite right. At present, a type such as this:

  type T
  integer, pointer, dimension(:) :: arrptr
  end type T

  type(T) :: tvar
  [...]
  !$omp target map(tofrom: tvar%arrptr)

will be mapped using three mapping nodes:

  GOMP_MAP_TO tvar%arrptr   (the descriptor)
  GOMP_MAP_TOFROM *tvar%arrptr%data (the actual array data)
  GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data  (a pointer to the array data)

This follows OMP 5.0, 2.19.7.1 "map Clause":

  "If a list item in a map clause is an associated pointer and the
   pointer is not the base pointer of another list item in a map clause
   on the same construct, then it is treated as if its pointer target
   is implicitly mapped in the same clause. For the purposes of the map
   clause, the mapped pointer target is treated as if its base pointer
   is the associated pointer."

However, we can also write this:

  map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8))

and then instead we should follow:

  "If the structure sibling list item is a pointer then it is treated
   as if its association status is undefined, unless it appears as
   the base pointer of another list item in a map clause on the same
   construct."

But, that's not implemented quite right at the moment (and completely
breaks once we introduce declare mappers), because we still map the "to:
tvar%arrptr" as the descriptor and the entire array, then we map the
"tvar%arrptr(3:8)" part using the descriptor (again!) and the array slice.

The solution is to detect when we're mapping a smaller part of the array
(or a subcomponent) on the same directive, and only map the descriptor
in that case. So we get mappings like this instead:

  map(to: tvar%arrptr)   -->
  GOMP_MAP_ALLOC  tvar%arrptr  (the descriptor)

  map(tofrom: tvar%arrptr(3:8)   -->
  GOMP_MAP_TOFROM tvar%arrptr%data(3) (size 8-3+1, etc.)
  GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data (bias 3, etc.)

This version of the patch builds a hash table separating candidate
clauses for dependency checking by root symbol, to alleviate potential
quadratic behaviour.

2022-09-15  Julian Brown  

gcc/fortr

[PATCH v2 1/9] fortran: Move the clobber generation code

2022-09-18 Thread Mikael Morin via Gcc-patches
This change inlines the clobber generation code from
gfc_conv_expr_reference to the single caller from where the add_clobber
flag can be true, and removes the add_clobber argument.

What motivates this is the standard making the procedure call a cause
for a variable to become undefined, which translates to a clobber
generation, so clobber generation should be closely related to procedure
call generation, whereas it is rather orthogonal to variable reference
generation.  Thus the generation of the clobber feels more appropriate
in gfc_conv_procedure_call than in gfc_conv_expr_reference.

Behaviour remains unchanged.

gcc/fortran/ChangeLog:

* trans.h (gfc_conv_expr_reference): Remove add_clobber
argument.
* trans-expr.cc (gfc_conv_expr_reference): Ditto. Inline code
depending on add_clobber and conditions controlling it ...
(gfc_conv_procedure_call): ... to here.
---
 gcc/fortran/trans-expr.cc | 58 +--
 gcc/fortran/trans.h   |  3 +-
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 850007fd2e1..7902b941c2d 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6395,7 +6395,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
&& e->symtree->n.sym->attr.pointer))
&& fsym && fsym->attr.target)
/* Make sure the function only gets called once.  */
-   gfc_conv_expr_reference (&parmse, e, false);
+   gfc_conv_expr_reference (&parmse, e);
  else if (e->expr_type == EXPR_FUNCTION
   && e->symtree->n.sym->result
   && e->symtree->n.sym->result != e->symtree->n.sym
@@ -6502,22 +6502,36 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
}
  else
{
- bool add_clobber;
- add_clobber = fsym && fsym->attr.intent == INTENT_OUT
-   && !fsym->attr.allocatable && !fsym->attr.pointer
-   && e->symtree && e->symtree->n.sym
-   && !e->symtree->n.sym->attr.dimension
-   && !e->symtree->n.sym->attr.pointer
-   && !e->symtree->n.sym->attr.allocatable
-   /* See PR 41453.  */
-   && !e->symtree->n.sym->attr.dummy
-   /* FIXME - PR 87395 and PR 41453  */
-   && e->symtree->n.sym->attr.save == SAVE_NONE
-   && !e->symtree->n.sym->attr.associate_var
-   && e->ts.type != BT_CHARACTER && e->ts.type != 
BT_DERIVED
-   && e->ts.type != BT_CLASS && !sym->attr.elemental;
+ gfc_conv_expr_reference (&parmse, e);
 
- gfc_conv_expr_reference (&parmse, e, add_clobber);
+ if (fsym
+ && fsym->attr.intent == INTENT_OUT
+ && !fsym->attr.allocatable
+ && !fsym->attr.pointer
+ && e->expr_type == EXPR_VARIABLE
+ && e->ref == NULL
+ && e->symtree
+ && e->symtree->n.sym
+ && !e->symtree->n.sym->attr.dimension
+ && !e->symtree->n.sym->attr.pointer
+ && !e->symtree->n.sym->attr.allocatable
+ /* See PR 41453.  */
+ && !e->symtree->n.sym->attr.dummy
+ /* FIXME - PR 87395 and PR 41453  */
+ && e->symtree->n.sym->attr.save == SAVE_NONE
+ && !e->symtree->n.sym->attr.associate_var
+ && e->ts.type != BT_CHARACTER
+ && e->ts.type != BT_DERIVED
+ && e->ts.type != BT_CLASS
+ && !sym->attr.elemental)
+   {
+ tree var;
+ /* FIXME: This fails if var is passed by reference, 
see PR
+41453.  */
+ var = e->symtree->n.sym->backend_decl;
+ tree clobber = build_clobber (TREE_TYPE (var));
+ gfc_add_modify (&se->pre, var, clobber);
+   }
}
  /* Catch base objects that are not variables.  */
  if (e->ts.type == BT_CLASS
@@ -9485,7 +9499,7 @@ gfc_conv_expr_type (gfc_se * se, gfc_expr * expr, tree 
type)
values only.  */
 
 void
-gfc_conv_expr_reference (gfc_se * se, gfc_expr * expr, bool add_clobber)
+gfc_conv_expr_reference (gfc_se * se, gfc_expr * expr)
 {
   gfc_ss *ss;
   tree var;
@@ -9525,16 +9539,6 @@ gfc_conv_expr_reference (gfc_se * se, gfc_expr * expr, 
bool add

[PATCH v2 7/9] fortran: Support clobbering of ASSOCIATE variables [PR87397]

2022-09-18 Thread Mikael Morin via Gcc-patches
This is in spirit a revert of:
r9-3051-gc109362313623d83fe0a5194bceaf994cf0c6ce0

That commit added a condition to avoid generating ICE with clobbers
of ASSOCIATE variables.
The test added at that point continues to pass if we remove that
condition now.

PR fortran/87397
PR fortran/41453

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Remove condition
disabling clobber generation for ASSOCIATE variables.
---
 gcc/fortran/trans-expr.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index d169df44a71..4491465c4d6 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6527,7 +6527,6 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  && !e->symtree->n.sym->attr.dimension
  && !e->symtree->n.sym->attr.pointer
  && !e->symtree->n.sym->attr.allocatable
- && !e->symtree->n.sym->attr.associate_var
  && e->ts.type != BT_CHARACTER
  && e->ts.type != BT_DERIVED
  && e->ts.type != BT_CLASS
-- 
2.35.1



[PATCH v2 5/9] fortran: Support clobbering of reference variables [PR41453]

2022-09-18 Thread Mikael Morin via Gcc-patches
This adds support for clobbering of variables passed by reference,
when the reference is forwarded to a subroutine as actual argument
whose associated dummy has the INTENT(OUT) attribute.
This was explicitly disabled and enabling it seems to work, as
demonstrated by the new testcase.

PR fortran/41453

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Remove condition
disabling clobber generation for dummy variables.  Remove
obsolete comment.

gcc/testsuite/ChangeLog:

* gfortran.dg/intent_optimize_6.f90: New test.
---
 gcc/fortran/trans-expr.cc |  4 ---
 .../gfortran.dg/intent_optimize_6.f90 | 34 +++
 2 files changed, 34 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_6.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 2301724729f..9b2832bdb26 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6527,8 +6527,6 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  && !e->symtree->n.sym->attr.dimension
  && !e->symtree->n.sym->attr.pointer
  && !e->symtree->n.sym->attr.allocatable
- /* See PR 41453.  */
- && !e->symtree->n.sym->attr.dummy
  /* FIXME - PR 87395 and PR 41453  */
  && e->symtree->n.sym->attr.save == SAVE_NONE
  && !e->symtree->n.sym->attr.associate_var
@@ -6538,8 +6536,6 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  && !sym->attr.elemental)
{
  tree var;
- /* FIXME: This fails if var is passed by reference, 
see PR
-41453.  */
  var = build_fold_indirect_ref_loc (input_location,
 parmse.expr);
  tree clobber = build_clobber (TREE_TYPE (var));
diff --git a/gcc/testsuite/gfortran.dg/intent_optimize_6.f90 
b/gcc/testsuite/gfortran.dg/intent_optimize_6.f90
new file mode 100644
index 000..72fec3db583
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/intent_optimize_6.f90
@@ -0,0 +1,34 @@
+! { dg-do run }
+! { dg-additional-options "-fno-inline -fno-ipa-modref -fdump-tree-optimized 
-fdump-tree-original" }
+!
+! PR fortran/41453
+! Check that the INTENT(OUT) attribute causes one clobber to be emitted in
+! the caller before each call to FOO in the *.original dump, and the
+! initialization constant to be optimized away in the *.optimized dump,
+! in the case of an argument passed by reference to the caller.
+
+module x
+implicit none
+contains
+  subroutine foo(a)
+integer(kind=4), intent(out) :: a
+a = 42
+  end subroutine foo
+  subroutine bar(b)
+integer(kind=4) :: b
+b = 123456789
+call foo(b)
+  end subroutine bar
+end module x
+
+program main
+  use x
+  implicit none
+  integer(kind=4) :: c
+  call bar(c)
+  if (c /= 42) stop 1
+end program main
+
+! { dg-final { scan-tree-dump-times "CLOBBER" 1 "original" } }
+! { dg-final { scan-tree-dump "\\*\\\(integer\\\(kind=4\\\) \\*\\\) b = 
{CLOBBER};" "original" } }
+! { dg-final { scan-tree-dump-not "123456789" "optimized" { target 
__OPTIMIZE__ } } }
-- 
2.35.1



[PATCH v2 0/9] fortran: clobber fixes [PR41453]

2022-09-18 Thread Mikael Morin via Gcc-patches
Hello,

this is the second version of a set of changes around the clobber we
generate in the caller before a procedure call, for each actual argument
whose associated dummy has the INTENT(OUT) attribute.

The clobbering of partial variables (array elements, structure components) 
proved to be unsupported by the middle-end optimizers, even if it seemed
to work in practice.  So this version just removes it.

v1 of the series was posted here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601713.html
https://gcc.gnu.org/pipermail/fortran/2022-September/058165.html

Changes from v1:
 - patch 9/10 (clobber subreferences) has been dropped.
 - patch 10/10 (now 9/9): The test has been adjusted because some checks
   were failing without the dropped patch.  Basically there are less
   clobbers generated.
 - patch 5: In the test, an explicit kind has been added to integers,
   so that the dump match is not dependent on the
   -fdefault-integer-8 option.
 - patches 3, 4, 5, 8, and 10/10 (now 9/9): The tests have been renamed
   so that they are numbered in increasing order.

The first patch is a refactoring moving the clobber generation in
gfc_conv_procedure_call where it feels more appropriate.
The second patch is a fix for the ICE originally motivating my work
on this topic.
The third patch is a fix for some wrong code issue discovered with an
earlier version of this series.
The following patches are gradual condition loosenings to enable clobber 
generation in more and more cases.

Each patch has been tested through an incremental bootstrap and a
partial testsuite run on fortran *intent* tests, and the whole lot has
been run through the full fortran regression testsuite.
OK for master?


Harald Anlauf (1):
  fortran: Support clobbering with implicit interfaces [PR105012]

Mikael Morin (8):
  fortran: Move the clobber generation code
  fortran: Fix invalid function decl clobber ICE [PR105012]
  fortran: Move clobbers after evaluation of all arguments [PR106817]
  fortran: Support clobbering of reference variables [PR41453]
  fortran: Support clobbering of SAVE variables [PR87395]
  fortran: Support clobbering of ASSOCIATE variables [PR87397]
  fortran: Support clobbering of allocatables and pointers [PR41453]
  fortran: Support clobbering of derived types [PR41453]

 gcc/fortran/trans-expr.cc | 81 ---
 gcc/fortran/trans.h   |  3 +-
 .../gfortran.dg/intent_optimize_4.f90 | 43 ++
 .../gfortran.dg/intent_optimize_5.f90 | 24 ++
 .../gfortran.dg/intent_optimize_6.f90 | 34 
 .../gfortran.dg/intent_optimize_7.f90 | 42 ++
 .../gfortran.dg/intent_optimize_8.f90 | 66 +++
 gcc/testsuite/gfortran.dg/intent_out_15.f90   | 27 +++
 8 files changed, 290 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_4.f90
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_5.f90
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_6.f90
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_7.f90
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_8.f90
 create mode 100644 gcc/testsuite/gfortran.dg/intent_out_15.f90

-- 
2.35.1



[PATCH v2 2/9] fortran: Fix invalid function decl clobber ICE [PR105012]

2022-09-18 Thread Mikael Morin via Gcc-patches
The fortran frontend, as result symbol for a function without
declared result symbol, uses the function symbol itself.  This caused
an invalid clobber of a function decl to be emitted, leading to an
ICE, whereas the intended behaviour was to clobber the function result
variable.  This change fixes the problem by getting the decl from the
just-retrieved variable reference after the call to
gfc_conv_expr_reference, instead of copying it from the frontend symbol.

PR fortran/105012

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Retrieve variable
from the just calculated variable reference.

gcc/testsuite/ChangeLog:

* gfortran.dg/intent_out_15.f90: New test.
---
 gcc/fortran/trans-expr.cc   |  3 ++-
 gcc/testsuite/gfortran.dg/intent_out_15.f90 | 27 +
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/intent_out_15.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 7902b941c2d..76c587e3d9f 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6528,7 +6528,8 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  tree var;
  /* FIXME: This fails if var is passed by reference, 
see PR
 41453.  */
- var = e->symtree->n.sym->backend_decl;
+ var = build_fold_indirect_ref_loc (input_location,
+parmse.expr);
  tree clobber = build_clobber (TREE_TYPE (var));
  gfc_add_modify (&se->pre, var, clobber);
}
diff --git a/gcc/testsuite/gfortran.dg/intent_out_15.f90 
b/gcc/testsuite/gfortran.dg/intent_out_15.f90
new file mode 100644
index 000..64334e6f038
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/intent_out_15.f90
@@ -0,0 +1,27 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/105012
+! The following case was triggering an ICE because of a clobber
+! on the DERFC function decl instead of its result.
+
+module error_function
+integer, parameter :: r8 = selected_real_kind(12) ! 8 byte real
+contains
+SUBROUTINE CALERF_r8(ARG, RESULT, JINT)
+   integer, parameter :: rk = r8
+   real(rk), intent(in)  :: arg
+   real(rk), intent(out) :: result
+   IF (Y .LE. THRESH) THEN
+   END IF
+end SUBROUTINE CALERF_r8
+FUNCTION DERFC(X)
+   integer, parameter :: rk = r8 ! 8 byte real
+   real(rk), intent(in) :: X
+   real(rk) :: DERFC
+   CALL CALERF_r8(X, DERFC, JINT)
+END FUNCTION DERFC
+end module error_function
+
+! { dg-final { scan-tree-dump-times "CLOBBER" 1 "original" } }
+! { dg-final { scan-tree-dump "__result_derfc = {CLOBBER};" "original" } }
-- 
2.35.1



[PATCH v2 9/9] fortran: Support clobbering of derived types [PR41453]

2022-09-18 Thread Mikael Morin via Gcc-patches
This is probably the most risky patch in the series.

A previous version of this patch allowing all exactly matching derived
types showed two regressions.  One of them uncovered PR106817 for which
I added a fix in this series, and for the other I have excluded
types with allocatable components from clobbering.

I have additionnally excluded finalizable types for similar reasons, and
parameterized derived type because they may not be constant-sized.

I hope we are safe for all the rest.

-- >8 --

This adds support for clobbering of non-polymorphic derived type
variables, when they are passed as actual argument whose associated
dummy has the INTENT(OUT) attribute.

We avoid to play with non-constant type sizes or class descriptors by
requiring that the types are derived (not class) and strictly matching,
and by excluding parameterized derived types.

Types that are used in the callee are also excluded: they are types with
allocatable components (the components will be deallocated), and
finalizable types or types with finalizable components (they will be
passed to finalization routines).

PR fortran/41453

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Allow strictly
matching derived types.

gcc/testsuite/ChangeLog:

* gfortran.dg/intent_optimize_8.f90: New test.
---
 gcc/fortran/trans-expr.cc | 18 -
 .../gfortran.dg/intent_optimize_8.f90 | 66 +++
 2 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_8.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index ae685157e22..16f14554db3 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6526,8 +6526,24 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  && e->symtree->n.sym
  && !e->symtree->n.sym->attr.dimension
  && e->ts.type != BT_CHARACTER
- && e->ts.type != BT_DERIVED
  && e->ts.type != BT_CLASS
+ && (e->ts.type != BT_DERIVED
+ || (dsym->ts.type == BT_DERIVED
+ && e->ts.u.derived == dsym->ts.u.derived
+ /* Types with allocatable components are
+excluded from clobbering because we need
+the unclobbered pointers to free the
+allocatable components in the callee.
+Same goes for finalizable types or types
+with finalizable components, we need to
+pass the unclobbered values to the
+finalization routines.
+For parameterized types, it's less clear
+but they may not have a constant size
+so better exclude them in any case.  */
+ && !e->ts.u.derived->attr.alloc_comp
+ && !e->ts.u.derived->attr.pdt_type
+ && !gfc_is_finalizable (e->ts.u.derived, 
NULL)))
  && !sym->attr.elemental)
{
  tree var;
diff --git a/gcc/testsuite/gfortran.dg/intent_optimize_8.f90 
b/gcc/testsuite/gfortran.dg/intent_optimize_8.f90
new file mode 100644
index 000..d8bc1bb3b7b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/intent_optimize_8.f90
@@ -0,0 +1,66 @@
+! { dg-do run }
+! { dg-additional-options "-fno-inline -fno-ipa-modref -fdump-tree-optimized 
-fdump-tree-original" }
+!
+! PR fortran/41453
+! Check that the INTENT(OUT) attribute causes in the case of non-polymorphic 
derived type arguments:
+!  - one clobber to be emitted in the caller before calls to FOO in the 
*.original dump,
+!  - no clobber to be emitted in the caller before calls to BAR in the 
*.original dump,
+!  - the initialization constants to be optimized away in the *.optimized dump.
+
+module x
+  implicit none
+  type :: t
+integer :: c
+  end type t
+  type, extends(t) :: u
+integer :: d
+  end type u
+contains
+  subroutine foo(a)
+type(t), intent(out) :: a
+a = t(42)
+  end subroutine foo
+  subroutine bar(b)
+class(t), intent(out) :: b
+b%c = 24
+  end subroutine bar
+end module x
+
+program main
+  use x
+  implicit none
+  type(t) :: tc
+  type(u) :: uc, ud
+  class(t), allocatable :: te, tf
+
+  tc = t(123456789)
+  call foo(tc)
+  if (tc%c /= 42) stop 1
+
+  uc = u(987654321, 0)
+  call foo(uc%t)
+  if (uc%c /= 42) stop 2
+  if (uc%d /= 0) stop 3
+
+  ud = u(11223344, 0)
+  call bar(ud)
+  if (ud%c /= 24) stop 4
+
+  te = t(55667788)
+  call foo(te)
+  if (te%c /= 42) stop 5
+
+  tf = t(99887766)
+  call bar(tf)
+  if (tf

[PATCH v2 8/9] fortran: Support clobbering of allocatables and pointers [PR41453]

2022-09-18 Thread Mikael Morin via Gcc-patches
This adds support for clobbering of allocatable and pointer scalar
variables passed as actual argument to a subroutine when the associated
dummy has the INTENT(OUT) attribute.
Support was explicitly disabled, but the clobber generation code seems
to support it well, as demonstrated by the newly added testcase.

PR fortran/41453

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Remove conditions
on ALLOCATABLE and POINTER attributes guarding clobber
generation.

gcc/testsuite/ChangeLog:

* gfortran.dg/intent_optimize_7.f90: New test.
---
 gcc/fortran/trans-expr.cc |  2 -
 .../gfortran.dg/intent_optimize_7.f90 | 42 +++
 2 files changed, 42 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_7.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 4491465c4d6..ae685157e22 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6525,8 +6525,6 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  && e->symtree
  && e->symtree->n.sym
  && !e->symtree->n.sym->attr.dimension
- && !e->symtree->n.sym->attr.pointer
- && !e->symtree->n.sym->attr.allocatable
  && e->ts.type != BT_CHARACTER
  && e->ts.type != BT_DERIVED
  && e->ts.type != BT_CLASS
diff --git a/gcc/testsuite/gfortran.dg/intent_optimize_7.f90 
b/gcc/testsuite/gfortran.dg/intent_optimize_7.f90
new file mode 100644
index 000..0146dff4e20
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/intent_optimize_7.f90
@@ -0,0 +1,42 @@
+! { dg-do run }
+! { dg-additional-options "-fno-inline -fno-ipa-modref -fdump-tree-optimized 
-fdump-tree-original" }
+!
+! PR fortran/41453
+! Check that the INTENT(OUT) attribute causes one clobber to be emitted in
+! the caller before each call to FOO in the *.original dump, and the
+! initialization constants to be optimized away in the *.optimized dump,
+! in the case of scalar allocatables and pointers.
+
+module x
+implicit none
+contains
+  subroutine foo(a)
+integer, intent(out) :: a
+a = 42
+  end subroutine foo
+end module x
+
+program main
+  use x
+  implicit none
+  integer, allocatable :: ca
+  integer, target :: ct
+  integer, pointer :: cp
+
+  allocate(ca)
+  ca = 123456789
+  call foo(ca)
+  if (ca /= 42) stop 1
+  deallocate(ca)
+
+  ct = 987654321
+  cp => ct
+  call foo(cp)
+  if (ct /= 42) stop 2
+end program main
+
+! { dg-final { scan-tree-dump-times "CLOBBER" 2 "original" } }
+! { dg-final { scan-tree-dump "\\*ca = {CLOBBER};" "original" } }
+! { dg-final { scan-tree-dump "\\*cp = {CLOBBER};" "original" } }
+! { dg-final { scan-tree-dump-not "123456789" "optimized" { target 
__OPTIMIZE__ } } }
+! { dg-final { scan-tree-dump-not "987654321" "optimized" { target 
__OPTIMIZE__ } } }
-- 
2.35.1



[PATCH v2 3/9] fortran: Move clobbers after evaluation of all arguments [PR106817]

2022-09-18 Thread Mikael Morin via Gcc-patches
For actual arguments whose dummy is INTENT(OUT), we used to generate
clobbers on them at the same time we generated the argument reference
for the function call.  This was wrong if for an argument coming
later, the value expression was depending on the value of the just-
clobbered argument, and we passed an undefined value in that case.

With this change, clobbers are collected separatedly and appended
to the procedure call preliminary code after all the arguments have been
evaluated.

PR fortran/106817

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Collect all clobbers
to their own separate block.  Append the block of clobbers to
the procedure preliminary block after the argument evaluation
codes for all the arguments.

gcc/testsuite/ChangeLog:

* gfortran.dg/intent_optimize_4.f90: New test.
---
 gcc/fortran/trans-expr.cc |  6 ++-
 .../gfortran.dg/intent_optimize_4.f90 | 43 +++
 2 files changed, 47 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_4.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 76c587e3d9f..a62a3bb642d 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6018,7 +6018,6 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   gfc_charlen cl;
   gfc_expr *e;
   gfc_symbol *fsym;
-  stmtblock_t post;
   enum {MISSING = 0, ELEMENTAL, SCALAR, SCALAR_POINTER, ARRAY};
   gfc_component *comp = NULL;
   int arglen;
@@ -6062,7 +6061,9 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   else
 info = NULL;
 
+  stmtblock_t post, clobbers;
   gfc_init_block (&post);
+  gfc_init_block (&clobbers);
   gfc_init_interface_mapping (&mapping);
   if (!comp)
 {
@@ -6531,7 +6532,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  var = build_fold_indirect_ref_loc (input_location,
 parmse.expr);
  tree clobber = build_clobber (TREE_TYPE (var));
- gfc_add_modify (&se->pre, var, clobber);
+ gfc_add_modify (&clobbers, var, clobber);
}
}
  /* Catch base objects that are not variables.  */
@@ -7400,6 +7401,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 
   vec_safe_push (arglist, parmse.expr);
 }
+  gfc_add_block_to_block (&se->pre, &clobbers);
   gfc_finish_interface_mapping (&mapping, &se->pre, &se->post);
 
   if (comp)
diff --git a/gcc/testsuite/gfortran.dg/intent_optimize_4.f90 
b/gcc/testsuite/gfortran.dg/intent_optimize_4.f90
new file mode 100644
index 000..effbaa12a2d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/intent_optimize_4.f90
@@ -0,0 +1,43 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-final { scan-tree-dump-times "CLOBBER" 2 "original" } }
+!
+! PR fortran/106817
+! Check that for an actual argument whose dummy is INTENT(OUT),
+! the clobber that is emitted in the caller before a procedure call
+! happens after any expression depending on the argument value has been
+! evaluated.
+! 
+
+module m
+  implicit none
+contains
+  subroutine copy1(out, in)
+integer, intent(in) :: in
+integer, intent(out) :: out
+out = in
+  end subroutine copy1
+  subroutine copy2(in, out)
+integer, intent(in) :: in
+integer, intent(out) :: out
+out = in
+  end subroutine copy2
+end module m
+
+program p
+  use m
+  implicit none
+  integer :: a, b
+
+  ! Clobbering of a should happen after a+1 has been evaluated.
+  a = 3
+  call copy1(a, a+1)
+  if (a /= 4) stop 1
+
+  ! Clobbering order does not depend on the order of arguments.
+  ! It should also come last with reversed arguments.
+  b = 12
+  call copy2(b+1, b)
+  if (b /= 13) stop 2
+
+end program p
-- 
2.35.1



[PATCH v2 4/9] fortran: Support clobbering with implicit interfaces [PR105012]

2022-09-18 Thread Mikael Morin via Gcc-patches
From: Harald Anlauf 

Before procedure calls, we clobber actual arguments whose associated
dummy is INTENT(OUT).  This only applies to procedures with explicit
interfaces, as the knowledge of the interface is necessary to know
whether an argument has the INTENT(OUT) attribute.

This change also enables clobber generation for procedure calls without
explicit interface, when the procedure has been defined in the same
file because we can use the dummy arguments' characteristics from the
procedure definition in that case.

The knowledge of the dummy characteristics is directly available through
gfc_actual_arglist’s associated_dummy pointers which have been populated
as a side effect of calling gfc_check_externals.

PR fortran/105012

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Use dummy
information from associated_dummy if there is no information
from the procedure interface.

gcc/testsuite/ChangeLog:

* gfortran.dg/intent_optimize_5.f90: New test.

Co-Authored-By: Mikael Morin 
---
 gcc/fortran/trans-expr.cc | 19 +++
 .../gfortran.dg/intent_optimize_5.f90 | 24 +++
 2 files changed, 39 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/intent_optimize_5.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index a62a3bb642d..2301724729f 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6505,10 +6505,21 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
{
  gfc_conv_expr_reference (&parmse, e);
 
- if (fsym
- && fsym->attr.intent == INTENT_OUT
- && !fsym->attr.allocatable
- && !fsym->attr.pointer
+ gfc_symbol *dsym = fsym;
+ gfc_dummy_arg *dummy;
+
+ /* Use associated dummy as fallback for formal
+argument if there is no explicit interface.  */
+ if (dsym == NULL
+ && (dummy = arg->associated_dummy)
+ && dummy->intrinsicness == GFC_NON_INTRINSIC_DUMMY_ARG
+ && dummy->u.non_intrinsic->sym)
+   dsym = dummy->u.non_intrinsic->sym;
+
+ if (dsym
+ && dsym->attr.intent == INTENT_OUT
+ && !dsym->attr.allocatable
+ && !dsym->attr.pointer
  && e->expr_type == EXPR_VARIABLE
  && e->ref == NULL
  && e->symtree
diff --git a/gcc/testsuite/gfortran.dg/intent_optimize_5.f90 
b/gcc/testsuite/gfortran.dg/intent_optimize_5.f90
new file mode 100644
index 000..2f184bf84a8
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/intent_optimize_5.f90
@@ -0,0 +1,24 @@
+! { dg-do run }
+! { dg-additional-options "-fno-inline -fno-ipa-modref -fdump-tree-optimized 
-fdump-tree-original" }
+!
+! PR fortran/105012
+! Check that the INTENT(OUT) attribute causes one clobber to be emitted in
+! the caller before the call to Y in the *.original dump, and the
+! initialization constant to be optimized away in the *.optimized dump,
+! despite the non-explicit interface if the subroutine with the INTENT(OUT)
+! is declared in the same file.
+
+SUBROUTINE Y (Z)
+  integer, intent(out) :: Z
+  Z = 42
+END SUBROUTINE Y
+PROGRAM TEST
+integer :: X
+X = 123456789
+CALL Y (X)
+if (X.ne.42) STOP 1
+END PROGRAM
+
+! { dg-final { scan-tree-dump-times "CLOBBER" 1 "original" } }
+! { dg-final { scan-tree-dump "x = {CLOBBER};" "original" } }
+! { dg-final { scan-tree-dump-not "123456789" "optimized" { target 
__OPTIMIZE__ } } }
-- 
2.35.1



[PATCH v2 6/9] fortran: Support clobbering of SAVE variables [PR87395]

2022-09-18 Thread Mikael Morin via Gcc-patches
This is in spirit a revert of:
r9-3032-gee7fb0588c6361b4d77337ab0f7527be64fcdde2

That commit added a condition to avoid generating ICE with clobbers
of variables with the SAVE attribute.
The test added at that point continues to pass if we remove that
condition now.

PR fortran/87395
PR fortran/41453

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Remove condition
on SAVE attribute guarding clobber generation.
---
 gcc/fortran/trans-expr.cc | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 9b2832bdb26..d169df44a71 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6527,8 +6527,6 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  && !e->symtree->n.sym->attr.dimension
  && !e->symtree->n.sym->attr.pointer
  && !e->symtree->n.sym->attr.allocatable
- /* FIXME - PR 87395 and PR 41453  */
- && e->symtree->n.sym->attr.save == SAVE_NONE
  && !e->symtree->n.sym->attr.associate_var
  && e->ts.type != BT_CHARACTER
  && e->ts.type != BT_DERIVED
-- 
2.35.1



Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Mikael Morin

Le 18/09/2022 à 12:23, Thomas Koenig a écrit :


I think some of the desired behavior can still be salvaged.  For
example, for


(...)


clobbers for the whole array can still be generated, but not for

   call foo(a(2),2)

so one would have to look at the lower bound.


Well, my patches were for scalars only,
I prefer to stay away from arrays; they are just too full of traps.



Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-18 Thread Mikael Morin

Le 18/09/2022 à 20:32, Harald Anlauf a écrit :


Assumed shape will be on the easy side,
while assumed size likely needs to be excluded for clobbering.


Isn’t it the converse that is true?
Assumed shape can be non-contiguous so have to be excluded, but assumed 
size are contiguous, so valid candidates for clobbering. No?


No way, really, arrays are going to be a maze of complexity.



[PATCH] genrecog.cc (print_nonbool_test): Fix type error of SUBREG_BYTE

2022-09-18 Thread Jojo R via Gcc-patches
* gcc/genrecog.cc (print_nonbool_test): Fix type error of
SUBREG_BYTE
---
 gcc/genrecog.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/genrecog.cc b/gcc/genrecog.cc
index 77f8fb97853..319e437e334 100644
--- a/gcc/genrecog.cc
+++ b/gcc/genrecog.cc
@@ -4619,6 +4619,7 @@ print_nonbool_test (output_state *os, const rtx_test 
&test)
   printf ("SUBREG_BYTE (");
   print_test_rtx (os, test);
   printf (")");
+  printf (".to_constant ()");
   break;
 
 case rtx_test::WIDE_INT_FIELD:
-- 
2.24.3 (Apple Git-128)