[committed] openmp, fortran: Fix up IFN_ASSUME call

2022-10-10 Thread Jakub Jelinek via Fortran
On Thu, Oct 06, 2022 at 06:15:52PM +0200, Tobias Burnus wrote:
> Like as attached? – It did survive regtesting.

Like in other spots in trans-openmp.cc that create a TARGET_EXPR, the
slot has to be created with create_tmp_var_raw, because gfc_create_var
adds the var to BLOCK_VARS and that ICEs during expansion because
gimple_add_tmp_var_fn has:
  gcc_assert (!DECL_CHAIN (tmp) && !DECL_SEEN_IN_BIND_EXPR_P (tmp));
assertion.  Also, both C/C++ ensure the argument to IFN_ASSUME has
boolean_type_node, it is easier if Fortran does that too.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2022-10-10  Jakub Jelinek  

* trans-openmp.cc (gfc_trans_omp_assume): Use create_tmp_var_raw
instead of gfc_create_var for TARGET_EXPR slot creation.  Create it
with boolean_type_node and convert.

--- gcc/fortran/trans-openmp.cc.jj  2022-10-07 00:13:12.508191601 +0200
+++ gcc/fortran/trans-openmp.cc 2022-10-09 14:17:55.430364168 +0200
@@ -4588,11 +4588,14 @@ gfc_trans_omp_assume (gfc_code *code)
  t = se.expr;
else
  {
-   tree var = gfc_create_var (TREE_TYPE (se.expr), NULL);
+   tree var = create_tmp_var_raw (boolean_type_node);
+   DECL_CONTEXT (var) = current_function_decl;
stmtblock_t block2;
gfc_init_block (&block2);
gfc_add_block_to_block (&block2, &se.pre);
-   gfc_add_modify_loc (loc, &block2, var, se.expr);
+   gfc_add_modify_loc (loc, &block2, var,
+   fold_convert_loc (loc, boolean_type_node,
+ se.expr));
gfc_add_block_to_block (&block2, &se.post);
t = gfc_finish_block (&block2);
t = build4 (TARGET_EXPR, boolean_type_node, var, t, NULL, NULL);

Jakub



Re: [PATCH v4 4/4] OpenMP/OpenACC: Unordered/non-constant component offset struct mapping

2022-10-10 Thread Jakub Jelinek via Fortran
On Sun, Oct 09, 2022 at 02:51:37PM -0700, Julian Brown wrote:
> This patch adds support for non-constant component offsets in "map"
> clauses for OpenMP (and the equivalants for OpenACC), which are not able
> to be sorted into order at compile time.  Normally struct accesses in
> such clauses are gathered together and sorted into increasing address
> order after a "GOMP_MAP_STRUCT" node: if we have variable indices,
> that is no longer possible.
> 
> This patch adds support for such mappings by adding a new variant of
> GOMP_MAP_STRUCT that does not require the list of nodes following to
> be in sorted order.  This passes right down to the runtime: the list is
> sorted in libgomp according to the dynamic values of the offsets after
> the newly-introduced GOMP_MAP_STRUCT_UNORD node.
> 
> This mostly affects arrays of structs indexed by variables in C and C++,
> but can also affect derived-type arrays with constant indexes when they
> have an array descriptor.

I don't understand why this is needed.
We have a restriction that ought to make all such cases invalid:
"If multiple list items are explicitly mapped on the same construct and have 
the same containing
array or have base pointers that share original storage, and if any of the list 
items do not have
corresponding list items that are present in the device data environment prior 
to a task
encountering the construct, then the list items must refer to the same array 
elements of either the
containing array or the implicit array of the base pointers."

So, all those
#pragma omp target map(t->a[i].p, t->a[j].p) etc. cases are invalid unless
i == j, so IMNSHO one doesn't need to worry about unordered cases.

> +#if defined(_GNU_SOURCE) || defined(__GNUC__)
> +static int
> +compare_addr_r (const void *a, const void *b, void *data)
> +{
> +  void **hostaddrs = (void **) data;
> +  int ai = *(int *) a, bi = *(int *) b;
> +  if (hostaddrs[ai] < hostaddrs[bi])
> +return -1;
> +  else if (hostaddrs[ai] > hostaddrs[bi])
> +return 1;
> +  return 0;
> +}
> +#endif

Note, not all glibcs have qsort_r and _GNU_SOURCE macro being defined
doesn't imply glibc.  You'd need something like _GLIBC_PREREQ macro
and require 2.8 or later.

> +
>  static inline __attribute__((always_inline)) struct target_mem_desc *
>  gomp_map_vars_internal (struct gomp_device_descr *devicep,
>   struct goacc_asyncqueue *aq, size_t mapnum,
> @@ -968,6 +982,17 @@ gomp_map_vars_internal (struct gomp_device_descr 
> *devicep,
>tgt->device_descr = devicep;
>tgt->prev = NULL;
>struct gomp_coalesce_buf cbuf, *cbufp = NULL;
> +  size_t hostaddr_idx;
> +
> +#if !defined(_GNU_SOURCE) && defined(__GNUC__)
> +  /* If we don't have _GNU_SOURCE (thus no qsort_r), but we are compiling 
> with
> + GCC (and why wouldn't we be?), we can use this nested function for
> + regular qsort.  */
> +  int compare_addr (const void *a, const void *b)
> +{
> +  return compare_addr_r (a, b, (void *) &hostaddrs[hostaddr_idx]);
> +}
> +#endif

Please don't use nested functions in libgomp.

> +   int *order = NULL;
> +   if ((kind & typemask) == GOMP_MAP_STRUCT_UNORD)
> + {
> +   order = (int *) gomp_alloca (sizeof (int) * sizes[i]);
> +   for (int j = 0; j < sizes[i]; j++)
> + order[j] = j;
> +#ifdef _GNU_SOURCE
> +   qsort_r (order, sizes[i], sizeof (int), &compare_addr_r,
> +&hostaddrs[i + 1]);
> +#elif defined(__GNUC__)
> +   hostaddr_idx = i + 1;
> +   qsort (order, sizes[i], sizeof (int), &compare_addr);
> +#else
> +#error no threadsafe qsort
> +#endif

This is too ugly.  If this is really needed (see above) and
you need fallback for missing qsort_t, better sort array of tuples
containing the order number and some data pointer needed for the comparison
routine.

Jakub



Re: [PATCH v2 0/9] fortran: clobber fixes [PR41453]

2022-10-10 Thread Mikael Morin

Le 23/09/2022 à 09:54, Mikael Morin a écrit :

Le 22/09/2022 à 22:42, Harald Anlauf via Fortran a écrit :

This LGTM.  It also fixes a regression introduced with r9-3030 :-)
If you think that this set (1-3) is backportable, feel free to do so.


Yes, 2 and 3 are worth backporting, I will see how dependent they are on 1.


I'm going to backport 1-3 as you suggested, it's so much easier.


Re: [PATCH RESEND 0/1] RFC: P1689R5 support

2022-10-10 Thread Jason Merrill via Fortran

On 10/4/22 11:11, Ben Boeckel wrote:

This patch adds initial support for ISO C++'s [P1689R5][], a format for
describing C++ module requirements and provisions based on the source
code. This is required because compiling C++ with modules is not
embarrassingly parallel and need to be ordered to ensure that `import
some_module;` can be satisfied in time by making sure that the TU with
`export import some_module;` is compiled first.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html

I'd like feedback on the approach taken here with respect to the
user-visible flags. I'll also note that header units are not supported
at this time because the current `-E` behavior with respect to `import
;` is to search for an appropriate `.gcm` file which is not
something such a "scan" can support. A new mode will likely need to be
created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
where headers are looked up "normally" and processed only as much as
scanning requires.

Testing is currently happening in CMake's CI using a prior revision of
this patch (the differences are basically the changelog, some style, and
`trtbd` instead of `p1689r5` as the format name).

For testing within GCC, I'll work on the following:

- scanning non-module source
- scanning module-importing source (`import X;`)
- scanning module-exporting source (`export module X;`)
- scanning module implementation unit (`module X;`)
- flag combinations?

Are there existing tools for handling JSON output for testing purposes?


David Malcolm would probably know best about JSON wrangling.


Basically, something that I can add to the test suite that doesn't care
about whitespace, but checks the structure (with sensible replacements
for absolute paths where relevant)?


Various tests in g++.dg/debug/dwarf2 handle that sort of thing with regexps.


For the record, Clang has patches with similar flags and behavior by
Chuanqi Xu here:

 https://reviews.llvm.org/D134269

with the same flags (though using my old `trtbd` spelling for the
format name).

Thanks,

--Ben

Ben Boeckel (1):
   p1689r5: initial support

  gcc/ChangeLog   |   9 ++
  gcc/c-family/ChangeLog  |   6 +
  gcc/c-family/c-opts.cc  |  40 ++-
  gcc/c-family/c.opt  |  12 ++
  gcc/cp/ChangeLog|   5 +
  gcc/cp/module.cc|   3 +-
  gcc/doc/invoke.texi |  15 +++
  gcc/fortran/ChangeLog   |   5 +
  gcc/fortran/cpp.cc  |   4 +-
  gcc/genmatch.cc |   2 +-
  gcc/input.cc|   4 +-
  libcpp/ChangeLog|  11 ++
  libcpp/include/cpplib.h |  12 +-
  libcpp/include/mkdeps.h |  17 ++-
  libcpp/init.cc  |  14 ++-
  libcpp/mkdeps.cc| 235 ++--
  16 files changed, 368 insertions(+), 26 deletions(-)


base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f




Re: [PATCH RESEND 1/1] p1689r5: initial support

2022-10-10 Thread Jason Merrill via Fortran

On 10/4/22 11:12, Ben Boeckel wrote:

This patch implements support for [P1689R5][] to communicate to a build
system the C++20 module dependencies to build systems so that they may
build `.gcm` files in the proper order.


Thanks!


Support is communicated through the following three new flags:

- `-fdeps-format=` specifies the format for the output. Currently named
   `p1689r5`.

- `-fdeps-file=` specifies the path to the file to write the format to.


Do you expect users to want to emit Makefile (-MM) and P1689 
dependencies from the same compilation?



- `-fdep-output=` specifies the `.o` that will be written for the TU
   that is scanned. This is required so that the build system can
   correlate the dependency output with the actual compilation that will
   occur.


The dependency machinery already needs to be able to figure out the name 
of the output file, can't we reuse that instead of specifying it on the 
command line?



CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.

Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: 
https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst

TODO:

- header-unit information fields

Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.

- non-utf8 paths

The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously represetable in UTF-8" case).

- figure out why junk gets placed at the end of the file

Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.

Signed-off-by: Ben Boeckel 
---
  gcc/ChangeLog   |   9 ++
  gcc/c-family/ChangeLog  |   6 +
  gcc/c-family/c-opts.cc  |  40 ++-
  gcc/c-family/c.opt  |  12 ++
  gcc/cp/ChangeLog|   5 +
  gcc/cp/module.cc|   3 +-
  gcc/doc/invoke.texi |  15 +++
  gcc/fortran/ChangeLog   |   5 +
  gcc/fortran/cpp.cc  |   4 +-
  gcc/genmatch.cc |   2 +-
  gcc/input.cc|   4 +-
  libcpp/ChangeLog|  11 ++
  libcpp/include/cpplib.h |  12 +-
  libcpp/include/mkdeps.h |  17 ++-
  libcpp/init.cc  |  14 ++-
  libcpp/mkdeps.cc| 235 ++--
  16 files changed, 368 insertions(+), 26 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6dded16c0e3..2d61de6adde 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2022-09-20  Ben Boeckel  
+
+   * doc/invoke.texi: Document -fdeps-format=, -fdep-file=, and
+   -fdep-output= flags.
+   * genmatch.cc (main): Add new preprocessor parameter used for C++
+   module tracking.
+   * input.cc (test_lexer): Add new preprocessor parameter used for C++
+   module tracking.
+
  2022-09-19  Torbjörn SVENSSON  
  
  	* targhooks.cc (default_zero_call_used_regs): Improve sorry

diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
index ba3d76dd6cb..569dcd96e8c 100644
--- a/gcc/c-family/ChangeLog
+++ b/gcc/c-family/ChangeLog
@@ -1,3 +1,9 @@
+2022-09-20  Ben Boeckel  
+
+   * c-opts.cc (c_common_handle_option): Add fdeps_file variable and
+   -fdeps-format=, -fdep-file=, and -fdep-output= parsing.
+   * c.opt: Add -fdeps-format=, -fdep-file=, and -fdep-output= flags.
+
  2022-09-15  Richard Biener  
  
  	* c-common.h (build_void_list_node): Remove.

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index babaa2fc157..617d0e93696 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -77,6 +77,9 @@ static bool verbose;
  /* Dependency output file.  */
  static const char *deps_file;
  
+/* Enhanced dependency output file.  */

+static const char *fdeps_file;
+
  /* The prefix given by -iprefix, if any.  */
  static const char *iprefix;
  
@@ -360,6 +363,23 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,

deps_file = arg;
break;
  
+case OPT_fdep_format_:

+  if (!strcmp (arg, "p1689r5"))
+   cpp_opts->deps.format =