[patch] OpenMP/Fortran: Non-rectangular loops with constant steps other than 1 or -1 [PR107424]
Comments regarding the validity of the Fortran assumptions are welcome! This patch now uses a 'simple' loop for OpenMP loops with a constant loop-step size. Before, it only did so for step = ±1. (Otherwise, a count variable is used from which the original loop index variable is calculated from.) For details, see the attached patch or https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107424#c12 (comment 12 + 14 plus the email linked in comment 12). Comments? Remarks? If there are none, I will relatively soonish commit the attached patch to mainline, only. Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 OpenMP/Fortran: Non-rectangular loops with constant steps other than 1 or -1 [PR107424] Before this commit, gfortran produced with OpenMP for 'do i = 1,10,2' the code for (count.0 = 0; count.0 < 5; count.0 = count.0 + 1) i = count.0 * 2 + 1; While such an inner loop can be collapsed, a non-rectangular could not. With this commit and for all constant loop steps, a simple loop such as 'for (i = 1; i <= 10; i = i + 2)' is created. (Before only for the constant steps of 1 and -1.) The constant step permits to know the direction (increasing/decreasing) that is required for the loop condition. The new code is only valid if one assumes no overflow of the loop variable. However, the Fortran standard can be read that this must be ensured by the user. Namely, the Fortran standard requires (F2023, 10.1.5.2.4): "The execution of any numeric operation whose result is not defined by the arithmetic used by the processor is prohibited." And, for DO loops, F2023's "11.1.7.4.3 The execution cycle" has the following: The number of loop iterations handled by an iteration count, which would permit code like 'do i = huge(i)-5, huge(i),4'. However, in step (3), this count is not only decremented by one but also: "... The DO variable, if any, is incremented by the value of the incrementation parameter m3." And for the example above, 'i' would be 'huge(i)+3' in the last execution cycle, which exceeds the largest model number and should render the example as invalid. PR fortran/107424 gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_nonrect_loop_expr): Accept all constant loop steps. (gfc_trans_omp_do): Likewise; use sign to determine loop direction. libgomp/ChangeLog: * testsuite/libgomp.fortran/non-rectangular-loop-1.f90: Enabled commented tests. * testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: Removed. * testsuite/libgomp.fortran/non-rectangular-loop-5.f90: Change testcase to use a non-constant step to retain the 'sorry' test. * testsuite/libgomp.fortran/non-rectangular-loop-6.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/linear-2.f90: Update dump to remove the additional count variable. gcc/fortran/trans-openmp.cc| 12 +- gcc/testsuite/gfortran.dg/gomp/linear-2.f90| 4 +- .../libgomp.fortran/non-rectangular-loop-1.f90 | 537 ++--- .../libgomp.fortran/non-rectangular-loop-1a.f90| 374 -- .../libgomp.fortran/non-rectangular-loop-5.f90 | 10 +- .../libgomp.fortran/non-rectangular-loop-6.f90 | 196 6 files changed, 477 insertions(+), 656 deletions(-) diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index c88ee3c7656..e33b18fdada 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -5374,10 +5374,10 @@ gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n, if (!simple) { - /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424. */ + /* FIXME: Handle non-const iter steps, cf. PR fortran/107424. */ sorry_at (gfc_get_location (&curr_loop_var->where), - "non-rectangular loop nest with step other than constant 1 " - "or -1 for %qs", curr_loop_var->symtree->n.sym->name); + "non-rectangular loop nest with non-constant step for %qs", + curr_loop_var->symtree->n.sym->name); return false; } @@ -5578,10 +5578,8 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock, gfc_add_block_to_block (pblock, &se.pre); step = gfc_evaluate_now (se.expr, pblock); - if (integer_onep (step)) - simple = 1; - else if (tree_int_cst_equal (step, integer_minus_one_node)) - simple = -1; + if (TREE_CODE (step) == INTEGER_CST) + simple = tree_int_cst_sgn (step); gfc_init_se (&se, NULL); if (!clauses->non_rectangular diff --git a/gcc/testsuite/gfortran.dg/gomp/linear-2.f90 b/gcc/testsuite/gfortran.dg/gomp/linear-2.f90 index 05f007fd5c2..88df96e9b8f 100644 --- a/gcc/testsuite/gfortran.dg/gomp/linear-2.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/linear-2.f90 @@ -105,8 +105,8 @@ end module ! { dg-final { scan-tree-dump-times "#pragma
Re: [PATCH v5 4/5] c++modules: report imported CMI files as dependencies
On 6/25/23 12:36, Ben Boeckel wrote: On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: On 6/22/23 22:45, Ben Boeckel wrote: On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: On 1/25/23 16:06, Ben Boeckel wrote: They affect the build, so report them via `-MF` mechanisms. Why isn't this covered by the existing code in preprocessed_module? It appears as though it is neutered in patch 3 where `write_make_modules_deps` is used in `make_write` (or will use that name Why do you want to record the transitive modules? I would expect just noting the ones with imports directly in the TU would suffice (i.e check the 'outermost' arg) FWIW, only GCC has "fat" modules. MSVC and Clang both require the transitive closure to be passed. The idea there is to minimize the size of individual module files. If GCC only reads the "fat" modules, then only those should be recorded. If it reads other modules, they should be recorded as well. But wouldn't the transitive modules be dependencies of the direct imports, so (re)building the direct imports would first require building the transitive modules anyway? Expressing the transitive closure of dependencies for each importer seems redundant when it can be easily derived from the direct dependencies of each module. Jason
Re: [PATCH v5 4/5] c++modules: report imported CMI files as dependencies
On 7/18/23 16:52, Jason Merrill wrote: On 6/25/23 12:36, Ben Boeckel wrote: On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: On 6/22/23 22:45, Ben Boeckel wrote: On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: On 1/25/23 16:06, Ben Boeckel wrote: They affect the build, so report them via `-MF` mechanisms. Why isn't this covered by the existing code in preprocessed_module? It appears as though it is neutered in patch 3 where `write_make_modules_deps` is used in `make_write` (or will use that name Why do you want to record the transitive modules? I would expect just noting the ones with imports directly in the TU would suffice (i.e check the 'outermost' arg) FWIW, only GCC has "fat" modules. MSVC and Clang both require the transitive closure to be passed. The idea there is to minimize the size of individual module files. If GCC only reads the "fat" modules, then only those should be recorded. If it reads other modules, they should be recorded as well. Please explain what you mean by fat modules. There seems to be confusion. But wouldn't the transitive modules be dependencies of the direct imports, so (re)building the direct imports would first require building the transitive modules anyway? Expressing the transitive closure of dependencies for each importer seems redundant when it can be easily derived from the direct dependencies of each module. Jason -- Nathan Sidwell
Re: [PATCH v5 4/5] c++modules: report imported CMI files as dependencies
On Tue, Jul 18, 2023 at 16:52:44 -0400, Jason Merrill wrote: > On 6/25/23 12:36, Ben Boeckel wrote: > > On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: > >> On 6/22/23 22:45, Ben Boeckel wrote: > >>> On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: > On 1/25/23 16:06, Ben Boeckel wrote: > > They affect the build, so report them via `-MF` mechanisms. > > Why isn't this covered by the existing code in preprocessed_module? > >>> > >>> It appears as though it is neutered in patch 3 where > >>> `write_make_modules_deps` is used in `make_write` (or will use that name > >> > >> Why do you want to record the transitive modules? I would expect just > >> noting the > >> ones with imports directly in the TU would suffice (i.e check the > >> 'outermost' arg) > > > > FWIW, only GCC has "fat" modules. MSVC and Clang both require the > > transitive closure to be passed. The idea there is to minimize the size > > of individual module files. > > > > If GCC only reads the "fat" modules, then only those should be recorded. > > If it reads other modules, they should be recorded as well. For clarification, given: * a.cppm ``` export module a; ``` * b.cppm ``` export module b; import a; ``` * use.cppm ``` import b; ``` in a "fat" module setup, `use.cppm` only needs to be told about `b.cmi` because it contains everything that an importer needs to know about the `a` module (reachable types, re-exported bits, whatever). With the "thin" modules, `a.cmi` must be specified when compiling `use.cppm` to satisfy anything that may be required transitively (e.g., a return type of some `b` symbol is from `a`). MSVC and Clang (17+) use this model. I don't know MSVC's rationale, but Clang's is to make CMIs relocatable by not embedding paths to dependent modules in CMIs. This should help caching and network transfer sizes in distributed builds. LLVM/Clang discussion: https://discourse.llvm.org/t/c-20-modules-should-the-bmis-contain-paths-to-their-dependent-bmis/70422 https://github.com/llvm/llvm-project/issues/62707 Maybe I'm missing how this *actually* works in GCC as I've really only interacted with it through the command line, but I've not needed to mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and read through some embedded information in `b.cmi` or does `b.cmi` include enough information to not need to read it at all? If the former, distributed builds are going to have a problem knowing what files to send just from the command line (I'll call this "implicit thin"). If the latter, that is the "fat" CMI that I'm thinking of. > But wouldn't the transitive modules be dependencies of the direct > imports, so (re)building the direct imports would first require building > the transitive modules anyway? Expressing the transitive closure of > dependencies for each importer seems redundant when it can be easily > derived from the direct dependencies of each module. I'm not concerned whether it is transitive or not, really. If a file is read, it should be reported here regardless of the reason. Note that caching mechanisms may skip actually *doing* the reading, but the dependencies should still be reported from the cached results as-if the real work had been performed. --Ben