Re: Fortran test typebound_operator_7.f03 broken by non-Fortran commit. Confirm anyone?
Sam James writes: > Andre Vehreschild writes: > >> Hi all, >> >> please note, that I don't know this bisecting very well, so this may very >> well >> be a wrong blame. During latest regression testing of the Fortran suite I got >> typebound_operator_7.f03 failing with: >> >> typebound_operator_7.f03:94:25: >> >>94 | u = (u*2.0*4.0) + u*4.0 >> | 1 >> internal compiler error: tree check: expected function_decl, have >> indirect_ref >>in DECL_FUNCTION_CODE, at tree.h:4329 0x3642f3e internal_error(char >> const*, >>...) /mnt/work_store/gcc/gcc.test/gcc/diagnostic-global-context.cc:517 >> 0x1c0a703 tree_check_failed(tree_node const*, char const*, int, char const*, >>...) /mnt/work_store/gcc/gcc.test/gcc/tree.cc:9003 >> 0xeb9150 tree_check(tree_node const*, char const*, int, char const*, >> tree_code) >> /mnt/work_store/gcc/gcc.test/gcc/tree.h:3921 >> 0xf5725b DECL_FUNCTION_CODE(tree_node const*) >> /mnt/work_store/gcc/gcc.test/gcc/tree.h:4329 >> 0xf383d6 update_builtin_function >> /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:4405 >> 0xf468b9 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, >>gfc_expr*, vec*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8236 0xf48b0f >>gfc_conv_function_expr >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8815 0xf4ceda >>gfc_conv_expr(gfc_se*, gfc_expr*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:9982 0xf40777 >>gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, >>gfc_expr*, vec*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:6816 0xf48b0f >>gfc_conv_function_expr >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8815 0xf4ceda >>gfc_conv_expr(gfc_se*, gfc_expr*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:9982 0xf40777 >>gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, >>gfc_expr*, vec*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:6816 0xfb580a >>gfc_trans_call(gfc_code*, bool, tree_node*, tree_node*, bool) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-stmt.cc:425 0xed9363 >>trans_code /mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2434 0xed97d5 >>gfc_trans_code(gfc_code*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2713 0xf26342 >>gfc_generate_function_code(gfc_namespace*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-decl.cc:7958 0xed9819 >>gfc_generate_code(gfc_namespace*) >>/mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2730 0xe544ee >>translate_all_program_units >>/mnt/work_store/gcc/gcc.test/gcc/fortran/parse.cc:7156 0xe54e23 >>gfc_parse_file() /mnt/work_store/gcc/gcc.test/gcc/fortran/parse.cc:7473 >>0xebf7ce gfc_be_parse_file >>/mnt/work_store/gcc/gcc.test/gcc/fortran/f95-lang.cc:241 >> >> Checking with git bisect this lead me to: >> >> d8ef4471cb9c9f86784b62424a215ea42173bfe1 being the last commit the test >> passed >> >> and >> >> 03623fa91ff36ecb9faa3b55f7842a39b759594e libstdc++: Use std::move for >> iterator >> in ranges::fill [PR117094] >> >> failing the test to pass. Can anyone confirm? >> >> I might be doing something wrong here, so please be patient and explain, >> what I >> miss. > > You can confirm this by reverting the commit to see if it starts to pass > again. > > Also, if manually bisecting, it can be worth doing each commit twice if > unfamiliar with it (or use `git bisect run` instead). > > Now, according to gcc-regression > (https://inbox.sourceware.org/gcc-regression/20241013105730.f2e0614a0...@shgcc06.sh.intel.com/T/#u), > it started between r15-4295 and r15-4298, although it didn't bisect (it > does sometimes but this isn't one of those emails): > > $ git shortlog $(contrib/git-undescr.sh r15-4295)~1..$(contrib/git-undescr.sh > r15-4298) > GCC Administrator (1): > Daily bump. > > Jivan Hakobyan (1): > [RISC-V] Avoid unnecessary extensions when value is already extended > > Thomas Koenig (1): > Unsigned constants for ISO_FORTRAN_ENV and ISO_C_BINDING. > > Tobias Burnus (1): > Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' > __builtin_is_initial_device > > Those 2 Fortran candidates seem more likely, but not that I know > anything about Fortran. Indeed: https://gcc.gnu.org/PR117136. > >> >> Regards, >> Andre > > thanks, > sam
Re: Fortran test typebound_operator_7.f03 broken by non-Fortran commit. Confirm anyone?
Andre Vehreschild writes: > Hi all, > > please note, that I don't know this bisecting very well, so this may very well > be a wrong blame. During latest regression testing of the Fortran suite I got > typebound_operator_7.f03 failing with: > > typebound_operator_7.f03:94:25: > >94 | u = (u*2.0*4.0) + u*4.0 > | 1 > internal compiler error: tree check: expected function_decl, have indirect_ref >in DECL_FUNCTION_CODE, at tree.h:4329 0x3642f3e internal_error(char const*, >...) /mnt/work_store/gcc/gcc.test/gcc/diagnostic-global-context.cc:517 > 0x1c0a703 tree_check_failed(tree_node const*, char const*, int, char const*, >...) /mnt/work_store/gcc/gcc.test/gcc/tree.cc:9003 > 0xeb9150 tree_check(tree_node const*, char const*, int, char const*, > tree_code) > /mnt/work_store/gcc/gcc.test/gcc/tree.h:3921 > 0xf5725b DECL_FUNCTION_CODE(tree_node const*) > /mnt/work_store/gcc/gcc.test/gcc/tree.h:4329 > 0xf383d6 update_builtin_function > /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:4405 > 0xf468b9 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, >gfc_expr*, vec*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8236 0xf48b0f >gfc_conv_function_expr >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8815 0xf4ceda >gfc_conv_expr(gfc_se*, gfc_expr*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:9982 0xf40777 >gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, >gfc_expr*, vec*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:6816 0xf48b0f >gfc_conv_function_expr >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8815 0xf4ceda >gfc_conv_expr(gfc_se*, gfc_expr*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:9982 0xf40777 >gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, >gfc_expr*, vec*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:6816 0xfb580a >gfc_trans_call(gfc_code*, bool, tree_node*, tree_node*, bool) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-stmt.cc:425 0xed9363 >trans_code /mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2434 0xed97d5 >gfc_trans_code(gfc_code*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2713 0xf26342 >gfc_generate_function_code(gfc_namespace*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans-decl.cc:7958 0xed9819 >gfc_generate_code(gfc_namespace*) >/mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2730 0xe544ee >translate_all_program_units >/mnt/work_store/gcc/gcc.test/gcc/fortran/parse.cc:7156 0xe54e23 >gfc_parse_file() /mnt/work_store/gcc/gcc.test/gcc/fortran/parse.cc:7473 >0xebf7ce gfc_be_parse_file >/mnt/work_store/gcc/gcc.test/gcc/fortran/f95-lang.cc:241 > > Checking with git bisect this lead me to: > > d8ef4471cb9c9f86784b62424a215ea42173bfe1 being the last commit the test passed > > and > > 03623fa91ff36ecb9faa3b55f7842a39b759594e libstdc++: Use std::move for iterator > in ranges::fill [PR117094] > > failing the test to pass. Can anyone confirm? > > I might be doing something wrong here, so please be patient and explain, what > I > miss. You can confirm this by reverting the commit to see if it starts to pass again. Also, if manually bisecting, it can be worth doing each commit twice if unfamiliar with it (or use `git bisect run` instead). Now, according to gcc-regression (https://inbox.sourceware.org/gcc-regression/20241013105730.f2e0614a0...@shgcc06.sh.intel.com/T/#u), it started between r15-4295 and r15-4298, although it didn't bisect (it does sometimes but this isn't one of those emails): $ git shortlog $(contrib/git-undescr.sh r15-4295)~1..$(contrib/git-undescr.sh r15-4298) GCC Administrator (1): Daily bump. Jivan Hakobyan (1): [RISC-V] Avoid unnecessary extensions when value is already extended Thomas Koenig (1): Unsigned constants for ISO_FORTRAN_ENV and ISO_C_BINDING. Tobias Burnus (1): Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device Those 2 Fortran candidates seem more likely, but not that I know anything about Fortran. > > Regards, > Andre thanks, sam
Re: Fortran test typebound_operator_7.f03 broken by non-Fortran commit. Confirm anyone?
On 14 October 2024 14:05:36 CEST, Andre Vehreschild wrote: >Hi all, > >please note, that I don't know this bisecting very well, so this may very well >be a wrong blame. During latest regression testing of the Fortran suite I got >typebound_operator_7.f03 failing with: > > >03623fa91ff36ecb9faa3b55f7842a39b759594e libstdc++: Use std::move for iterator >in ranges::fill [PR117094] > >failing the test to pass. Can anyone confirm? I think it was reverted already > >I might be doing something wrong here, so please be patient and explain, what I >miss. >
Re: [Patch] Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device
On 14 October 2024 10:23:56 CEST, Thomas Schwinge wrote: >Hi Tobias! > >On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. > >> Tobias Burnus wrote: >>> Anyone feeling like reviewing this patch? > >Yes. But please allow for more than 1 1/2 work days. >> * types.def (BT_BOOL): Fix type by using bool_type_node. let's try this hunk - should work'ish
Re: [Ping*4, Patch, Fortran, 77871, v1] Allow for class typed coarray parameter as dummy [PR77871]
Hi Andre, This looks fine to me. OK for mainline. Thanks for the patch and sorry for the wait for review. Paul On Mon, 14 Oct 2024 at 08:50, Andre Vehreschild wrote: > Ping ^ 4. > > Really no one to review this 160 something patch? > > Regtests ok on x86_64-pc-linux-gnu /Fedora 39? Ok for mainline? > > - Andre > > On Mon, 7 Oct 2024 12:52:29 +0200 > Andre Vehreschild wrote: > > > Hi all, > > > > this patch somehow slipped my attention. Anyone for a review? Third time > ping! > > > > Rebased to current mainline. > > > > Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline? > > > > Regards, > > Andre > > > > On Wed, 18 Sep 2024 12:30:23 +0200 > > Andre Vehreschild wrote: > > > > > Hi all, > > > > > > back from my holidays and still no review. PING PING! > > > > > > Rebased to current mainline. > > > > > > Regtested ok on x86_64-pc-linux-gnu / F39. Ok for mainline? > > > > > > Regards, > > > Andre > > > > > > On Wed, 21 Aug 2024 13:43:52 +0200 > > > Andre Vehreschild wrote: > > > > > > > Hi all, > > > > > > > > pinging this patch for the first time. > > > > > > > > Rebased and regtested ok on x86_64-pc-linux-gnu / Fedora 39. Ok for > > > > mainline? > > > > > > > > - Andre > > > > > > > > On Thu, 15 Aug 2024 14:39:25 +0200 > > > > Andre Vehreschild wrote: > > > > > > > > > Hi all, > > > > > > > > > > attached patch fixes another regression on coarrays. This time for > class > > > > > typed coarrays as dummys. > > > > > > > > > > Regtested ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline? > > > > > > > > > > Regards, > > > > > Andre > > > > > -- > > > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > > > > > > > -- > > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > > > > -- > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > -- > > Andre Vehreschild * Email: vehre ad gmx dot de > > > -- > Andre Vehreschild * Email: vehre ad gmx dot de >
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Revert 'gimple_fold_builtin_acc_on_device' change
Hi! On 2024-10-14T10:23:56+0200, I wrote: > On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. > * (new) For OpenACC, use a builtin for acc_on_device + actually do > compile-time optimization when offloading is not configured. > > No. 2. This resolved > PR82250 "Fortran OpenACC acc_on_device early folding", right? > (..., which you recently had duplicated as > PR116269 "[OpenACC] acc_on_device – compile-time optimization fails", > right?) > > Please: > > git mv gfortran.dg/goacc/acc_on_device-2{-off,_-fno-openacc}.f95 > > ..., and add a 's%-fno-openacc%-fno-builtin-acc_on_device' variant. > > Hmm, why can't 'gfortran.dg/goacc/acc_on_device-2.f95' be un-XFAILed? > PS: The testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c > example is not completely clear to me; however, the new optimization > causes that without offloading enabled, the dump message is not > shown. I tried to understand it better with > -fno-builtin-acc_on_device, but that then caused link errors as the > device function wasn't optimizated away, leaving me puzzled. — At > the end, I just changed the dg-* and did not try to understand the > issue. > > Why then not wait for someone else to help look into that? :-) > On 2024-10-10T10:31:13+0200, Tobias Burnus wrote: >> Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' >> __builtin_is_initial_device >> Extend the code to also use the builtin acc_on_device with OpenACC, >> which was previously only used in C/C++. Additionally, fix folding >> when offloading is not enabled. I don't understand the latter part: what needs to be fixed? >> gcc/ChangeLog: >> >> * gimple-fold.cc (gimple_fold_builtin_acc_on_device): Also fold >> when offloading is not configured. We already did fold, didn't we? >> --- a/gcc/gimple-fold.cc >> +++ b/gcc/gimple-fold.cc >> @@ -4190,7 +4190,7 @@ static bool >> gimple_fold_builtin_acc_on_device (gimple_stmt_iterator *gsi, tree arg0) >> { >>/* Defer folding until we know which compiler we're in. */ >> - if (symtab->state != EXPANSION) >> + if (ENABLE_OFFLOADING && symtab->state != EXPANSION) >> return false; >> >>unsigned val_host = GOMP_DEVICE_HOST; That is, I don't understand the rationale for diverging GCC's (default) '--disable-offload-targets' vs. '--enable-offload-targets=[...]' configurations here? >> --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c >> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c >> @@ -36,8 +36,7 @@ static int fact_nohost(int n) >> >>return fact(n); >> } >> -/* { dg-final { scan-tree-dump-times {(?n)^OpenACC routine 'fact_nohost' >> has 'nohost' clause\.$} 1 oaccloops { target c } } } >> - { dg-final { scan-tree-dump-times {(?n)^OpenACC routine 'int >> fact_nohost\(int\)' has 'nohost' clause\.$} 1 oaccloops { target { c++ && { >> ! offloading_enabled } } } } } >> +/* { dg-final { scan-tree-dump-times {(?n)^OpenACC routine 'fact_nohost' >> has 'nohost' clause\.$} 1 oaccloops { target { c && offloading_enabled } } } >> } >> { dg-final { scan-tree-dump-times {(?n)^OpenACC routine >> 'fact_nohost\(int\)' has 'nohost' clause\.$} 1 oaccloops { target { c++ && >> offloading_enabled } } } } >> TODO See PR101551 for 'offloading_enabled' differences. */ OK to push the attached "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Revert 'gimple_fold_builtin_acc_on_device' change"? Grüße Thomas >From d4cf1d795a70b35082ec33315efe9e49fa6b0cbf Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 14 Oct 2024 10:45:06 +0200 Subject: [PATCH] Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Revert 'gimple_fold_builtin_acc_on_device' change The motivation of the 'gimple_fold_builtin_acc_on_device' change in commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11 "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device" is unclear, and it unnecessarily diverges GCC's (default) '--disable-offload-targets' vs. '--enable-offload-targets=[...]' configurations. PR testsuite/82250 gcc/ * gimple-fold.cc (gimple_fold_builtin_acc_on_device): Revert last change. libgomp/ * testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Revert last change. --- gcc/gimple-fold.cc | 2 +- libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 9a84483f9bf..942de7720fd 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -4190,7 +4190,7 @@ static bool gimple_fold_builtin_acc_on_device (gimple_stmt_iterator *gsi, tree arg0) { /* Defer folding until we know which compiler we're in. */ - if (ENABLE_OFFLOADING && symtab->state != EXPANSION) + if (symtab->state != EXPANSION) ret
Re: [Ping*4, Patch, Fortran, 77871, v1] Allow for class typed coarray parameter as dummy [PR77871]
Ping ^ 4. Really no one to review this 160 something patch? Regtests ok on x86_64-pc-linux-gnu /Fedora 39? Ok for mainline? - Andre On Mon, 7 Oct 2024 12:52:29 +0200 Andre Vehreschild wrote: > Hi all, > > this patch somehow slipped my attention. Anyone for a review? Third time ping! > > Rebased to current mainline. > > Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline? > > Regards, > Andre > > On Wed, 18 Sep 2024 12:30:23 +0200 > Andre Vehreschild wrote: > > > Hi all, > > > > back from my holidays and still no review. PING PING! > > > > Rebased to current mainline. > > > > Regtested ok on x86_64-pc-linux-gnu / F39. Ok for mainline? > > > > Regards, > > Andre > > > > On Wed, 21 Aug 2024 13:43:52 +0200 > > Andre Vehreschild wrote: > > > > > Hi all, > > > > > > pinging this patch for the first time. > > > > > > Rebased and regtested ok on x86_64-pc-linux-gnu / Fedora 39. Ok for > > > mainline? > > > > > > - Andre > > > > > > On Thu, 15 Aug 2024 14:39:25 +0200 > > > Andre Vehreschild wrote: > > > > > > > Hi all, > > > > > > > > attached patch fixes another regression on coarrays. This time for class > > > > typed coarrays as dummys. > > > > > > > > Regtested ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline? > > > > > > > > Regards, > > > > Andre > > > > -- > > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > > > > -- > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > -- > > Andre Vehreschild * Email: vehre ad gmx dot de > > > -- > Andre Vehreschild * Email: vehre ad gmx dot de -- Andre Vehreschild * Email: vehre ad gmx dot de From 48e77542f0e3342c5da31ecce1b229fa3fbbdaa2 Mon Sep 17 00:00:00 2001 From: Andre Vehreschild Date: Thu, 15 Aug 2024 13:49:49 +0200 Subject: [PATCH] [Fortran] Allow for class type coarray parameters. [PR77871] gcc/fortran/ChangeLog: PR fortran/77871 * trans-expr.cc (gfc_conv_derived_to_class): Assign token when converting a coarray to class. (gfc_get_tree_for_caf_expr): For classes get the caf decl from the saved descriptor. (gfc_get_caf_token_offset):Assert that coarray=lib is set and cover more cases where the tree having the coarray token can be. * trans-intrinsic.cc (gfc_conv_intrinsic_caf_get): Use unified test for pointers. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/dummy_3.f90: New test. --- gcc/fortran/trans-expr.cc | 36 --- gcc/fortran/trans-intrinsic.cc| 2 +- gcc/testsuite/gfortran.dg/coarray/dummy_3.f90 | 33 + 3 files changed, 58 insertions(+), 13 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/coarray/dummy_3.f90 diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 9f223a1314a..4065ea2a735 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -810,6 +810,16 @@ gfc_conv_derived_to_class (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym, /* Now set the data field. */ ctree = gfc_class_data_get (var); + if (flag_coarray == GFC_FCOARRAY_LIB && CLASS_DATA (fsym)->attr.codimension) +{ + tree token; + tmp = gfc_get_tree_for_caf_expr (e); + if (POINTER_TYPE_P (TREE_TYPE (tmp))) + tmp = build_fold_indirect_ref (tmp); + gfc_get_caf_token_offset (parmse, &token, nullptr, tmp, NULL_TREE, e); + gfc_add_modify (&parmse->pre, gfc_conv_descriptor_token (ctree), token); +} + if (optional) cond_optional = gfc_conv_expr_present (e->symtree->n.sym); @@ -2344,6 +2354,10 @@ gfc_get_tree_for_caf_expr (gfc_expr *expr) if (expr->symtree->n.sym->ts.type == BT_CLASS) { + if (DECL_P (caf_decl) && DECL_LANG_SPECIFIC (caf_decl) + && GFC_DECL_SAVED_DESCRIPTOR (caf_decl)) + caf_decl = GFC_DECL_SAVED_DESCRIPTOR (caf_decl); + if (expr->ref && expr->ref->type == REF_ARRAY) { caf_decl = gfc_class_data_get (caf_decl); @@ -2408,16 +2422,12 @@ gfc_get_caf_token_offset (gfc_se *se, tree *token, tree *offset, tree caf_decl, { tree tmp; + gcc_assert (flag_coarray == GFC_FCOARRAY_LIB); + /* Coarray token. */ if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (caf_decl))) -{ - gcc_assert (GFC_TYPE_ARRAY_AKIND (TREE_TYPE (caf_decl)) - == GFC_ARRAY_ALLOCATABLE - || expr->symtree->n.sym->attr.select_type_temporary - || expr->symtree->n.sym->assoc); *token = gfc_conv_descriptor_token (caf_decl); -} - else if (DECL_LANG_SPECIFIC (caf_decl) + else if (DECL_P (caf_decl) && DECL_LANG_SPECIFIC (caf_decl) && GFC_DECL_TOKEN (caf_decl) != NULL_TREE) *token = GFC_DECL_TOKEN (caf_decl); else @@ -2435,7 +2445,7 @@ gfc_get_caf_token_offset (gfc_se *se, tree *token, tree *offset, tree caf_decl, && (GFC_TYPE_ARRAY_AKIND (TREE_TYPE (caf_decl)) == GFC_ARRAY_ALLOCATABLE || GFC_TYPE_ARRAY_AKIND (TREE_TYPE (caf_decl)) == GFC_ARRAY_POINTER)) *offset = build_int_cst (gfc_array_index_type, 0); - else if (DECL_LANG_SPECIFIC (caf_decl) + else if (DECL_P (caf_decl) && DECL_LANG_SPECIFIC (caf
Re: [Patch] Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device
Hi Tobias! On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: > Now pushed as r15-4298-g3269a722b7a036. > Tobias Burnus wrote: >> Anyone feeling like reviewing this patch? Yes. But please allow for more than 1 1/2 work days. >> Tobias Burnus write: >>> Tobias Burnus wrote: Sometimes waiting a bit leads to better code … Tobias Burnus wrote: > ... > [I guess, we eventually want to add support for more builtins. For > instance, acc_on_device would be a candidate, but I could imagine > some additional builtins.] I have now implemented acc_on_device and I think the new fix-up function is way is nicer. Thanks for looking into this! Thus, this patch does: I wonder why you didn't make these several orthogonal changes into several separate patches? * (v1) Fix omp_is_initial_device → do only replace when used in calls (and not when used as function pointer/actual to a dummy function) + fix ICE due to integer(4) != logical(4) in the middle end. No. 1. * (new) For OpenACC, use a builtin for acc_on_device + actually do compile-time optimization when offloading is not configured. No. 2. This resolved PR82250 "Fortran OpenACC acc_on_device early folding", right? (..., which you recently had duplicated as PR116269 "[OpenACC] acc_on_device – compile-time optimization fails", right?) Please: git mv gfortran.dg/goacc/acc_on_device-2{-off,_-fno-openacc}.f95 ..., and add a 's%-fno-openacc%-fno-builtin-acc_on_device' variant. Hmm, why can't 'gfortran.dg/goacc/acc_on_device-2.f95' be un-XFAILed? * (new) libgomp.texi: Typo fixes accumulated, fix wording No. 3. and for acc_on_device, add a note that compile-time folding may be done (and how it can be disabled). Into No. 2. For OpenACC, I now mix compile time folding vs. runtime to ensure that it works. No. 4. And this: --- a/gcc/fortran/types.def +++ b/gcc/fortran/types.def -DEF_PRIMITIVE_TYPE (BT_BOOL, - (*lang_hooks.types.type_for_size) (BOOL_TYPE_SIZE, 1)) +DEF_PRIMITIVE_TYPE (BT_BOOL, boolean_type_node) ... is yet another unrelated change? No. 5. Tested on x86-64 without and with offloading configured, running with nvptx offloading. PS: The testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c example is not completely clear to me; however, the new optimization causes that without offloading enabled, the dump message is not shown. I tried to understand it better with -fno-builtin-acc_on_device, but that then caused link errors as the device function wasn't optimizated away, leaving me puzzled. — At the end, I just changed the dg-* and did not try to understand the issue. Why then not wait for someone else to help look into that? :-) On 2024-10-10T10:31:13+0200, Tobias Burnus wrote: > Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' > __builtin_is_initial_device Missing 'omp_' in '__builtin_[omp_]is_initial_device'. I just received a SIGKID; to be continued in follow-on emails. Grüße Thomas > It turned out that 'if (omp_is_initial_device() .eqv. true)' gave an ICE > due to comparing 'int' with 'logical(4)'. When digging deeper, it also > turned out that when the procedure pointer is needed, the builtin cannot > be used, either. (Follow up to r15-2799-gf1bfba3a9b3f31 ) > Extend the code to also use the builtin acc_on_device with OpenACC, > which was previously only used in C/C++. Additionally, fix folding > when offloading is not enabled. > > Fixes additionally the BT_BOOL data type, which was 'char'/integer(1) > instead of bool, backing the booleaness; use bool_type_node as the rest > of GCC. > > gcc/fortran/ChangeLog: > > * gfortran.h (gfc_option_t): Add disable_acc_on_device. > * options.cc (gfc_handle_option): Handle -fno-builtin-acc_on_device. > * trans-decl.cc (gfc_get_extern_function_decl): Move > __builtin_omp_is_initial_device handling to ... > * trans-expr.cc (get_builtin_fn): ... this new function. > (conv_function_val): Call it. > (update_builtin_function): New. > (gfc_conv_procedure_call): Call it. > * types.def (BT_BOOL): Fix type by using bool_type_node. > > gcc/ChangeLog: > > * gimple-fold.cc (gimple_fold_builtin_acc_on_device): Also fold > when offloading is not configured. > > libgomp/ChangeLog: > > * libgomp.texi (TR13): Fix minor typos. > (omp_is_initial_device): Improve wording. > (acc_on_device): Note how to disable the builtin. > * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Remove TODO. > * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. > Add -fno-builtin-acc_on_device. > * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise. > * testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Update > dg- as !offloading_enabled now compile-time e
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Fix effective-target keyword in 'libgomp.oacc-fortran/acc_on_device-2.f90'
Hi! On 2024-10-14T10:23:56+0200, I wrote: > On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. > Tested on x86-64 without and with offloading configured, running > with nvptx offloading. I see an UNRESOLVED: +PASS: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O scan-tree-dump-not optimized "acc_on_device" +PASS: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O scan-tree-dump-times gimple "acc_on_device" 1 +PASS: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O (test for excess errors) +PASS: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O at line 37 (test for warnings, line 36) +UNRESOLVED: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O scan-nvptx-none-offload-tree-dump-not optimized "acc_on_device" +PASS: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O scan-tree-dump-not optimized "acc_on_device" +PASS: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O scan-tree-dump-times gimple "acc_on_device" 1 +PASS: libgomp.oacc-fortran/acc_on_device-2.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O (test for excess errors) >> --- /dev/null >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-2.f90 >> @@ -0,0 +1,40 @@ >> +! { dg-do link } >> + >> +! Check whether 'acc_on_device()' is properly compile-time optimized. */ >> + >> +! { dg-additional-options "-fdump-tree-gimple -fdump-tree-optimized" } >> +! { dg-additional-options -foffload-options=-fdump-tree-optimized { target >> { offload_device_nvptx || offload_target_amdgcn } } } >> + >> +! { dg-final { scan-tree-dump-times "acc_on_device" 1 "gimple" } } >> + >> +! { dg-final { scan-tree-dump-not "acc_on_device" "optimized" } } >> + >> +! { dg-final { only_for_offload_target amdgcn-amdhsa >> scan-offload-tree-dump-not "acc_on_device" "optimized" { target >> offload_target_amdgcn } } } >> +! { dg-final { only_for_offload_target nvptx-none >> scan-offload-tree-dump-not "acc_on_device" "optimized" { target >> offload_target_nvptx } } } Pushed to trunk branch commit c3774b2e2d7d00ad9f9f6fce10aa6bc872bd951f "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Fix effective-target keyword in 'libgomp.oacc-fortran/acc_on_device-2.f90'", see attached. Grüße Thomas >> + >> + >> +module m >> + integer :: >> + !$acc declare device_resident() >> +contains >> + subroutine set_var >> +!$acc routine >> +use openacc >> +implicit none (type, external) >> +if (acc_on_device(acc_device_host)) then >> + = 1234 >> +else >> + = 4242 >> +end if >> + end >> +end module m >> + >> + >> +program main >> + use m >> + call set_var >> + !$acc serial >> +! { dg-warning "using 'vector_length \\(32\\)', ignoring 1" "" { target >> openacc_nvidia_accel_selected } .-1 } >> +call set_var >> + !$acc end serial >> +end >From c3774b2e2d7d00ad9f9f6fce10aa6bc872bd951f Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 14 Oct 2024 10:26:13 +0200 Subject: [PATCH] Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Fix effective-target keyword in 'libgomp.oacc-fortran/acc_on_device-2.f90' The test case 'libgomp.oacc-fortran/acc_on_device-2.f90' added in commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11 "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device" had a mismatch between dump file production and its scanning; the former needs to use 'offload_target_nvptx' (like 'offload_target_amdgcn'), not 'offload_device_nvptx'. PR testsuite/82250 libgomp/ * testsuite/libgomp.oacc-fortran/acc_on_device-2.f90: Fix effective-target keyword. --- libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-2.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-2.f90 index 39d4357dd55..306555cfccf 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-2.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-2.f90 @@ -3,7 +3,7 @@ ! Check whether 'acc_on_device()' is properly compile-time optimized. */ ! { dg-additional-options "-fdump-tree-gimple -fdump-tree-optimized" } -! { dg-additional-options -foffload-options=-fdump-tree-optimized { target { offload_device_nvptx || offload_target_amdgcn } } } +! { dg-additional-options -foffload-options=-fdump-tree-optimized { target { offload_target_nvptx || offlo
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Harmonize 'libgomp.oacc-fortran/acc_on_device-1-*'
Hi! On 2024-10-14T10:23:56+0200, I wrote: > On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. >> --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 >> -! TODO: Have to disable the acc_on_device builtin for we want to test the >> -! libgomp library function? The command line option >> -! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not for >> -! Fortran. Here, you've just remove the comment, whereas... >> --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f >> @@ -1,5 +1,5 @@ >> ! { dg-do run } >> -! { dg-additional-options "-cpp" } >> +! { dg-additional-options "-cpp -fno-builtin-acc_on_device" } >> -! TODO: Have to disable the acc_on_device builtin for we want to test >> -! the libgomp library function? The command line option >> -! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not >> -! for Fortran. ... here, and... >> --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f >> @@ -1,5 +1,5 @@ >> ! { dg-do run } >> -! { dg-additional-options "-cpp" } >> +! { dg-additional-options "-cpp -fno-builtin-acc_on_device" } >> -! TODO: Have to disable the acc_on_device builtin for we want to test >> -! the libgomp library function? The command line option >> -! '-fno-builtin-acc_on_device' is valid for C/C++/ObjC/ObjC++ but not >> -! for Fortran. ... here, you also specify '-fno-builtin-acc_on_device'. This should be done in the former, too, and some explanation be added, like in 'libgomp.oacc-c-c++-common/acc_on_device-1.c'. Pushed to trunk branch commit 9f549d216c9716e787aaa38593bc9f83195b60ae "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Harmonize 'libgomp.oacc-fortran/acc_on_device-1-*'", see attached. Grüße Thomas >From 9f549d216c9716e787aaa38593bc9f83195b60ae Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 14 Oct 2024 10:34:34 +0200 Subject: [PATCH] Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Harmonize 'libgomp.oacc-fortran/acc_on_device-1-*' The test case 'libgomp.oacc-fortran/acc_on_device-1-1.f90' added in commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11 "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device" was missing '-fno-builtin-acc_on_device', and all 'libgomp.oacc-fortran/acc_on_device-1-*' need comments, why that option is specified. PR testsuite/82250 libgomp/ * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Add '-fno-builtin-acc_on_device'. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Comment. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Comment. --- libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 | 3 +++ libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f | 5 - libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f | 5 - 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 index 89748204f05..774c2b869e8 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 @@ -1,6 +1,9 @@ ! { dg-do run } ! { dg-additional-options "-cpp" } +! Disable the acc_on_device builtin; we want to test the libgomp library function. +! { dg-additional-options -fno-builtin-acc_on_device } + ! { dg-additional-options "-fopt-info-all-omp" } ! { dg-additional-options "--param=openacc-privatization=noisy" } ! { dg-additional-options "-foffload=-fopt-info-all-omp" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f index e31e0fc715b..b57beac6f43 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f @@ -1,5 +1,8 @@ ! { dg-do run } -! { dg-additional-options "-cpp -fno-builtin-acc_on_device" } +! { dg-additional-options "-cpp" } + +! Disable the acc_on_device builtin; we want to test the libgomp library function. +! { dg-additional-options -fno-builtin-acc_on_device } ! { dg-additional-options "-fopt-info-all-omp" } ! { dg-additional-options "--param=openacc-privatization=noisy" } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f index 0595be241f8..969d530e30f 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f @@ -1,5 +1,8 @@ ! { dg-do run } -! { dg-additional-options "-cpp -fno-builtin-acc_on_device" } +! { dg-additional-options "
Fortran test typebound_operator_7.f03 broken by non-Fortran commit. Confirm anyone?
Hi all, please note, that I don't know this bisecting very well, so this may very well be a wrong blame. During latest regression testing of the Fortran suite I got typebound_operator_7.f03 failing with: typebound_operator_7.f03:94:25: 94 | u = (u*2.0*4.0) + u*4.0 | 1 internal compiler error: tree check: expected function_decl, have indirect_ref in DECL_FUNCTION_CODE, at tree.h:4329 0x3642f3e internal_error(char const*, ...) /mnt/work_store/gcc/gcc.test/gcc/diagnostic-global-context.cc:517 0x1c0a703 tree_check_failed(tree_node const*, char const*, int, char const*, ...) /mnt/work_store/gcc/gcc.test/gcc/tree.cc:9003 0xeb9150 tree_check(tree_node const*, char const*, int, char const*, tree_code) /mnt/work_store/gcc/gcc.test/gcc/tree.h:3921 0xf5725b DECL_FUNCTION_CODE(tree_node const*) /mnt/work_store/gcc/gcc.test/gcc/tree.h:4329 0xf383d6 update_builtin_function /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:4405 0xf468b9 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, gfc_expr*, vec*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8236 0xf48b0f gfc_conv_function_expr /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8815 0xf4ceda gfc_conv_expr(gfc_se*, gfc_expr*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:9982 0xf40777 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, gfc_expr*, vec*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:6816 0xf48b0f gfc_conv_function_expr /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:8815 0xf4ceda gfc_conv_expr(gfc_se*, gfc_expr*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:9982 0xf40777 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, gfc_expr*, vec*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-expr.cc:6816 0xfb580a gfc_trans_call(gfc_code*, bool, tree_node*, tree_node*, bool) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-stmt.cc:425 0xed9363 trans_code /mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2434 0xed97d5 gfc_trans_code(gfc_code*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2713 0xf26342 gfc_generate_function_code(gfc_namespace*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans-decl.cc:7958 0xed9819 gfc_generate_code(gfc_namespace*) /mnt/work_store/gcc/gcc.test/gcc/fortran/trans.cc:2730 0xe544ee translate_all_program_units /mnt/work_store/gcc/gcc.test/gcc/fortran/parse.cc:7156 0xe54e23 gfc_parse_file() /mnt/work_store/gcc/gcc.test/gcc/fortran/parse.cc:7473 0xebf7ce gfc_be_parse_file /mnt/work_store/gcc/gcc.test/gcc/fortran/f95-lang.cc:241 Checking with git bisect this lead me to: d8ef4471cb9c9f86784b62424a215ea42173bfe1 being the last commit the test passed and 03623fa91ff36ecb9faa3b55f7842a39b759594e libstdc++: Use std::move for iterator in ranges::fill [PR117094] failing the test to pass. Can anyone confirm? I might be doing something wrong here, so please be patient and explain, what I miss. Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: [Ping*4, Patch, Fortran, 77871, v1] Allow for class typed coarray parameter as dummy [PR77871]
Hi Paul, thank you for the review. Committed as: gcc-15-4329-gfd1a2f63bca No problem. I will persist :-) - Regards, Andre On Mon, 14 Oct 2024 11:27:58 +0100 Paul Richard Thomas wrote: > Hi Andre, > > This looks fine to me. OK for mainline. > > Thanks for the patch and sorry for the wait for review. > > Paul > > > On Mon, 14 Oct 2024 at 08:50, Andre Vehreschild wrote: > > > Ping ^ 4. > > > > Really no one to review this 160 something patch? > > > > Regtests ok on x86_64-pc-linux-gnu /Fedora 39? Ok for mainline? > > > > - Andre > > > > On Mon, 7 Oct 2024 12:52:29 +0200 > > Andre Vehreschild wrote: > > > > > Hi all, > > > > > > this patch somehow slipped my attention. Anyone for a review? Third time > > ping! > > > > > > Rebased to current mainline. > > > > > > Regtests ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline? > > > > > > Regards, > > > Andre > > > > > > On Wed, 18 Sep 2024 12:30:23 +0200 > > > Andre Vehreschild wrote: > > > > > > > Hi all, > > > > > > > > back from my holidays and still no review. PING PING! > > > > > > > > Rebased to current mainline. > > > > > > > > Regtested ok on x86_64-pc-linux-gnu / F39. Ok for mainline? > > > > > > > > Regards, > > > > Andre > > > > > > > > On Wed, 21 Aug 2024 13:43:52 +0200 > > > > Andre Vehreschild wrote: > > > > > > > > > Hi all, > > > > > > > > > > pinging this patch for the first time. > > > > > > > > > > Rebased and regtested ok on x86_64-pc-linux-gnu / Fedora 39. Ok for > > > > > mainline? > > > > > > > > > > - Andre > > > > > > > > > > On Thu, 15 Aug 2024 14:39:25 +0200 > > > > > Andre Vehreschild wrote: > > > > > > > > > > > Hi all, > > > > > > > > > > > > attached patch fixes another regression on coarrays. This time for > > class > > > > > > typed coarrays as dummys. > > > > > > > > > > > > Regtested ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline? > > > > > > > > > > > > Regards, > > > > > > Andre > > > > > > -- > > > > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > > > > > > > > > > -- > > > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > > > > > > > -- > > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > > > > -- > > > Andre Vehreschild * Email: vehre ad gmx dot de > > > > > > -- > > Andre Vehreschild * Email: vehre ad gmx dot de > > -- Andre Vehreschild * Email: vehre ad gmx dot de
OpenACC 'nohost' clause: harmonize 'libgomp.oacc-{c-c++-common,fortran}/routine-nohost-1.*'
Hi! On 2021-07-22T00:20:13+0200, I wrote: > [...], I've now pushed "OpenACC 'nohost' clause" to > master branch in commit a61f6afbee370785cf091fe46e2e022748528307, [...] Via Tobias' recent commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11 "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device", I remembered this thing from three years ago: > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c > @@ -0,0 +1,63 @@ > +/* Test 'nohost' clause via 'acc_on_device'. > + > + With optimizations disabled, we currently don't expect that > 'acc_on_device' "evaluates at compile time to a constant". > + { dg-skip-if "TODO PR82391" { *-*-* } { "-O0" } } > +*/ > + > +/* { dg-additional-options "-fdump-tree-oaccdevlow" } */ > + > +/* { dg-additional-options "-fno-inline" } for stable results regarding > OpenACC 'routine'. */ > +[...] Here we do specify '-fno-inline'... > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90 > @@ -0,0 +1,63 @@ > +! Test 'nohost' clause via 'acc_on_device'. > + > +! { dg-do run } > + > +! With optimizations disabled, we currently don't expect that > 'acc_on_device' "evaluates at compile time to a constant". > +! { dg-skip-if "TODO PR82391" { *-*-* } { "-O0" } } > + > +! { dg-additional-options "-fdump-tree-oaccdevlow" } > + > +program main > +[...] ..., but here we didn't. To address that, I've pushed to trunk branch commit de0320712d026a2d1eeb57aef277fa5a91808ac2 (HEAD, upstream/trunk) "OpenACC 'nohost' clause: harmonize 'libgomp.oacc-{c-c++-common,fortran}/routine-nohost-1.*'", see attached. Grüße Thomas >From de0320712d026a2d1eeb57aef277fa5a91808ac2 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 14 Oct 2024 14:38:13 +0200 Subject: [PATCH] OpenACC 'nohost' clause: harmonize 'libgomp.oacc-{c-c++-common,fortran}/routine-nohost-1.*' The test case 'libgomp.oacc-fortran/routine-nohost-1.f90' added in 2021 commit a61f6afbee370785cf091fe46e2e022748528307 "OpenACC 'nohost' clause" was dependend on inlining being enabled, and otherwise ('-fno-inline') failed to optimize/link: /tmp/ccb2hsPd.o: In function `MAIN__._omp_fn.0': routine-nohost-1.f90:(.text+0xf4): undefined reference to `fact_nohost_' However, as of recent commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11 "Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device", we're now properly handling OpenACC/Fortran 'acc_on_device', and may specify '-fno-inline', like done in 'libgomp.oacc-c-c++-common/routine-nohost-1.c'. libgomp/ * testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Add '-fno-inline'. --- libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90 index b0537b8ff0b..e5f3e5740da 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90 @@ -7,6 +7,8 @@ ! { dg-additional-options "-fdump-tree-oaccloops" } +! { dg-additional-options "-fno-inline" } for stable results regarding OpenACC 'routine'. + program main use openacc implicit none -- 2.34.1
[PATCH 1/7] fortran: Add tests covering inline MINLOC/MAXLOC with DIM [PR90608]
From: Mikael Morin Checked on x86_64-pc-linux-gnu. OK for master? -- >8 -- Add the tests covering the cases for which the following patches will implement inline expansion of MINLOC and MAXLOC. Those are cases where the DIM argument is a constant value, and the ARRAY argument has rank greater than 1. PR fortran/90608 gcc/testsuite/ChangeLog: * gfortran.dg/ieee/maxloc_nan_2.f90: New test. * gfortran.dg/ieee/minloc_nan_2.f90: New test. * gfortran.dg/maxloc_with_dim_1.f90: New test. * gfortran.dg/maxloc_with_dim_and_mask_1.f90: New test. * gfortran.dg/minloc_with_dim_1.f90: New test. * gfortran.dg/minloc_with_dim_and_mask_1.f90: New test. --- .../gfortran.dg/ieee/maxloc_nan_2.f90 | 64 +++ .../gfortran.dg/ieee/minloc_nan_2.f90 | 64 +++ .../gfortran.dg/maxloc_with_dim_1.f90 | 201 .../maxloc_with_dim_and_mask_1.f90| 452 ++ .../gfortran.dg/minloc_with_dim_1.f90 | 201 .../minloc_with_dim_and_mask_1.f90| 452 ++ 6 files changed, 1434 insertions(+) create mode 100644 gcc/testsuite/gfortran.dg/ieee/maxloc_nan_2.f90 create mode 100644 gcc/testsuite/gfortran.dg/ieee/minloc_nan_2.f90 create mode 100644 gcc/testsuite/gfortran.dg/maxloc_with_dim_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/maxloc_with_dim_and_mask_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_with_dim_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_with_dim_and_mask_1.f90 diff --git a/gcc/testsuite/gfortran.dg/ieee/maxloc_nan_2.f90 b/gcc/testsuite/gfortran.dg/ieee/maxloc_nan_2.f90 new file mode 100644 index 000..4d9b8707362 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/ieee/maxloc_nan_2.f90 @@ -0,0 +1,64 @@ +! { dg-do run } +! +! PR fortran/90608 +! Check the correct behaviour of the inline maxloc implementation, +! when the dim argument is present. + +program p + implicit none + call check_without_mask + call check_with_mask +contains + subroutine check_without_mask() +use, intrinsic :: ieee_arithmetic +real, allocatable :: a(:,:,:) +real :: nan +integer, allocatable :: r(:,:) +if (.not. ieee_support_nan(nan)) return +nan = ieee_value(nan, ieee_quiet_nan) +allocate(a(3,4,5), source = nan) +r = maxloc(a, dim=1) +if (any(shape(r) /= (/ 4, 5 /))) error stop 21 +if (any(r /= 1)) error stop 22 +r = maxloc(a, dim=2) +if (any(shape(r) /= (/ 3, 5 /))) error stop 23 +if (any(r /= 1)) error stop 24 +r = maxloc(a, dim=3) +if (any(shape(r) /= (/ 3, 4 /))) error stop 25 +if (any(r /= 1)) error stop 26 + end subroutine + subroutine check_with_mask() +use, intrinsic :: ieee_arithmetic +real, allocatable :: a(:,:,:) +logical, allocatable :: m(:,:,:) +real :: nan +integer, allocatable :: r(:,:) +if (.not. ieee_support_nan(nan)) return +nan = ieee_value(nan, ieee_quiet_nan) +allocate(a(2,3,4), source = nan) +allocate(m(2,3,4)) +m(:,:,:) = reshape((/ .false., .false., .true. , .true. , & + .false., .true. , .false., .false., & + .false., .true. , .true. , .false., & + .true. , .true. , .true. , .false., & + .false., .true. , .true. , .false., & + .false., .true. , .false., .false. /), shape(m)) +r = maxloc(a, dim = 1, mask = m) +if (any(shape(r) /= (/ 3, 4 /))) error stop 51 +if (any(r /= reshape((/ 0, 1, 2, & +0, 2, 1, & +1, 1, 2, & +1, 2, 0 /), (/ 3, 4 / error stop 52 +r = maxloc(a, dim = 2, mask = m) +if (any(shape(r) /= (/ 2, 4 /))) error stop 53 +if (any(r /= reshape((/ 2, 2, & +3, 2, & +1, 1, & +1, 2 /), (/ 2, 4 / error stop 54 +r = maxloc(a, dim = 3, mask = m) +if (any(shape(r) /= (/ 2, 3 /))) error stop 55 +if (any(r /= reshape((/ 3, 3, & +1, 1, & +2, 1 /), (/ 2, 3 / error stop 56 + end subroutine +end program p diff --git a/gcc/testsuite/gfortran.dg/ieee/minloc_nan_2.f90 b/gcc/testsuite/gfortran.dg/ieee/minloc_nan_2.f90 new file mode 100644 index 000..37c097a7acb --- /dev/null +++ b/gcc/testsuite/gfortran.dg/ieee/minloc_nan_2.f90 @@ -0,0 +1,64 @@ +! { dg-do run } +! +! PR fortran/90608 +! Check the correct behaviour of the inline minloc implementation, +! when the dim argument is present. + +program p + implicit none + call check_without_mask + call check_with_mask +contains + subroutine check_without_mask() +use, intrinsic :: ieee_arithmetic +real, allocatable :: a(:,:,:) +real :: nan +integer, allocatable :: r(:,:) +if (.not. ieee_support_nan(nan)) return +nan = ieee_value(nan, ieee_quiet_nan)
[PATCH 2/7] fortran: Inline unmasked integral MINLOC/MAXLOC with DIM [PR90608]
From: Mikael Morin Bootstrapped and regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Enable generation of inline code for the MINLOC and MAXLOC intrinsics, if the ARRAY argument is of integral type and of rank > 1 (only the rank 1 case was previously inlined), the DIM argument is a constant value and there is no MASK argument. The restriction to integral ARRAY and absent MASK limits the scope of the change to the cases where we generate single loop inline code. This change uses the existing scalarizer suport for reductions, that is arrays used in scalarization loops, where each element uses a nested scalarization loop to calculate its value. The nested loop (and respectively the nested scalarization chain) is created while walking the MINLOC/MAXLOC expression, it's set up automatically at the time the outer loop is set up, and gfc_conv_intrinsic_minmaxloc is changed to use it as a replacement for the local loop variable (respectively ARRAY scalarization chain) used in the non-reduction case (i.e. when DIM is absent). PR fortran/90608 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_inline_intrinsic_function_p): Return true if DIM is constant, ARRAY is integral and MASK is absent. (walk_inline_intrinsic_minmaxloc): If DIM is present, walk ARRAY and move the dimension corresponding to DIM to a nested chain, keeping the rest of the dimensions as the returned scalarization chain. (gfc_conv_intrinsic_minmaxloc): When inside the scalarization loops, proceed with inline code generation If DIM is present. If DIM is present, skip result array creation and final initialization from individual result local variables. If DIM is present and ARRAY has rank greater than 1, use the nested loop initialized by the scalarizer instead of the local one, use 1 as scalarization dimension, and evaluate ARRAY using the inherited scalarization chain instead of creating a local one by walking the expression. gcc/testsuite/ChangeLog: * gfortran.dg/maxloc_bounds_1.f90: Also accept the error message generated by the scalarizer in case the function call is implemented through inline code. * gfortran.dg/maxloc_bounds_2.f90: Likewise. * gfortran.dg/maxloc_bounds_3.f90: Likewise. * gfortran.dg/minmaxloc_19.f90: New test. --- gcc/fortran/trans-intrinsic.cc| 229 -- gcc/testsuite/gfortran.dg/maxloc_bounds_1.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_2.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_3.f90 | 4 +- gcc/testsuite/gfortran.dg/minmaxloc_19.f90| 182 ++ 5 files changed, 344 insertions(+), 79 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_19.f90 diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index a282ae1c090..e44a245ec75 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5472,12 +5472,14 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) tree lab1, lab2; tree b_if, b_else; tree back; - gfc_loopinfo loop; - gfc_actual_arglist *actual; - gfc_ss *arrayss; - gfc_ss *maskss; + gfc_loopinfo loop, *ploop; + gfc_actual_arglist *actual, *array_arg, *dim_arg, *mask_arg, *kind_arg; + gfc_actual_arglist *back_arg; + gfc_ss *arrayss = nullptr; + gfc_ss *maskss = nullptr; gfc_se arrayse; gfc_se maskse; + gfc_se *base_se; gfc_expr *arrayexpr; gfc_expr *maskexpr; gfc_expr *backexpr; @@ -5489,6 +5491,14 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) bool optional_mask; actual = expr->value.function.actual; + array_arg = actual; + dim_arg = array_arg->next; + mask_arg = dim_arg->next; + kind_arg = mask_arg->next; + back_arg = kind_arg->next; + + bool dim_present = dim_arg->expr != nullptr; + bool nested_loop = dim_present && expr->rank > 0; /* The last argument, BACK, is passed by value. Ensure that by setting its name to %VAL. */ @@ -5502,11 +5512,15 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) { if (se->ss->info->useflags) { - /* The inline implementation of MINLOC/MAXLOC has been generated -before, out of the scalarization loop; now we can just use the -result. */ - gfc_conv_tmp_array_ref (se); - return; + if (!dim_present || !gfc_inline_intrinsic_function_p (expr)) + { + /* The code generating and initializing the result array has been +generated already before the scalarization loop, either with a +library function call or with inline code; now we can just use +the result. */ + gfc_conv_tmp_array_ref (se); + return; + } } else if (!gfc_inline_intrin
[PATCH 0/7] fortran: Inline MINLOC/MAXLOC with DIM [PR90608]
From: Mikael Morin Hello, this is the second (and last) series of patches to inline MINLOC and MAXLOC. The previous series added support for inlining without DIM. This one focuses on the cases where the DIM argument is present (and is a constant), using the existing support for reduction functions in the scalarizer. The changes involve the creation of a nested loop at expression walking time, and its usage in gfc_conv_intrinsic_minmaxloc as a replacement of the local loop variable used when DIM is absent, the existing scalarizer code taking care of its full setup in between. The series is split similarly to the previous one, with the unmasked integral cases being inlined first, followed by scalar MASK cases, and finally REAL and masked cases. Mikael Morin (7): fortran: Add tests covering inline MINLOC/MAXLOC with DIM [PR90608] fortran: Inline unmasked integral MINLOC/MAXLOC with DIM [PR90608] fortran: Inline MINLOC/MAXLOC with DIM and scalar MASK [PR90608] fortran: Check MASK directly instead of its scalarization chain fortran: Inline non-character MINLOC/MAXLOC with DIM [PR90608] fortran: Check for empty MINLOC/MAXLOC ARRAY along DIM only fortran: Evaluate once BACK argument of MINLOC/MAXLOC with DIM [PR90608] gcc/fortran/trans-intrinsic.cc| 362 +++ .../gfortran.dg/ieee/maxloc_nan_2.f90 | 64 ++ .../gfortran.dg/ieee/minloc_nan_2.f90 | 64 ++ gcc/testsuite/gfortran.dg/maxloc_8.f90| 349 +++ gcc/testsuite/gfortran.dg/maxloc_bounds_1.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_2.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_3.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_8.f90 | 4 +- .../gfortran.dg/maxloc_with_dim_1.f90 | 201 ++ .../maxloc_with_dim_and_mask_1.f90| 452 ++ gcc/testsuite/gfortran.dg/minloc_9.f90| 349 +++ .../gfortran.dg/minloc_with_dim_1.f90 | 201 ++ .../minloc_with_dim_and_mask_1.f90| 452 ++ gcc/testsuite/gfortran.dg/minmaxloc_19.f90| 182 ++ gcc/testsuite/gfortran.dg/minmaxloc_20.f90| 182 ++ gcc/testsuite/gfortran.dg/minmaxloc_21.f90| 572 ++ gcc/testsuite/gfortran.dg/minmaxloc_22.f90| 26 + 17 files changed, 3354 insertions(+), 118 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/ieee/maxloc_nan_2.f90 create mode 100644 gcc/testsuite/gfortran.dg/ieee/minloc_nan_2.f90 create mode 100644 gcc/testsuite/gfortran.dg/maxloc_8.f90 create mode 100644 gcc/testsuite/gfortran.dg/maxloc_with_dim_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/maxloc_with_dim_and_mask_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_9.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_with_dim_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_with_dim_and_mask_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_19.f90 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_20.f90 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_21.f90 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_22.f90 -- 2.45.2
[PATCH 5/7] fortran: Inline non-character MINLOC/MAXLOC with DIM [PR90608]
From: Mikael Morin Bootstrapped and regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Enable generation of inline MINLOC/MAXLOC code in the cases where DIM is a constant, and either ARRAY is of REAL type or MASK is an array. Those cases are the remaining bits to fully support inlining of non-CHARACTER MINLOC/MAXLOC with constant DIM. They are treated together because they generate similar code, the NANs for REAL types being handled a bit like a second level of masking. These are the cases for which we generate two loops. This change affects the code generating the second loop, that was previously accessible only in cases ARRAY had rank 1. The main changes are in gfc_conv_intrinsic_minmaxloc the replacement of the locally initialized scalarization loop with the one provided and previously initialized by the scalarizer. Same goes for the locally initialized MASK scalarizer chain. As this is enabling the code generating a second loop in a context of reduction and nested loops, care is taken not to advance the parent scalarization chain twice. The scalarization chain element(s) for an array MASK are inserted in the chain at a different place from that of a scalar MASK. This is done on purpose to match the code consuming the chains which are in different places for scalar and array MASK. PR fortran/90608 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_inline_intrinsic_function_p): Return TRUE for MINLOC/MAXLOC with constant DIM and either REAL ARRAY or non-scalar MASK. (walk_inline_intrinsic_minmaxloc): Walk MASK and if it's an array add the chain obtained before that of ARRAY. (gfc_conv_intrinsic_minmaxloc): Use the nested loop if there is one. To evaluate MASK (respectively ARRAY in the second loop), inherit the scalarizer chain if in a nested loop, otherwise keep using the chain obtained by walking MASK (respectively ARRAY). If there is a nested loop, avoid advancing the parent scalarization chain a second time in the second loop. gcc/testsuite/ChangeLog: * gfortran.dg/minmaxloc_21.f90: New test. --- gcc/fortran/trans-intrinsic.cc | 96 ++-- gcc/testsuite/gfortran.dg/minmaxloc_21.f90 | 572 + 2 files changed, 625 insertions(+), 43 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_21.f90 diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index 0ac3d7fe3a1..b77183ab11e 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5477,6 +5477,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_actual_arglist *back_arg; gfc_ss *arrayss = nullptr; gfc_ss *maskss = nullptr; + gfc_ss *orig_ss = nullptr; gfc_se arrayse; gfc_se maskse; gfc_se nested_se; @@ -5711,6 +5712,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) if (nested_loop) { ploop = enter_nested_loop (&nested_se); + orig_ss = nested_se.ss; ploop->temp_dim = 1; } else @@ -5785,9 +5787,8 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) } else { - gcc_assert (!nested_loop); - for (int i = 0; i < loop.dimen; i++) - gfc_add_modify (&loop.pre, pos[i], gfc_index_zero_node); + for (int i = 0; i < ploop->dimen; i++) + gfc_add_modify (&ploop->pre, pos[i], gfc_index_zero_node); lab1 = gfc_build_label_decl (NULL_TREE); TREE_USED (lab1) = 1; lab2 = gfc_build_label_decl (NULL_TREE); @@ -5818,10 +5819,10 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) /* If we have a mask, only check this element if the mask is set. */ if (maskexpr && maskexpr->rank > 0) { - gcc_assert (!nested_loop); - gfc_init_se (&maskse, NULL); - gfc_copy_loopinfo_to_se (&maskse, &loop); - maskse.ss = maskss; + gfc_init_se (&maskse, base_se); + gfc_copy_loopinfo_to_se (&maskse, ploop); + if (!nested_loop) + maskse.ss = maskss; gfc_conv_expr_val (&maskse, maskexpr); gfc_add_block_to_block (&body, &maskse.pre); @@ -5849,13 +5850,11 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) stmtblock_t ifblock2; tree ifbody2; - gcc_assert (!nested_loop); - gfc_start_block (&ifblock2); - for (int i = 0; i < loop.dimen; i++) + for (int i = 0; i < ploop->dimen; i++) { tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[i]), -loop.loopvar[i], offset[i]); +ploop->loopvar[i], offset[i]); gfc_add_modify (&ifblock2, pos[i], tmp); } ifbody2 = gfc_finish_block (&ifblock2); @@ -5939,17 +5938,24 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op)
[PATCH 7/7] fortran: Evaluate once BACK argument of MINLOC/MAXLOC with DIM [PR90608]
From: Mikael Morin Bootstrapped and regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Evaluate the BACK argument of MINLOC/MAXLOC once before the scalarization loops in the case where the DIM argument is present. This is a follow-up to r15-1994-ga55d24b3cf7f4d07492bb8e6fcee557175b47ea3 which added knowledge of BACK to the scalarizer, to r15-2701-ga10436a8404ad2f0cc5aa4d6a0cc850abe5ef49e which removed it to handle it out of scalarization instead, and to more immediate previous patches that added inlining support for MINLOC/MAXLOC with DIM. The inlining support for MINLOC/MAXLOC with DIM introduced nested loops, which made the evaluation of BACK (removed from the scalarizer knowledge by the forementionned commit) wrapped in a loop, so possibly executed more than once. This change adds BACK to the scalarization chain if MINLOC/MAXLOC will use nested loops, so that it is evaluated by the scalarizer only once before the outermost loop in that case. PR fortran/90608 gcc/fortran/ChangeLog: * trans-intrinsic.cc (walk_inline_intrinsic_minmaxloc): Add a scalar element for BACK as first item of the chain if BACK is present and there will be nested loops. (gfc_conv_intrinsic_minmaxloc): Evaluate BACK using an inherited scalarization chain if there is a nested loop. gcc/testsuite/ChangeLog: * gfortran.dg/maxloc_8.f90: New test. * gfortran.dg/minloc_9.f90: New test. --- gcc/fortran/trans-intrinsic.cc | 21 +- gcc/testsuite/gfortran.dg/maxloc_8.f90 | 349 + gcc/testsuite/gfortran.dg/minloc_9.f90 | 349 + 3 files changed, 717 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/maxloc_8.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_9.f90 diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index e1a5fdef26c..4cd4f4b1977 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5594,7 +5594,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) && maskexpr->symtree->n.sym->attr.optional; backexpr = back_arg->expr; - gfc_init_se (&backse, NULL); + gfc_init_se (&backse, nested_loop ? se : nullptr); if (backexpr == nullptr) back = logical_false_node; else if (maybe_absent_optional_variable (backexpr)) @@ -11885,10 +11885,13 @@ walk_inline_intrinsic_minmaxloc (gfc_ss *ss, gfc_expr *expr ATTRIBUTE_UNUSED) gfc_actual_arglist *array_arg = expr->value.function.actual; gfc_actual_arglist *dim_arg = array_arg->next; gfc_actual_arglist *mask_arg = dim_arg->next; + gfc_actual_arglist *kind_arg = mask_arg->next; + gfc_actual_arglist *back_arg = kind_arg->next; gfc_expr *array = array_arg->expr; gfc_expr *dim = dim_arg->expr; gfc_expr *mask = mask_arg->expr; + gfc_expr *back = back_arg->expr; if (dim == nullptr) return gfc_get_array_ss (ss, expr, 1, GFC_SS_INTRINSIC); @@ -11916,7 +11919,21 @@ walk_inline_intrinsic_minmaxloc (gfc_ss *ss, gfc_expr *expr ATTRIBUTE_UNUSED) chain, "hiding" that dimension from the outer scalarization. */ int dim_val = mpz_get_si (dim->value.integer); gfc_ss *tail = nest_loop_dimension (tmp_ss, dim_val - 1); - tail->next = ss; + + if (back && array->rank > 1) +{ + /* If there are nested scalarization loops, include BACK in the +scalarization chains to avoid evaluating it multiple times in a loop. +Otherwise, prefer to handle it outside of scalarization. */ + gfc_ss *back_ss = gfc_get_scalar_ss (ss, back); + back_ss->info->type = GFC_SS_REFERENCE; + if (maybe_absent_optional_variable (back)) + back_ss->info->can_be_null_ref = true; + + tail->next = back_ss; +} + else +tail->next = ss; if (scalar_mask) { diff --git a/gcc/testsuite/gfortran.dg/maxloc_8.f90 b/gcc/testsuite/gfortran.dg/maxloc_8.f90 new file mode 100644 index 000..20f63a84bbe --- /dev/null +++ b/gcc/testsuite/gfortran.dg/maxloc_8.f90 @@ -0,0 +1,349 @@ +! { dg-do run } +! +! PR fortran/90608 +! Check that the evaluation of MAXLOC's BACK argument is made only once +! before the scalarization loops, when the DIM argument is present. + +program p + implicit none + integer, parameter :: data60(*) = (/ 7, 4, 5, 3, 9, 0, 6, 4, 5, 5, & + 8, 2, 6, 7, 8, 7, 4, 5, 3, 9, & + 0, 6, 4, 5, 5, 8, 2, 6, 7, 8, & + 7, 4, 5, 3, 9, 0, 6, 4, 5, 5, & + 8, 2, 6, 7, 8, 7, 4, 5, 3, 9, & + 0, 6, 4, 5, 5, 8, 2, 6, 7, 8 /) + logical, parameter :: mask60(*) = (/ .true. , .false., .false., .false., & + .true. , .false., .true. , .false., & + .false., .true. , .true. , .false., & +
[PATCH 4/7] fortran: Check MASK directly instead of its scalarization chain
From: Mikael Morin Bootstrapped and regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Update the conditions used by the inline MINLOC/MAXLOC code generation function to check directly the properties of MASK instead of the variable holding its scalarization chain. The inline implementation of MINLOC/MAXLOC in gfc_conv_intrinsic_minmaxloc uses several conditions checking the presence of a scalarization chain for MASK, which means that the argument is present and non-scalar. The next patch will allow inlining MINLOC/MAXLOC with DIM and MASK, and in that case the scalarization chain for MASK is initialized elsewhere, so the variable usually holding it in the function is not used, and the conditions won't work in that case. This change updates the conditions to check directly the properties of MASK so that they work even if the scalarization chain variable is not used. gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Use conditionals based on the MASK expression rather than on its scalarization chains. --- gcc/fortran/trans-intrinsic.cc | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index 29f17f334a3..0ac3d7fe3a1 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5746,7 +5746,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gcc_assert (reduction_dimensions == ploop->dimen); - if (nonempty == NULL && maskss == NULL) + if (nonempty == NULL && !(maskexpr && maskexpr->rank > 0)) { nonempty = logical_true_node; @@ -5816,7 +5816,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_start_scalarized_body (ploop, &body); /* If we have a mask, only check this element if the mask is set. */ - if (maskss) + if (maskexpr && maskexpr->rank > 0) { gcc_assert (!nested_loop); gfc_init_se (&maskse, NULL); @@ -5921,7 +5921,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) } gfc_add_expr_to_block (&block, ifbody); - if (maskss) + if (maskexpr && maskexpr->rank > 0) { /* We enclose the above in if (mask) {...}. If the mask is an optional argument, generate IF (.NOT. PRESENT(MASK) @@ -5972,7 +5972,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_add_expr_to_block (outer_block, build1_v (LABEL_EXPR, lab1)); /* If we have a mask, only check this element if the mask is set. */ - if (maskss) + if (maskexpr && maskexpr->rank > 0) { gfc_init_se (&maskse, NULL); gfc_copy_loopinfo_to_se (&maskse, &loop); @@ -6038,7 +6038,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_add_expr_to_block (&block, tmp); - if (maskss) + if (maskexpr && maskexpr->rank > 0) { /* We enclose the above in if (mask) {...}. If the mask is an optional argument, generate IF (.NOT. PRESENT(MASK) @@ -6063,7 +6063,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_add_expr_to_block (&loop.pre, build1_v (LABEL_EXPR, lab2)); /* For a scalar mask, enclose the loop in an if statement. */ - if (maskexpr && maskss == NULL) + if (maskexpr && maskexpr->rank == 0) { tree ifmask; -- 2.45.2
[PATCH 6/7] fortran: Check for empty MINLOC/MAXLOC ARRAY along DIM only
From: Mikael Morin Bootstrapped and regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- In the function generating inline code to implement MINLOC and MAXLOC, only check for ARRAY size along DIM if DIM is present. The check for ARRAY emptyness had been checking the size of the full array, which is correct for MINLOC and MAXLOC without DIM. But if DIM is present, the reduction is along DIM only so the check for emptyness should consider that dimension only as well. This sounds like a correctness issue, but fortunately the cases where it makes a difference are cases where ARRAY is empty, so even if the value calculated for MINLOC or MAXLOC is wrong, it's wrapped in a zero iteration loop, and the wrong values are not actually used. In the end this just avoids unnecessary calculations. A previous version of this patch regressed on non-constant DIM with rank 1 ARRAY. The new testcase checks that that case is supported. gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Only get the size along DIM instead of the full size if DIM is present. gcc/testsuite/ChangeLog: * gfortran.dg/minmaxloc_22.f90: New test. --- gcc/fortran/trans-intrinsic.cc | 19 +++- gcc/testsuite/gfortran.dg/minmaxloc_22.f90 | 26 ++ 2 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_22.f90 diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index b77183ab11e..e1a5fdef26c 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5641,7 +5641,24 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) if (!(maskexpr && maskexpr->rank > 0)) { mpz_t asize; - if (gfc_array_size (arrayexpr, &asize)) + bool reduction_size_known; + + if (dim_present) + { + int reduction_dim; + if (dim_arg->expr->expr_type == EXPR_CONSTANT) + reduction_dim = mpz_get_si (dim_arg->expr->value.integer) - 1; + else if (arrayexpr->rank == 1) + reduction_dim = 0; + else + gcc_unreachable (); + reduction_size_known = gfc_array_dimen_size (arrayexpr, reduction_dim, + &asize); + } + else + reduction_size_known = gfc_array_size (arrayexpr, &asize); + + if (reduction_size_known) { nonempty = gfc_conv_mpz_to_tree (asize, gfc_index_integer_kind); mpz_clear (asize); diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_22.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_22.f90 new file mode 100644 index 000..ec97d1435af --- /dev/null +++ b/gcc/testsuite/gfortran.dg/minmaxloc_22.f90 @@ -0,0 +1,26 @@ +! { dg-do compile } +! { dg-additional-options "-O" } +! +! Check that the inline code generated for MINLOC and MAXLOC supports +! a non-constant DIM argument if ARRAY has rank 1. + +program p + implicit none + integer, parameter :: n = 5 + integer :: a(n), i + a = (/ (i**2, i=1,n) /) + print *, f(a, 1), g(a, 1) +contains + function f(a, d) +integer :: a(n) +integer :: d +integer :: f +f = minloc(a, dim=d) + end function + function g(a, d) +integer :: a(n) +integer :: d +integer :: g +g = maxloc(a, dim=d) + end function +end program p -- 2.45.2
[PATCH 3/7] fortran: Inline MINLOC/MAXLOC with DIM and scalar MASK [PR90608]
From: Mikael Morin Bootstrapped and regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Enable the generation of inline code for MINLOC/MAXLOC when argument ARRAY is of integral type and has rank > 1, DIM is a constant, and MASK is scalar (only absent MASK or rank 1 ARRAY were inlined before). Scalar masks are implemented with a wrapping condition around the code one would generate if MASK wasn't present, so they are easy to support once inline code without MASK is working. With this change, there are both expressions evaluated inside the nested loop (ARRAY, and in the future MASK if non-scalar) and expressions evaluated outside of it (MASK if scalar). For both one has to advance the scalarization chain passed as argument SE to gfc_conv_intrinsic_minmaxloc as they are evaluated, but for expressions evaluated from within the nested loop one has to advance additionally the nested scalarization chain of the reduction loop. This is normally handled transparently through the inheritance that is defined when initializing gfc_se structs, but there has to be some variable to inherit from, and there is a single one, SE. This variable is kept as base for out of nested loop expressions only (i.e. for scalar MASK), and this change introduces a new variable to hold the current advance of the nested loop scalarization chain and serve as inheritance base to evaluate nested loop expressions (just ARRAY for now, additionally non-scalar MASK later). PR fortran/90608 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_inline_intrinsic_function_p): Return TRUE if MASK is scalar. (walk_inline_intrinsic_minmaxloc): Append to the scalarization chain a scalar element for MASK if it's present. (gfc_conv_intrinsic_minmaxloc): Use a local gfc_se struct to serve as base for all the expressions evaluated in the nested loop. To evaluate MASK when there is a nested loop, enable usage of the scalarizer and set the original passed in SE argument as current scalarization chain element to use. And use the nested loop from the scalarizer instead of the local loop in that case. gcc/testsuite/ChangeLog: * gfortran.dg/maxloc_bounds_8.f90: Accept the error message generated by the scalarizer in case the MAXLOC intrinsic call is implemented through inline code. * gfortran.dg/minmaxloc_20.f90: New test. --- gcc/fortran/trans-intrinsic.cc| 35 +++- gcc/testsuite/gfortran.dg/maxloc_bounds_8.f90 | 4 +- gcc/testsuite/gfortran.dg/minmaxloc_20.f90| 182 ++ 3 files changed, 209 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_20.f90 diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index e44a245ec75..29f17f334a3 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5479,6 +5479,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_ss *maskss = nullptr; gfc_se arrayse; gfc_se maskse; + gfc_se nested_se; gfc_se *base_se; gfc_expr *arrayexpr; gfc_expr *maskexpr; @@ -5616,7 +5617,10 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_add_block_to_block (&se->pre, &backse.post); if (nested_loop) -base_se = se; +{ + gfc_init_se (&nested_se, se); + base_se = &nested_se; +} else { /* Walk the arguments. */ @@ -5706,7 +5710,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) if (nested_loop) { - ploop = enter_nested_loop (se); + ploop = enter_nested_loop (&nested_se); ploop->temp_dim = 1; } else @@ -6063,21 +6067,19 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) { tree ifmask; - gcc_assert (!nested_loop); - - gfc_init_se (&maskse, NULL); + gfc_init_se (&maskse, nested_loop ? se : nullptr); gfc_conv_expr_val (&maskse, maskexpr); gfc_add_block_to_block (&se->pre, &maskse.pre); gfc_init_block (&block); - gfc_add_block_to_block (&block, &loop.pre); - gfc_add_block_to_block (&block, &loop.post); + gfc_add_block_to_block (&block, &ploop->pre); + gfc_add_block_to_block (&block, &ploop->post); tmp = gfc_finish_block (&block); /* For the else part of the scalar mask, just initialize the pos variable the same way as above. */ gfc_init_block (&elseblock); - for (int i = 0; i < loop.dimen; i++) + for (int i = 0; i < ploop->dimen; i++) gfc_add_modify (&elseblock, pos[i], gfc_index_zero_node); elsetmp = gfc_finish_block (&elseblock); ifmask = conv_mask_condition (&maskse, maskexpr, optional_mask); @@ -11857,9 +11859,11 @@ walk_inline_intrinsic_minmaxloc (gfc_ss *ss, gfc_expr *expr ATTRIBUTE_UNUSED) gfc_actual_arglist *arra