Hi Mikael, thanks for this nice set of patches!
I've played around a bit, and it seems to look good. I have only minor comments left (besides the nan issue raised): - inline expansion is inhibited at -Os. But wouldn't it be good if we make this expansion also dependent on -ffrontend-optimize? (This was the case for rank-1 before your patch). - in the case where two sets of loop(nest)s are generated, i.e. for non-integral argument x, I wonder if the predictors for conditionals (-> _likely/_unlikely) are chosen optimally. E.g. for this code: subroutine minloc_real (x, m, back) implicit none real, intent(in), contiguous :: x(:) integer :: m(*) logical, optional :: back m(1:rank(x)) = minloc (x,back=back) end the first loop becomes: S.10 = second_loop_entry.9 ? idx0.7 : 1; while (1) { if (S.10 > D.4310) goto L.3; if (__builtin_expect ((integer(kind=8)) ((*x.0)[S.10 * D.4314 + D.4309] <= limit.8), 0, 8)) { limit.8 = (*x.0)[S.10 * D.4314 + D.4309]; pos0.5 = S.10 + offset0.6; idx0.7 = S.10; second_loop_entry.9 = 1; goto L.1; } S.10 = S.10 + 1; } This results from this code in trans-intrinsic.cc: if (!lab1 || HONOR_NANS (DECL_MODE (limit))) { ... cond = gfc_unlikely (cond, PRED_BUILTIN_EXPECT); ifbody = build3_v (COND_EXPR, cond, ifbody, build_empty_stmt (input_location)); } As the reason for this separate loop is finding a first non-nan value, I would expect gfc_likely to be more reasonable for the common case. (There is also the oddity S.10 = second_loop_entry.9 ? ..., where idx0.7 seems to be not initialized, but luckily it seems to be handled by the optimizer and seen that this is no problem.) Having gfc_unlikely in the second set of loops is fine, as this passes over all array elements. Note that this is pre-existing without/before your patch, but since you are at it, you may want to check. Otherwise this is fine for mainline with the said issues considered. Thanks for the patch-set! Harald Am 31.07.24 um 22:07 schrieb Mikael Morin:
From: Mikael Morin <mik...@gcc.gnu.org> This series of patches enable the generation of inline code for the MINLOC and MAXLOC intrinsics, when the DIM argument is not present. The generated code is based on the inline implementation already generated in the scalar case, that is when ARRAY has rank 1 and DIM is present. The code is extended by using several variables (one for each dimension) where the scalar code used just one, and collecting the variables to an array before returning. The patches are split in a way that allows inlining in more and more cases as controlled by the gfc_inline_intrinsic_p predicate which evolves with the patches. They have been generated on top of the patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657959.html Mikael Morin (8): fortran: Add tests covering inline MINLOC/MAXLOC without DIM [PR90608] fortran: Disable frontend passes for inlinable MINLOC/MAXLOC [PR90608] fortran: Inline MINLOC/MAXLOC with no DIM and ARRAY of rank 1 [PR90608] fortran: Outline array bound check generation code fortran: Inline integral MINLOC/MAXLOC with no DIM and no MASK [PR90608] fortran: Inline integral MINLOC/MAXLOC with no DIM and scalar MASK [PR90608] fortran: Inline non-character MINLOC/MAXLOC with no DIM [PR90608] fortran: Continue MINLOC/MAXLOC second loop where the first stopped [PR90608] gcc/fortran/frontend-passes.cc | 3 +- gcc/fortran/trans-array.cc | 382 ++++++++------- gcc/fortran/trans-intrinsic.cc | 454 +++++++++++++----- gcc/testsuite/gfortran.dg/maxloc_7.f90 | 220 +++++++++ gcc/testsuite/gfortran.dg/maxloc_bounds_4.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_5.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_6.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_7.f90 | 4 +- .../gfortran.dg/maxloc_with_mask_1.f90 | 393 +++++++++++++++ gcc/testsuite/gfortran.dg/minloc_8.f90 | 220 +++++++++ .../gfortran.dg/minloc_with_mask_1.f90 | 392 +++++++++++++++ 11 files changed, 1792 insertions(+), 288 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/maxloc_7.f90 create mode 100644 gcc/testsuite/gfortran.dg/maxloc_with_mask_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_8.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_with_mask_1.f90