Hi Mikael,
thanks for this nice set of patches!
I've played around a bit, and it seems to look good.
I have only minor comments left (besides the nan issue raised):
- inline expansion is inhibited at -Os. But wouldn't it be good if
we make this expansion also dependent on -ffrontend-optimize?
(This was the case for rank-1 before your patch).
- in the case where two sets of loop(nest)s are generated, i.e. for
non-integral argument x, I wonder if the predictors for conditionals
(-> _likely/_unlikely) are chosen optimally. E.g. for this code:
subroutine minloc_real (x, m, back)
implicit none
real, intent(in), contiguous :: x(:)
integer :: m(*)
logical, optional :: back
m(1:rank(x)) = minloc (x,back=back)
end
the first loop becomes:
S.10 = second_loop_entry.9 ? idx0.7 : 1;
while (1)
{
if (S.10 > D.4310) goto L.3;
if (__builtin_expect ((integer(kind=8)) ((*x.0)[S.10 * D.4314
+ D.4309] <= limit.8), 0, 8))
{
limit.8 = (*x.0)[S.10 * D.4314 + D.4309];
pos0.5 = S.10 + offset0.6;
idx0.7 = S.10;
second_loop_entry.9 = 1;
goto L.1;
}
S.10 = S.10 + 1;
}
This results from this code in trans-intrinsic.cc:
if (!lab1 || HONOR_NANS (DECL_MODE (limit)))
{
...
cond = gfc_unlikely (cond, PRED_BUILTIN_EXPECT);
ifbody = build3_v (COND_EXPR, cond, ifbody,
build_empty_stmt (input_location));
}
As the reason for this separate loop is finding a first non-nan value,
I would expect gfc_likely to be more reasonable for the common case.
(There is also the oddity S.10 = second_loop_entry.9 ? ..., where
idx0.7 seems to be not initialized, but luckily it seems to be
handled by the optimizer and seen that this is no problem.)
Having gfc_unlikely in the second set of loops is fine, as this passes
over all array elements.
Note that this is pre-existing without/before your patch, but since you
are at it, you may want to check.
Otherwise this is fine for mainline with the said issues considered.
Thanks for the patch-set!
Harald
Am 31.07.24 um 22:07 schrieb Mikael Morin:
From: Mikael Morin <mik...@gcc.gnu.org>
This series of patches enable the generation of inline code for the MINLOC
and MAXLOC intrinsics, when the DIM argument is not present. The
generated code is based on the inline implementation already generated in
the scalar case, that is when ARRAY has rank 1 and DIM is present. The
code is extended by using several variables (one for each dimension) where
the scalar code used just one, and collecting the variables to an array
before returning.
The patches are split in a way that allows inlining in more and more cases
as controlled by the gfc_inline_intrinsic_p predicate which evolves with
the patches.
They have been generated on top of the patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657959.html
Mikael Morin (8):
fortran: Add tests covering inline MINLOC/MAXLOC without DIM [PR90608]
fortran: Disable frontend passes for inlinable MINLOC/MAXLOC [PR90608]
fortran: Inline MINLOC/MAXLOC with no DIM and ARRAY of rank 1
[PR90608]
fortran: Outline array bound check generation code
fortran: Inline integral MINLOC/MAXLOC with no DIM and no MASK
[PR90608]
fortran: Inline integral MINLOC/MAXLOC with no DIM and scalar MASK
[PR90608]
fortran: Inline non-character MINLOC/MAXLOC with no DIM [PR90608]
fortran: Continue MINLOC/MAXLOC second loop where the first stopped
[PR90608]
gcc/fortran/frontend-passes.cc | 3 +-
gcc/fortran/trans-array.cc | 382 ++++++++-------
gcc/fortran/trans-intrinsic.cc | 454 +++++++++++++-----
gcc/testsuite/gfortran.dg/maxloc_7.f90 | 220 +++++++++
gcc/testsuite/gfortran.dg/maxloc_bounds_4.f90 | 4 +-
gcc/testsuite/gfortran.dg/maxloc_bounds_5.f90 | 4 +-
gcc/testsuite/gfortran.dg/maxloc_bounds_6.f90 | 4 +-
gcc/testsuite/gfortran.dg/maxloc_bounds_7.f90 | 4 +-
.../gfortran.dg/maxloc_with_mask_1.f90 | 393 +++++++++++++++
gcc/testsuite/gfortran.dg/minloc_8.f90 | 220 +++++++++
.../gfortran.dg/minloc_with_mask_1.f90 | 392 +++++++++++++++
11 files changed, 1792 insertions(+), 288 deletions(-)
create mode 100644 gcc/testsuite/gfortran.dg/maxloc_7.f90
create mode 100644 gcc/testsuite/gfortran.dg/maxloc_with_mask_1.f90
create mode 100644 gcc/testsuite/gfortran.dg/minloc_8.f90
create mode 100644 gcc/testsuite/gfortran.dg/minloc_with_mask_1.f90