[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

cvs-commit at gcc dot gnu.org via Gcc-bugs Sat, 21 Sep 2024 09:36:23 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608


--- Comment #31 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Mikael Morin <mik...@gcc.gnu.org>:

https://gcc.gnu.org/g:3c01ddc4ff0fdbaf32c22aed1c04d1d587821d91

commit r15-3764-g3c01ddc4ff0fdbaf32c22aed1c04d1d587821d91
Author: Mikael Morin <mik...@gcc.gnu.org>
Date:   Sat Sep 21 18:33:04 2024 +0200

    fortran: Continue MINLOC/MAXLOC second loop where the first stopped [PR90608]

    Continue the second set of loops where the first one stopped in the
    generated inline MINLOC/MAXLOC code in the cases where the generated code
    contains two sets of loops.  This fixes a regression that was introduced
    when enabling the generation of inline MINLOC/MAXLOC code with ARRAY of
    rank greater than 1, no DIM argument, and either non-scalar MASK or
    floating-point ARRAY.

    In the cases where two sets of loops are generated as inline MINLOC/MAXLOC
    code, we previously generated code such as (for rank 2 ARRAY, so with two
    levels of nesting):

            for (idx11 in lower1..upper1)
              {
                for (idx12 in lower2..upper2)
                  {
                    ...
                    if (...)
                      {
                        ...
                        goto second_loop;
                      }
                  }
              }
            second_loop:
            for (idx21 in lower1..upper1)
              {
                for (idx22 in lower2..upper2)
                  {
                    ...
                  }
              }

    which means the elements already scanned by the first set of loops are
    processed twice, once in the first set of loops and once in the second
    one.  This change avoids this duplicate processing by using a conditional
    expression as the lower bound of each loop in the second set, generating
    code like:

            second_loop_entry = false;
            for (idx11 in lower1..upper1)
              {
                for (idx12 in lower2..upper2)
                  {
                    ...
                    if (...)
                      {
                        ...
                        second_loop_entry = true;
                        goto second_loop;
                      }
                  }
              }
            second_loop:
            for (idx21 in (second_loop_entry ? idx11 : lower1)..upper1)
              {
                for (idx22 in (second_loop_entry ? idx12 : lower2)..upper2)
                  {
                    ...
                    second_loop_entry = false;
                  }
              }
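
    As an illustration only, the two shapes above can be sketched in plain C
    for a MAXLOC-like search over a floating-point array.  This is not the
    code emitted by the compiler: the function names, the result record, the
    fixed extents N1/N2, the 0-based row-major indexing and the NaN test in
    the first set of loops are all assumptions made for the sketch.  The
    second_visits counter is there only to make the duplicate work visible.

```c
#include <math.h>
#include <stdbool.h>

#define N1 3   /* extent of the outer loop (illustrative) */
#define N2 4   /* extent of the inner loop (illustrative) */

/* Hypothetical result record: 0-based position of the maximum and the
   number of elements examined by the second set of loops.  An all-NaN
   array leaves the position at -1; that case is outside this sketch.  */
struct result { int pos1, pos2, second_visits; };

/* Old shape: the second set of loops restarts at the lower bounds.  */
struct result maxloc_restart(double a[N1][N2])
{
    struct result r = { -1, -1, 0 };
    double limit = -INFINITY;
    int i, j;

    /* First set of loops: stop at the first non-NaN element.  */
    for (i = 0; i < N1; i++)
        for (j = 0; j < N2; j++)
            if (!isnan(a[i][j])) {
                limit = a[i][j];
                r.pos1 = i;
                r.pos2 = j;
                goto second_loop;
            }
second_loop:
    /* Second set of loops: plain comparison over every element.  */
    for (i = 0; i < N1; i++)
        for (j = 0; j < N2; j++) {
            r.second_visits++;
            if (a[i][j] > limit) {
                limit = a[i][j];
                r.pos1 = i;
                r.pos2 = j;
            }
        }
    return r;
}

/* New shape: the second set of loops resumes where the first stopped.  */
struct result maxloc_resume(double a[N1][N2])
{
    struct result r = { -1, -1, 0 };
    double limit = -INFINITY;
    bool second_loop_entry = false;
    int idx11, idx12 = 0;

    for (idx11 = 0; idx11 < N1; idx11++)
        for (idx12 = 0; idx12 < N2; idx12++)
            if (!isnan(a[idx11][idx12])) {
                limit = a[idx11][idx12];
                r.pos1 = idx11;
                r.pos2 = idx12;
                second_loop_entry = true;
                goto second_loop;
            }
second_loop:
    /* Each lower bound is conditional on the entry predicate.  */
    for (int idx21 = second_loop_entry ? idx11 : 0; idx21 < N1; idx21++)
        for (int idx22 = second_loop_entry ? idx12 : 0; idx22 < N2; idx22++) {
            r.second_visits++;
            if (a[idx21][idx22] > limit) {
                limit = a[idx21][idx22];
                r.pos1 = idx21;
                r.pos2 = idx22;
            }
            /* Cleared so that later inner loops start at the lower bound.  */
            second_loop_entry = false;
        }
    return r;
}
```

    Under these assumptions, calling both functions on a 3x4 array whose
    first six elements in loop order are NaN and whose seventh is not gives
    the same position from both, with the second set of loops visiting 12
    elements in maxloc_restart and 6 in maxloc_resume.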

    It was expected that compiler optimizations would be able to remove the
    state variable second_loop_entry.  This is the case if ARRAY has rank 1
    (so without loop nesting): the variable is removed and the loop bounds
    become unconditional, which restores the previously generated code and
    fully fixes the regression.  For larger ranks, unfortunately, the state
    variable and the conditional loop bounds remain, but those cases
    previously used library calls, so this is not a regression.

            PR fortran/90608

    gcc/fortran/ChangeLog:

            * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Generate a set
            of index variables.  Set them using the loop indexes before leaving
            the first set of loops.  Generate a new loop entry predicate.
            Initialize it.  Set it before leaving the first set of loops.
            Clear it in the body of the second set of loops.  For the second
            set of loops, update each loop lower bound to use the
            corresponding index variable if the predicate variable is set.

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls
