Hello,

Le 19/08/2024 à 21:44, Harald Anlauf a écrit :
Hi Mikael,

apart from patch #04/10, which did not apply cleanly here, I was
able to test your patch.  It seems to work with a manual workaround
(-fno-frontend-optimize) to work around this problem.
Might be a local issue...

Huh? That's unexpected, patches were rebased before submitting, and I don't think there was any recent activity in that area of the compiler anyway.

That said, it works as advertised.  Thanks for separating out the
IEEE-NaN tests.

What I did not fully get is the way you deal with -finline-intrinsics= .
Currently there are only MINLOC and MAXLOC, but in the future there
could be many more intrinsics.  Given that you need 2 bits per
intrinsic in flag_inline_intrinsics, how future-proof is that?

Well, I don't expect that many candidate intrinsics for the flags; currently SUM and PRODUCT could be added, and probably IALL, IANY and IPARITY as well. Remember that having both a libgfortran and a frontend implementation is a prerequisite.

For the future, 2 bits gives room to 16 intrinsics, and if we extend to 64 bits, to 32 intrinsics without much hassle. Having only 1 bit per intrinsic would be certainly more future proof but I haven't found how to do it. Zero bit (aka no per-intrinsic setting) would be even more future-proof, but less convenient. As a last resort possibility the .opt framework gives the possibility to accumulate options in a vector and have them processed "by hand". It seemed more convenient to me to just use 2 bits per intrinsic, but we can fall back to manual processing if we get out of bits at some point.

Using only 1 bit per intrinsic was what I tried first, but then the default value has to be set before processing the flags, which means no default value depending on optimization, as optimization level is not known at that time. I tried using EnabledBy(O || Ofast || Og) to set default value dependending on optimization, but this doesn't work either because the "O" covers all levels (-O0, -O1, -O2, -O3) without distinction. And finally EnabledBy(O1 || O2 || O3 || Ofast || Og) is not accepted because "O1", "O2" and "O3" are not valid option names.

I can certainly drop the per-intrinsic setting to have unlimited room for future intrinsics. Or if you have ideas on how to only use one bit per intrinsic, I'm all ears. Or maybe the testcases are too cumbersome, and with a slight modification of behaviour we can save one bit (for example, also generate inline code for -O0). What do you think?

In the documentation, you have:

+The set of intrinsics permitting the choice of implementation variant
through
+@code{-finline-intrinsics} is currently limited to non-scalar
@code{MAXLOC} and
+@code{MINLOC}.

Why do you say "non-scalar"?  The new inlining is done for these
intrinsics when the DIM argument is absent.  The result characteristics
however is:

   "If DIM does not appear, the result is an array of rank one and of
size equal to the rank of ARRAY; ..."

and I thought the implementation does just that and does that right.
(With DIM present, the result is an array of rank rank(arg)-1.)
Can you clarify the wording in a way that is better understandable?

Yeah, these patches are all about non-scalar MINLOC/MAXLOC. But there is also the scalar MINLOC/MAXLOC case which has pre-existing inline code support (and on which these patches are based). The scalar case (aka with DIM present, and ARRAY of rank 1) is always inlined, so the flag has no effect on it.

Does this sound better?:
The set of intrinsics allowed as argument to @code{-finline-intrinsics=} is currently limited to @code{MAXLOC} and @code{MINLOC}. The effect of the flag is moreover limited to calls of those intrinsics without @code{DIM} argument and with @code{ARRAY} of a non-@code{CHARACTER} type.

Otherwise the Fortran parts look fine to me.

Thanks for the review.

For the changes to gcc/flag-types.h you might need an OK from the
gcc maintainers.

Thanks,
Harald

Reply via email to