On Apr 12 2018, Thomas König wrote:

> With Fortran 2018, recursive is becoming the default. This will likely
> have a serious impact on many user codes, which often declare large
> arrays which could then overflow stacks, leading to segfaults without
> further explanation.

Yes.  Been there - seen that :-)  What's worse, segfaults because of
stack overflow very often confuse debuggers, so you can't even get a
traceback of where it failed!

> We could extend -fmax-stack-var-size so it allocates memory
> from the heap in recursive procedures, too, and set this to
> some default value.  Of course, it would also have to free them
> afterwards, but we manage that for allocatable arrays already.

Yes, but I think it's a horrible idea.  See below for a better one.

> We could warn when large arrays with sizes known at compile time
> are translated.

Yes, but I think that's the wrong criterion.  It should be above a
certain size, probably aggregate per procedure - and controllable,
of course.  Who cares about a couple of 3x3 matrices?
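
As a rough illustration of that criterion, here is a minimal sketch in C.
It is not gfortran code: `struct local_array`, `aggregate_bytes`,
`should_warn` and the limit parameter are all hypothetical names chosen
for the example.  The point is only that the test is made on the sum for
the whole procedure, against a limit the user can set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one local array whose size is known at
   compile time.  This is not a real GCC data structure. */
struct local_array {
    const char *name;
    size_t bytes;
};

/* Sum the sizes of all fixed-size locals of one procedure. */
size_t aggregate_bytes(const struct local_array *locals, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += locals[i].bytes;
    return total;
}

/* Warn only when the aggregate exceeds a user-controllable limit, so
   that a couple of 3x3 matrices never produce a diagnostic. */
bool should_warn(const struct local_array *locals, size_t n,
                 size_t limit_bytes)
{
    return aggregate_bytes(locals, n) > limit_bytes;
}
```

With a limit of, say, 64 KiB, two 3x3 double precision matrices (72
bytes each) pass silently, while a procedure holding a multi-megabyte
work array is reported.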

> We could use -fsplit-stack by default. How reliable is that
> option? Can it be used, for example, with -fopenmp?
> Is it available on all (relevant) platforms?
> One drawback would be that this would allow for infinite
> recursions to go on for much longer.

Yes.  And I don't think it's the right mechanism, anyway, except for
OpenMP.  Again, see below.

> A -fcheck=stack option could be added (implemented similar to
> -fsplit-stack), to be included in -fcheck=all, which would abort
> with a sensible error message instead of a segfault.

Absolutely.  Or simply always check!  I haven't studied the actual code
generated by gfortran recently, but my experience of performing stack
checking is that its cost is negligible.  It got a bad name because of
the utter incompetence of the way it was usually implemented.  There is
also a very simple optimisation that often avoids it:

Leave a fixed amount of space beyond the check point and omit the check
for any leaf procedure that uses no more than that.  And, obviously,
that can be extended to non-leaf procedures with known stack use, such
as most of the intrinsics.
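
The optimisation can be sketched as a compile-time decision plus the
check itself.  This is a minimal model in C under stated assumptions,
not what gfortran or GCC actually emits: `RESERVE_BYTES`,
`struct proc_info`, `needs_stack_check` and `frame_fits` are
hypothetical names, and the stack is assumed to grow downwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed size of the zone left free beyond the check point. */
#define RESERVE_BYTES 4096

/* Hypothetical per-procedure facts known to the compiler. */
struct proc_info {
    size_t frame_bytes;       /* this procedure's own frame */
    bool   is_leaf;           /* calls nothing */
    bool   callee_use_known;  /* worst-case use of its callees is bounded */
    size_t callee_bytes;      /* that bound; ignored unless known */
};

/* Decide whether a check must be emitted for this procedure.  Every
   checked caller guarantees RESERVE_BYTES remain beyond the check
   point, so anything that provably stays within that zone is safe. */
bool needs_stack_check(const struct proc_info *p)
{
    if (p->frame_bytes > RESERVE_BYTES)
        return true;
    if (p->is_leaf)
        return false;
    if (p->callee_use_known)
        return p->callee_bytes > RESERVE_BYTES - p->frame_bytes;
    return true;   /* unknown callees: always check */
}

/* The check itself, for a downward-growing stack: the new frame must
   not cross the check point. */
bool frame_fits(size_t sp, size_t check_point, size_t frame_bytes)
{
    return sp >= check_point && sp - check_point >= frame_bytes;
}
```

When `frame_fits` fails, a run-time library would print a message
naming the procedure and stop, instead of letting the program run into
a segfault.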

There is another option, which I was thinking of experimenting with in
my retirement but probably won't: a double stack (as in GNU Ada, and
the better Algol 68 systems).  Small, fixed objects go on the primary
stack, as usual, and large or variable-sized ones go on the secondary
stack.  Allocatable objects should go there if and only if they are not
reallocated.  My experience (a long time back) was that removing large
arrays from the primary stack improved its locality (often needed to
control branching), and that this speeded up most calls with such
arrays by several tens of percent.

Now, there is an interesting interaction with split stacks.  The only
reason to have a contiguous stack is for fast procedure call for simple
procedures.  But that doesn't apply to the secondary stack, so it can
always be split - hence there is no need for a configuration or run-time
option.  That doesn't stop it being checked against a maximum size,
either, because accumulating, decrementing and checking a count isn't a
major overhead for the use the secondary stack gets.
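
A split secondary stack with a size check can be sketched in a few
dozen lines of C.  This is an illustration only, not the GNU Ada
run-time and not anything in libgfortran: all names (`sec_stack`,
`sec_save`, `sec_alloc`, `sec_release`) are hypothetical, segments come
from `malloc`, and a real implementation would report an error and stop
where this one returns NULL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct segment {
    struct segment *prev;   /* segment below this one */
    size_t capacity;        /* usable bytes in data[] */
    size_t used;            /* bytes handed out so far */
    max_align_t data[];     /* storage, suitably aligned */
};

struct sec_stack {
    struct segment *top;    /* current segment, or NULL if empty */
    size_t remaining;       /* bytes still allowed under the limit */
    size_t segment_bytes;   /* default size of a new segment */
};

/* Saved on procedure entry, restored on exit. */
struct sec_mark {
    struct segment *seg;
    size_t used;
    size_t remaining;
};

void sec_init(struct sec_stack *s, size_t max_bytes, size_t segment_bytes)
{
    s->top = NULL;
    s->remaining = max_bytes;
    s->segment_bytes = segment_bytes;
}

struct sec_mark sec_save(const struct sec_stack *s)
{
    struct sec_mark m = { s->top, s->top ? s->top->used : 0, s->remaining };
    return m;
}

/* Allocate n bytes; returns NULL if the limit would be exceeded. */
void *sec_alloc(struct sec_stack *s, size_t n)
{
    const size_t unit = sizeof(max_align_t);
    if (n > SIZE_MAX - unit)
        return NULL;
    n = (n + unit - 1) / unit * unit;     /* keep every object aligned */
    if (n > s->remaining)
        return NULL;                      /* the maximum-size check */
    if (s->top == NULL || s->top->capacity - s->top->used < n) {
        /* Does not fit: start a new segment, no contiguity needed. */
        size_t cap = n > s->segment_bytes ? n : s->segment_bytes;
        struct segment *seg = malloc(sizeof *seg + cap);
        if (seg == NULL)
            return NULL;
        seg->prev = s->top;
        seg->capacity = cap;
        seg->used = 0;
        s->top = seg;
    }
    void *p = (char *)s->top->data + s->top->used;
    s->top->used += n;
    s->remaining -= n;                    /* decrement the count */
    return p;
}

/* Pop everything allocated since the mark was taken. */
void sec_release(struct sec_stack *s, struct sec_mark m)
{
    while (s->top != m.seg) {
        struct segment *dead = s->top;
        s->top = dead->prev;
        free(dead);
    }
    if (s->top != NULL)
        s->top->used = m.used;
    s->remaining = m.remaining;
}
```

A procedure with large or variable-sized locals would call `sec_save`
on entry, `sec_alloc` for each such object, and `sec_release` on every
exit path.  The only per-allocation cost of the limit is the comparison
and subtraction on `remaining`.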


Regards,
Nick Maclaren.

