On Apr 12 2018, Thomas König wrote:
> with Fortran 2018, recursive is becoming the default. This will likely have a serious impact on many user codes, which often declare large local arrays that could then overflow the stack, leading to segfaults with no further explanation.
Yes. Been there - seen that :-) What's worse, segfaults because of stack overflow very often confuse debuggers, so you can't even get a traceback of where it failed!
> We could extend -fmax-stack-var-size so it allocates memory from the heap in recursive procedures, too, and set this to some default value. Of course, it would also have to free them afterwards, but we manage that for allocatable arrays already.
Yes, but I think it's a horrible idea. See below for a better one.
> We could warn when large arrays with sizes known at compile time are translated.
Yes, but I think that's the wrong criterion. It should be above a certain size, probably aggregate per procedure - and controllable, of course. Who cares about a couple of 3x3 matrices?
> We could use -fsplit-stack by default. How reliable is that option? Can it be used, for example, with -fopenmp? Is it available on all (relevant) platforms? One drawback would be that this would allow infinite recursions to go on for much longer.
Yes. And I don't think it's the right mechanism, anyway, except for OpenMP. Again, see below.
> A -fcheck=stack option could be added (implemented similarly to -fsplit-stack), to be included in -fcheck=all, which would abort with a sensible error message instead of a segfault.
Absolutely. Or simply always check! I haven't studied the actual code generated by gfortran recently, but my experience of performing stack checking is that its cost is negligible. It got a bad name because of the utter incompetence of the way it was usually implemented.

There is also a very simple optimisation that often avoids it: leave a fixed amount of space beyond the check point and omit the check for any leaf procedure that uses no more than that. And, obviously, that can be extended to non-leaf procedures with known stack use, such as most of the intrinsics.

There is another option, which I was thinking of experimenting with in my retirement (but probably won't): a double stack, as in GNU Ada and the better Algol 68 systems. Small, fixed objects go on the primary stack, as usual, and large or variable-sized ones go on the secondary stack. Allocatable objects should go there if and only if they are not reallocated. My experience (a long time back) was that removing large arrays from the primary stack improved its locality (often needed to control branching) and speeded up most calls with such arrays by several tens of percent.

Now, there is an interesting interaction with split stacks. The only reason to have a contiguous stack is fast procedure call for simple procedures, but that doesn't apply to the secondary stack, so it can always be split - hence there is no need for a configuration or run-time option. That doesn't stop it being checked against a maximum size, either, because accumulating, decrementing and checking a count isn't a major overhead for the use the secondary stack gets.

Regards,
Nick Maclaren.