"Jay Freeman (saurik)" <sau...@saurik.com> writes:

>> As you know, I wanted to allow for future expansion. I agree that it
>> would be possible to avoid storing MORESTACK_SEGMENTS--that would trade
>> off space for time, since it would mean that setcontext would have to
>> walk up the list. I think CURRENT_STACK is required for
>> __splitstack_find_context. And __splitstack_find_context is required
>> for Go's garbage collector. At least, it's not obvious to me how to
>> avoid requiring CURRENT_STACK for that case.
>
> The basis of that suggestion was not just that the items in the
> context could be removed, but that the underlying state used by split
> stacks might not need the values at all. In this case, I am not
> certain why __morestack_segments is needed: it seems to only come in
> to play when __morestack_current_segment is NULL (and I'm not certain
> how that would happen) and while deallocating dynamic blocks (which is
> already linear).

I think I see what you mean. If you can eliminate __morestack_segments
entirely, that is fine with me.

>> > 7) Using the linker to handle the transition between split-stack and
>> > non-split-stack code seems like a good way to solve the problem of "we
>> > need large stacks when hitting external code", but in staring at the
>> > resulting code I have in my project I'm seeing that it isn't reliable:
>> > if you have a pointer to a function the linker will not know what you
>> > are calling. In my case, this is coming up often due to using
>> > std::function.
>>
>> Yes, good point. I think I had some plan for handling that but I no
>> longer recall what it was.
>
> After getting more sleep, I realize that this problem is actually much
> more endemic than I had even previously thought. Most any vaguely
> object-oriented library is going to have tons of function pointers in
> it, and you often interact solely with those function pointers (as in,
> you have no actual symbol references anywhere). A simple example: in
> the case of C++, any call to a non-split-stack virtual function will
> fail.

Certainly true in principle, but unlikely in practice. Why would you
compile part of your C++ program with split-stack and part without?
Implementing child classes that define virtual methods for classes
defined in a precompiled library seems like an unusual case to me.

> """Function pointers are a tricky case. In general we don't know
> whether a function pointer points to split-stack code. Therefore, all
> calls through a function pointer will be modified to call (or jump to)
> a special function __fnptr_morestack. This will use a target specific
> function calling sequence, and will be implemented as though it were
> itself a function call instruction. That is, all the parameters will
> be set up, and then the code will jump to __fnptr_morestack.
> The __fnptr_morestack function takes two parameters: the function
> pointer to call, and the number of bytes of arguments pushed on the
> stack. (This is not yet implemented.)"""
>
> That paragraph is from your design document (SplitStacks on the GCC
> wiki). I presume that this solution would only work if
> __fnptr_morestack always assumed that the target did not support
> split-stack? Alternatively, I can see having that stub look at the
> function to see if its first instruction was a comparison to the TCB
> stack limit entry (using similar logic to that used by the linker)?
> [also, see below in this e-mail]

So at least I did have a plan, even if I didn't really flesh it out or
actually implement it. Yes, looking at the first instruction seems like
a good way to tell whether a large stack must be allocated. I think
this could be fairly efficient when using a function pointer to call
split-stack code, something like 8 extra instructions and a memory load.

>> > More awkwardly, split-stack functions that mention (but do not call)
>> > non-split-stack functions (such as to return their address) are being
>> > mis-flagged by the linker. Honestly, I question whether the linker
>> > fundamentally has enough information about what is going on to be able
>> > to make sufficiently accurate decisions with regards to stack
>> > constraints to warrant the painful abstraction breakage that
>> > split-stack uses. :(
>>
>> You're right that the linker doesn't really have enough information.
>> But is a split-stack function that returns the address of a
>> non-split-stack function really so frequent that it's worth worrying
>> about?
>
> I guess the question I have is: is one of the goals to make this
> option "safe to turn on for a random project"?
> Given the abstraction break that was made between the compiler and the
> linker, it would seem like this was a rather critically important goal
> (as now both the linker and the compiler are less modular and more
> difficult to modify), but in fact the result doesn't manage to solve
> seemingly simple corner cases.

The abstraction break exists not because I thought it was a good idea,
but because I couldn't see any other way to do it. The split-stack
system needs to work in a world with precompiled libraries and where
people will not change their source code. Any approach that requires
people to rebuild the world, or to edit their source code, is a
non-starter for me. I don't mind if there is some more efficient
mechanism which works if we require those steps, but I wanted a system
that would work where we do not require them. I'm willing to impose
restrictions like "you must compile all your source code with
-fsplit-stack"; I'm not willing to say "you must not use precompiled
libraries."

> That said, I can demonstrate a really common idiom, from C (not C++),
> that is almost always going to involve non-split-stack code (as malloc
> and free are normally going to be in libc, compiled without
> -fsplit-stack), and that is morally equivalent to "returning a
> function pointer and using it later": data structures that keep
> information on a block of dynamically allocated memory and "how to
> free it". Here's a lame version:

Sure. Not handling function pointers safely is a real bug.

> Part of me (and I realize that this causes other tradeoffs, and I'm
> therefore not even recommending it: more just musing) feels like the
> notion of "supports split stack" is more of a calling convention. In
> the same way that gcc currently supports regparm, stdcall, thiscall,
> fastcall... it seems like it might simply be a new attribute (probably
> orthogonal to the calling convention) a function can have (and would
> not have by default): splitcall.

I'm fine with that, but it requires a source code change, so I want the
system to work without it.

>> > A specific idea that might help, however, is to set things up so that
>> > the PLT actually handles the stack increases when you are linking to
>> > functions that are in a dynamic library. That way, calls to open (for
>> > example) would not cause the function that called it to suddenly
>> > require a large stack, but instead only as control is transferred to
>> > open would the stack size increase. (This might be quite complex,
>> > though.)
>>
>> Yes, again you have to know how many bytes of arguments were pushed on
>> the stack. You can pretty much know this for open, of course, but it's
>> a lot more complex for printf (if printf were compiled in split-stack
>> mode it would be straightforward, but of course in this example it is
>> not).
>>
>> I agree that this could be a lot nicer. It's a bit less important for
>> Go because obviously the Go compiler is completely in control of all
>> functions called by Go code.
>
> In this model (still using the linker, but pushing the stack-split
> into or around the @plt stub function), I would have to propose that
> variadic functions are treated specially (possibly using a
> similar/identical setup to the one you were proposing for function
> pointers) where the argument count was also passed. This could be
> pushed onto the stack right before the call and popped/thrown off the
> stack first thing in the stub when not needed (which has the benefit
> of being portable between targets and not messing with the existing
> argument placement).

I'm not sure how this all hangs together, but perhaps it works. One
issue is that it means that the caller and callee have to absolutely
agree as to whether the function is variadic or not. E.g., since you
mention open, it is often declared in <fcntl.h> as variadic, but some
code declares it itself.
While the language standards do require absolute agreement, it's
usually best if you can avoid requiring it for real programs.

> Actually, thinking about it more: it seems like 99% of these problems
> could be solved by providing a second symbol definition for the
> split-stack prologue and binding that as part of the type
> signature. So, you could either call the "original implementation" of
> a function using its normal symbol, or you could call the split-stack
> prologue version of the same function using one that had been mangled
> with some prefix.
>
> extern "C" int test() {
>     return 0xdeadbeef;
> }
>
> 0000000000404920 <test>:
>   404920: 64 48 3b 24 25 70 00   cmp    %fs:0x70,%rsp
>   404927: 00 00
>   404929: 72 06                  jb     404931 <test+0x11>
>   40492b: b8 ef be ad de         mov    $0xdeadbeef,%eax
>   404930: c3                     retq
>   404931: 45 31 d2               xor    %r10d,%r10d
>   404934: 45 31 db               xor    %r11d,%r11d
>   404937: e8 6d 6b 00 00         callq  40b4a9 <__morestack>
>   40493c: c3                     retq
>   40493d: eb ec                  jmp    40492b <test+0xb>
>   40493f: 90                     nop
>
> In this case (and yes: this is an example of a function that shouldn't
> need this prologue at all, but it was short ;P), the existing
> implementation of -fsplit-stack has modified the function to
> fundamentally check its stack. No matter how you attempt to call it,
> we now have to know whether the function supports the split-stack
> protocol using an out-of-line mechanism, and we cannot enforce our
> beliefs in the compiler: the linker is in complete control of this
> decision.
> However, we could instead have it do this:
>
> 0000000000404920 <.split.test>:
>   404920: 64 48 3b 24 25 70 00   cmp    %fs:0x70,%rsp
>   404927: 00 00
>   404929: 72 06                  jb     404931 <test+0x6>
> 000000000040492b <test>:
>   40492b: b8 ef be ad de         mov    $0xdeadbeef,%eax
>   404930: c3                     retq
>   404931: 45 31 d2               xor    %r10d,%r10d
>   404934: 45 31 db               xor    %r11d,%r11d
>   404937: e8 6d 6b 00 00         callq  40b4a9 <__morestack>
>   40493c: c3                     retq
>   40493d: eb ec                  jmp    40492b <test>
>   40493f: 90                     nop
>
> Now the decision to call either test or .split.test becomes
> explicit. This would allow us to get a linker error if we made an
> incorrect decision in my earlier
> not-really-a-suggestion-more-of-a-musing of making this knowledge
> explicit in the compiler akin to a calling convention. If the compiler
> decided that something wasn't split-stack, then it would just handle
> allocating the larger stack before the call to the underlying
> function; or, if it decided the function was split-stack, the linker
> would enforce it, and the user would get a reasonable error.

It's an interesting idea, but my immediate reaction is to ask how this
helps us in the world where we do not require source code changes.

Thanks for the thoughtful notes.

Ian
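
For reference, the first-instruction check discussed earlier in this
message (having a function-pointer stub look for a comparison against
the TCB stack limit) could be sketched in C roughly as follows. This is
only an illustration under stated assumptions, not the libgcc or linker
implementation: it covers x86-64 only, it recognizes only the single
prologue form `cmp %fs:0x70,%rsp` whose bytes appear in the listing of
test() above, and it inspects a caller-supplied byte buffer rather than
live code. The names split_prologue and starts_with_split_prologue are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encoding of `cmp %fs:0x70,%rsp`, copied from the listing of test()
   in this thread.  Assumption: x86-64 only, and only this one prologue
   form; other split-stack prologues would need additional patterns. */
static const unsigned char split_prologue[] = {
    0x64, 0x48, 0x3b, 0x24, 0x25, 0x70, 0x00, 0x00, 0x00
};

/* Hypothetical helper: return 1 if the `len` readable bytes at `code`
   begin with the stack-limit comparison, 0 otherwise. */
static int starts_with_split_prologue(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    if (len < sizeof split_prologue)
        return 0;   /* too short to hold the comparison */
    return memcmp(code, split_prologue, sizeof split_prologue) == 0;
}
```

A stub built on such a check would call the target directly when the
check succeeds, and would switch to a large stack first when it fails,
treating anything it does not recognize as non-split-stack code.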