On 6/4/23 20:49, Wang, Yanzhang wrote:
> Hi Jeff,
> Yes, there's a requirement to support backtrace based on the fp+ra.
> And the unwind/cfa is not acceptable because it will add additional
> sections to the binary. Currently, -fno-omit-frame-pointer can not
> save the ra for the leaf function. So we need to add another option
> like ARM/X86 to support consistent fp+ra stack layout for the leaf
> and non-leaf functions.

One of the things that needs to be upstreamed is long jump support
within a function. Essentially once a function reaches 1M in size we
have the real possibility that a direct jump may not reach its target.
To support this I expect that $ra is going to become a fixed register
(i.e., not available to the register allocator as a temporary). It'll be
used as a scratch register for long jump sequences.
One of the consequences of this is $ra will need to be saved in leaf
functions that are near or over 1M in size.
Note that at the time when we have to lay out the stack, we do not know
the precise length of the function. So there's a degree of "fuzz" in
the decision whether or not to save $ra in a function that is close to
the 1M limit.
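
To illustrate the "fuzz" described above, here is a minimal sketch of a conservative decision. It is not the actual GCC logic: the names `DIRECT_JUMP_REACH` and `must_save_ra` and the 4096-byte margin are hypothetical, chosen only to show why an estimate close to the limit has to be treated as if it were over the limit.

```python
# Hypothetical model, not GCC code: decide whether a prologue should save $ra
# when only an estimate of the function's size is available.

DIRECT_JUMP_REACH = 1 << 20  # the "1M" limit mentioned above, in bytes


def must_save_ra(estimated_size, is_leaf, fuzz=4096):
    """Return True if $ra should be saved in the prologue.

    `fuzz` is an assumed safety margin covering the error in the size
    estimate made at stack layout time.
    """
    if not is_leaf:
        # A non-leaf function overwrites $ra on every call it makes.
        return True
    # A leaf function needs the save only if a long jump sequence, which
    # uses $ra as scratch, might be required. Err on the side of saving.
    return estimated_size + fuzz >= DIRECT_JUMP_REACH
```

The cost of the margin is an occasional unnecessary save in a leaf function that ends up just under the limit; the alternative would be a clobbered return address.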
I don't think you can reliably know whether $ra is valid in an arbitrary
leaf function. You could implement some heuristics by looking at the
symbol table or by disassembling the prologue, but I'm guessing you
don't want to do either of those.
So what you really want is to use -fno-omit-frame-pointer and to have
$ra always saved on the stack, even in a leaf function.
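
As a rough illustration of why a consistent layout matters, the sketch below walks a simulated stack by following the fp chain. The layout is an assumption for the example, not something stated in this thread: the saved $ra sits one word below fp and the caller's fp two words below it, with 8-byte words. `backtrace` and the `memory` dictionary are hypothetical names.

```python
# Hypothetical model of an fp+ra backtrace over a simulated stack.
# Assumed layout: memory[fp - 8] holds the saved $ra and
# memory[fp - 16] holds the caller's fp.

WORD = 8  # assumed word size in bytes


def backtrace(memory, fp, max_depth=64):
    """Return the return addresses found by following the fp chain."""
    addrs = []
    for _ in range(max_depth):
        if fp == 0:
            break  # a zero fp marks the outermost frame in this model
        ra = memory.get(fp - WORD)
        prev_fp = memory.get(fp - 2 * WORD)
        if ra is None or prev_fp is None:
            break  # the slots were never written, so the chain is unusable
        addrs.append(ra)
        if prev_fp != 0 and prev_fp <= fp:
            break  # the stack grows down, so a caller's fp must be higher
        fp = prev_fp
    return addrs
```

The walker has no way to tell a leaf frame that skipped the save from one that did not, which is why the walk is only reliable if every function, leaf or not, stores both values.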
Presumably you're not suggesting any of these options be used in general
-- they're going to be used for things like embedded devices or
firmware? Also note there are low overhead unwinding schemes out there
that are already supported in various tools -- ORC and SFrame come
immediately to mind. Those may be better than building a bespoke
solution for the embedded space.
Jeff