Hi Jeff,

Thanks for your comments. I have a few questions about points I don't quite understand.

> One of the things that needs to be upstreamed is long jump support within
> a function.  Essentially once a function reaches 1M in size we have the
> real possibility that a direct jump may not reach its target.
> 
> To support this I expect that $ra is going to become a fixed register (ie,
> not available to the register allocator as a temporary).  It'll be used
> as a scratch register for long jump sequences.
> 
> One of the consequences of this is $ra will need to be saved in leaf
> functions that are near or over 1M in size.
> 
> Note that at the time when we have to lay out the stack, we do not know
> the precise length of the function.  So there's a degree of "fuzz" in the
> decision whether or not to save $ra in a function that is close to the 1M
> limit.

Do you mean that a long jump to an offset of more than 1M will need multiple
jal instructions, and that each jal will save $ra?

If yes, I'm confused about how saving $ra there affects the function
prologue. We will save fp+ra in the prologue, and it seems that any later
$ra save will not modify the $ra value already saved.
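
To make the 1M limit concrete, here is a minimal sketch of the reach of a
single jal. It is my own illustration, not GCC code: the helper name
jal_reaches and the two macros are hypothetical. It relies only on the fact
that jal encodes a signed 21-bit byte offset whose lowest bit is always zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* jal holds a signed 21-bit byte offset with bit 0 fixed at zero, so a
   single jal reaches targets in [-1 MiB, +1 MiB - 2] relative to itself. */
#define JAL_MIN_OFFSET (-(INT64_C(1) << 20))
#define JAL_MAX_OFFSET ((INT64_C(1) << 20) - 2)

/* Hypothetical helper: can one jal located at address `from` reach `to`? */
bool jal_reaches(uint64_t from, uint64_t to)
{
    int64_t off = (int64_t)(to - from);

    /* Jump targets must be 2-byte aligned relative to the jump. */
    if ((off & 1) != 0)
        return false;

    return off >= JAL_MIN_OFFSET && off <= JAL_MAX_OFFSET;
}
```

Any jump inside a function for which this check fails is what would need the
longer sequence that uses a scratch register.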

> I don't think you can reliably know if $ra is valid in an arbitrary leaf
> function or not.  You could implement some heuristics by looking at the
> symbol table (which I'm guessing you don't want to do) or by
> disassembling the prologue (again, I'm guessing you don't want to do that
> either).

I think the answer is yes ($ra is not valid) when we want to read the return
address to the parent function directly from $ra in the function body. But we
can get the correct return address from fp plus an offset if we save both in
the prologue, is that right?
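
As a rough sketch of what I mean by backtracing from fp+ra, the walker below
follows the chain of saved fp values and collects the saved ra from each
frame. It runs on a simulated stack so it can be tried on any host. The
layout is an assumption for illustration: the saved ra sits one word below
fp and the caller's fp two words below (fp-8 and fp-16 on RV64). The name
walk_frames and the use of array indices in place of addresses are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Walk a simulated stack of machine words.
   Assumed layout (illustration only): for a frame whose frame pointer is
   the word index `fp`, stack[fp - 1] holds the saved ra and stack[fp - 2]
   holds the caller's fp. A saved fp of 0 marks the outermost frame.
   Returns the number of return addresses written to `out`. */
size_t walk_frames(const uintptr_t *stack, size_t nwords, size_t fp,
                   uintptr_t *out, size_t max)
{
    size_t n = 0;

    while (n < max && fp >= 2 && fp <= nwords) {
        out[n++] = stack[fp - 1];            /* saved return address */

        size_t prev = (size_t)stack[fp - 2]; /* caller's frame pointer */

        /* The stack grows down, so a caller's frame must sit at a higher
           index. Anything else (including the 0 terminator) ends the walk
           and also guards against loops in a corrupted chain. */
        if (prev <= fp)
            break;
        fp = prev;
    }
    return n;
}
```

The walk only works if every function on the chain, leaf or not, stores
fp+ra at the same place, which is the consistent layout the option is meant
to provide.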

> Meaning that what you really want is to be using -fno-omit-frame-pointer
> and for $ra to always be saved in the stack, even in a leaf function.

This is another possible solution, but it would change the default behavior
of -fno-omit-frame-pointer.

> Presumably you're not suggesting any of these options be used in general
> -- they're going to be used for things like embedded devices or firmware?
> Also note there are low overhead unwinding schemes out there that are
> already supported in various tools -- ORC & SFRAME come
> immediately to mind.   Those may be better than building a bespoke
> solution for the embedded space.

Yes, you're right. I forgot to introduce the background of the requirement.
It will be used in firmware, where DWARF or other unwinding information may
not be acceptable.

Yanzhang

> -----Original Message-----
> From: Jeff Law <jeffreya...@gmail.com>
> Sent: Wednesday, June 7, 2023 10:13 AM
> To: Wang, Yanzhang <yanzhang.w...@intel.com>; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> <pan2...@intel.com>
> Subject: Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
> 
> 
> 
> On 6/4/23 20:49, Wang, Yanzhang wrote:
> > Hi Jeff,
> >
> > Yes, there's a requirement to support backtrace based on the fp+ra.
> > And the unwind/cfa is not acceptable because it will add additional
> > sections to the binary. Currently, -fno-omit-frame-pointer can not
> > save the ra for the leaf function. So we need to add another option
> > like ARM/X86 to support consistent fp+ra stack layout for the leaf and
> > non-leaf functions.
> One of the things that needs to be upstreamed is long jump support within
> a function.  Essentially once a function reaches 1M in size we have the
> real possibility that a direct jump may not reach its target.
> 
> To support this I expect that $ra is going to become a fixed register (ie,
> not available to the register allocator as a temporary).  It'll be used
> as a scratch register for long jump sequences.
> 
> One of the consequences of this is $ra will need to be saved in leaf
> functions that are near or over 1M in size.
> 
> Note that at the time when we have to lay out the stack, we do not know
> the precise length of the function.  So there's a degree of "fuzz" in the
> decision whether or not to save $ra in a function that is close to the 1M
> limit.
> 
> I don't think you can reliably know if $ra is valid in an arbitrary leaf
> function or not.  You could implement some heuristics by looking at the
> symbol table (which I'm guessing you don't want to do) or by
> disassembling the prologue (again, I'm guessing you don't want to do that
> either).
> 
> Meaning that what you really want is to be using -fno-omit-frame-pointer
> and for $ra to always be saved in the stack, even in a leaf function.
> 
> Presumably you're not suggesting any of these options be used in general
> -- they're going to be used for things like embedded devices or firmware?
> Also note there are low overhead unwinding schemes out there that are
> already supported in various tools -- ORC & SFRAME come
> immediately to mind.   Those may be better than building a bespoke
> solution for the embedded space.
> 
> 
> 
> Jeff
