On 6/6/23 21:50, Wang, Yanzhang wrote:
Hi Jeff,

Thanks for your comments. I have a few questions about points I don't quite understand.

One of the things that needs to be upstreamed is long jump support within
a function.  Essentially once a function reaches 1M in size we have the
real possibility that a direct jump may not reach its target.
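To make the 1M figure concrete, here is a minimal sketch of the reach check. It assumes the RISC-V jal encoding (a signed, 2-byte-aligned pc-relative offset of roughly +/-1 MiB); the function and constant names are made up for illustration.

```python
# Assumption: RISC-V jal, whose immediate encodes a signed,
# 2-byte-aligned pc-relative offset of roughly +/-1 MiB.
JAL_MIN = -(1 << 20)       # lowest encodable byte offset
JAL_MAX = (1 << 20) - 2    # highest encodable byte offset

def fits_in_jal(offset):
    """Return True if a pc-relative byte offset fits in a single jal."""
    return offset % 2 == 0 and JAL_MIN <= offset <= JAL_MAX
```

Once a function grows past about 1M, a jump from one end to the other no longer satisfies this check, which is why a longer sequence is needed.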

To support this I expect that $ra is going to become a fixed register (ie,
not available to the register allocator as a temporary).  It'll be used
as a scratch register for long jump sequences.

One of the consequences of this is $ra will need to be saved in leaf
functions that are near or over 1M in size.

Note that at the time when we have to lay out the stack, we do not know
the precise length of the function.  So there's a degree of "fuzz" in the
decision whether or not to save $ra in a function that is close to the 1M
limit.
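One way to picture the "fuzz" is as a conservative threshold. The sketch below is only an illustration, not what GCC does: the size estimate, the margin value, and the function name are all hypothetical.

```python
JAL_REACH = 1 << 20  # about 1 MiB, the limit discussed above

def should_save_ra(estimated_size, margin=64 * 1024):
    """Conservatively decide whether a leaf function should save ra.

    estimated_size: byte-size estimate available at stack-layout time,
                    which may differ from the final function length.
    margin:         hypothetical safety margin covering estimate error.
    """
    return estimated_size + margin >= JAL_REACH
```

A function well under the limit skips the save, while anything within the margin of 1M saves $ra even though it may turn out not to need it.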

Do you mean that a long jump to an offset of more than 1M will need multiple
jal instructions, and that each jal will save $ra?
Long jumps are implemented as an indirect jump, which needs a scratch register to hold the high part of the jump target address.
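As a rough illustration of the "high part" held in the scratch register, the sketch below splits a pc-relative offset into an upper part and a signed 12-bit lower part. It assumes an auipc/jalr-style pair, which the text above does not name explicitly; split_offset is a hypothetical helper.

```python
def split_offset(offset):
    """Split a pc-relative byte offset into (hi20, lo12).

    hi20 is the upper part that would be loaded into the scratch
    register; lo12 is the signed 12-bit remainder added by the
    indirect jump. Adding 0x800 before shifting rounds hi20 so that
    lo12 always lands in [-2048, 2047].
    """
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    return hi20, lo12
```

For example, split_offset(0x12FFF) gives a high part of 0x13 and a low part of -1, and the two always recombine as (hi20 << 12) + lo12.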


If yes, I'm confused about how the $ra saving affects the function
prologue. We save fp+ra in the prologue, so it seems a later save of $ra
would not modify the $ra that was already saved.
The long branch handling is done at the assembler level. So the clobbering of $ra isn't visible to the compiler. Thus the compiler has to be extremely careful to not hold values in $ra because the assembler may clobber $ra.

This ultimately comes back to the phase ordering problem. At register allocation time we don't know if we need long jumps or not. So we don't know if $ra is potentially clobbered by the assembler. A similar phase ordering problem exists in prologue/epilogue generation.

The other approach to long branch handling would be to do it all in the compiler. I would actually prefer this approach, but it's not likely to land in the near term.



I think the answer is yes (it is not valid) when we want to read the parent
function's return address directly from $ra in the function body. But if we
save them in the prologue, we can get the correct return address from fp plus
an offset. Is that right?
Right.  You'll be able to get the value of $ra out of the stack.
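A small simulation of what reading $ra "out of the stack" looks like when every frame follows the same convention. The layout here is an assumption for illustration (64-bit slots, saved ra at fp - 8, caller's fp at fp - 16, a saved fp of 0 ending the chain), and the addresses are invented.

```python
WORD = 8  # assumption: 64-bit target with 8-byte stack slots

def walk_frames(memory, fp, max_depth=64):
    """Return the saved return addresses found along the frame-pointer chain.

    memory: dict mapping byte address -> stored value (a simulated stack).
    Assumed layout: [fp - 8] holds the saved ra, [fp - 16] holds the
    caller's fp. A saved fp of 0 terminates the walk.
    """
    addrs = []
    for _ in range(max_depth):
        if fp == 0:
            break
        addrs.append(memory[fp - WORD])   # saved ra of this frame
        fp = memory[fp - 2 * WORD]        # step to the caller's frame
    return addrs

# Two simulated frames: an outermost function and its callee.
stack = {
    0x1000 - 8: 0x400100, 0x1000 - 16: 0,
    0x0FE0 - 8: 0x400200, 0x0FE0 - 16: 0x1000,
}
```

Calling walk_frames(stack, 0x0FE0) yields the callee's return address followed by the caller's. The walk only works if every function, leaf or not, stores both values at those offsets.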




Meaning that what you really want is to be using -fno-omit-frame-pointer
and for $ra to always be saved in the stack, even in a leaf function.

This is also another solution but will change the default behavior of
-fno-omit-frame-pointer.
That's OK. While -f options are target independent options, targets are allowed to adjust certain behaviors based on those options.

If you're not going to use DWARF, then my recommendation is to ensure that the data you need is *always* available in the stack at known offsets. That will mean your code isn't optimized as well. It also means hand-written assembly code has to follow the conventions, you can't link against libraries that do not follow those conventions, and so on. But that's the price you pay for not using DWARF (or presumably ORC/SFrame, which I haven't studied in detail).

Jeff





