Hi Jeff,

Sorry for the late reply.

> The long branch handling is done at the assembler level. So the clobbering
> of $ra isn't visible to the compiler. Thus the compiler has to be
> extremely careful to not hold values in $ra because the assembler may
> clobber $ra.

If the assembler changes the behavior of $ra, it seems the rules we defined in
riscv.cc will be ignored. For example, the $ra save generated by this patch
may be modified by the assembler, and everything that depends on it will be
wrong. So implementing the long jump in the compiler is better. Do I
understand it correctly?

> If you're not going to use dwarf, then my recommendation is to ensure that
> the data you need is *always* available in the stack at known
> offsets. That will mean your code isn't optimized as well. It means
> hand written assembly code has to follow the conventions, you can't link
> against libraries that do not follow those conventions, etc etc. But
> that's the price you pay for not using dwarf (or presumably ORC/SFRAME
> which I haven't studied in detail).

Yes, that's right. All the libraries need to follow the same convention. But
as you said, this is the price we pay if we choose this solution. Fortunately,
it will only be used in special scenarios.

---

Jeff, do you have any other comments about this patch? Should we add a
description somewhere in the docs?

Thanks,
Yanzhang

> -----Original Message-----
> From: Jeff Law <jeffreya...@gmail.com>
> Sent: Thursday, June 8, 2023 11:05 PM
> To: Wang, Yanzhang <yanzhang.w...@intel.com>; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> <pan2...@intel.com>
> Subject: Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
>
> On 6/6/23 21:50, Wang, Yanzhang wrote:
> > Hi Jeff,
> >
> > Thanks for your comments. I have a few questions about points I don't
> > quite understand.
> >
> >> One of the things that needs to be upstreamed is long jump support
> >> within a function. Essentially once a function reaches 1M in size we
> >> have the real possibility that a direct jump may not reach its target.
> >>
> >> To support this I expect that $ra is going to become a fixed register
> >> (ie, not available to the register allocator as a temporary).
> >> It'll be used as a scratch register for long jump sequences.
> >>
> >> One of the consequences of this is $ra will need to be saved in leaf
> >> functions that are near or over 1M in size.
> >>
> >> Note that at the time when we have to lay out the stack, we do not
> >> know the precise length of the function. So there's a degree of
> >> "fuzz" in the decision whether or not to save $ra in a function that
> >> is close to the 1M limit.
> >
> > Do you mean that a long jump to an offset of more than 1M will need
> > multiple jal instructions, and each jal will save $ra?
> Long jumps are implemented as an indirect jump which needs a scratch
> register to hold the high part of the jump target address.
>
> > If yes, I'm confused about how that affects the $ra save in the
> > function prologue. We save fp+ra in the prologue, and a later $ra
> > save does not seem to modify the $ra that was already saved.
> The long branch handling is done at the assembler level. So the clobbering
> of $ra isn't visible to the compiler. Thus the compiler has to be
> extremely careful to not hold values in $ra because the assembler may
> clobber $ra.
>
> This ultimately comes back to the phase ordering problem. At register
> allocation time we don't know if we need long jumps or not. So we don't
> know if $ra is potentially clobbered by the assembler. A similar phase
> ordering problem exists in the prologue/epilogue generation.
>
> The other approach to long branch handling would be to do it all in the
> compiler. I would actually prefer this approach, but it's not likely to
> land in the near term.
>
> > I think the answer is yes (it is not valid) when we want to get the
> > return address to the parent function directly from $ra in the function
> > body. But we can still get the right return address from fp plus an
> > offset if we save them in the prologue. Is that right?
> Right. You'll be able to get the value of $ra out of the stack.
>
> >> Meaning that what you really want is to be using
> >> -fno-omit-frame-pointer and for $ra to always be saved in the stack,
> >> even in a leaf function.
> >
> > This is also another solution but will change the default behavior of
> > -fno-omit-frame-pointer.
> That's OK. While -f options are target independent options, targets are
> allowed to adjust certain behaviors based on those options.
>
> If you're not going to use dwarf, then my recommendation is to ensure that
> the data you need is *always* available in the stack at known
> offsets. That will mean your code isn't optimized as well. It means
> hand written assembly code has to follow the conventions, you can't link
> against libraries that do not follow those conventions, etc etc. But
> that's the price you pay for not using dwarf (or presumably ORC/SFRAME
> which I haven't studied in detail).
>
> Jeff
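
To illustrate the "always available in the stack at known offsets" point, here
is a minimal sketch in Python that walks a simulated stack instead of real
memory. The layout is an assumption made for the example (RV64, saved ra at
fp - 8 and the caller's fp at fp - 16) and should be checked against the
prologue the patch actually emits. The names walk_frames and push_frame and
all addresses are made up for illustration.

```python
# Sketch only: a simulated RV64 stack, not real unwinder code.
# Assumed layout (an assumption of this example, verify against the prologue):
#   [fp - 8]  = saved ra (return address into the caller)
#   [fp - 16] = saved fp of the caller
WORD = 8  # bytes per stack slot on RV64


def walk_frames(memory, fp, max_depth=64):
    """Return the saved return addresses, innermost frame first.

    memory maps an address to the 8-byte value stored there.
    The walk stops at fp == 0, at a missing slot, or at max_depth.
    """
    return_addresses = []
    for _ in range(max_depth):
        if fp == 0:
            break
        ra_slot = fp - WORD
        fp_slot = fp - 2 * WORD
        if ra_slot not in memory or fp_slot not in memory:
            # A frame that does not follow the convention ends the walk.
            break
        return_addresses.append(memory[ra_slot])
        fp = memory[fp_slot]
    return return_addresses


def push_frame(memory, fp, caller_fp, ra):
    """Record what a conforming prologue would store for one frame."""
    memory[fp - WORD] = ra
    memory[fp - 2 * WORD] = caller_fp


# Three nested calls: main -> middle -> leaf. All addresses are made up.
stack = {}
push_frame(stack, fp=0x7000, caller_fp=0, ra=0x10000)       # main
push_frame(stack, fp=0x6FC0, caller_fp=0x7000, ra=0x10420)  # middle
push_frame(stack, fp=0x6F80, caller_fp=0x6FC0, ra=0x10888)  # leaf saves ra too

print([hex(ra) for ra in walk_frames(stack, fp=0x6F80)])
```

If the leaf frame had skipped the save, the walk would stop at the first frame
and return nothing, which is the situation where every library and every piece
of hand written assembly has to follow the same convention.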