On Mon, Jan 11, 2016 at 10:11 PM, AKASHI Takahiro
<takahiro.aka...@linaro.org> wrote:
> Will,
>
>
> On 01/09/2016 12:53 AM, Will Deacon wrote:
>>
>> On Fri, Jan 08, 2016 at 02:36:32PM +0900, AKASHI Takahiro wrote:
>>>
>>> On 01/07/2016 11:56 PM, Richard Earnshaw (lists) wrote:
>>>>
>>>> On 07/01/16 14:22, Will Deacon wrote:
>>>>>
>>>>> On Thu, Dec 24, 2015 at 04:57:54PM +0900, AKASHI Takahiro wrote:
>>>>>>
>>>>>> So I'd like to introduce a function prologue analyzer to determine
>>>>>> a size allocated by a function's prologue and deduce it from "Depth".
>>>>>> My implementation of this analyzer has been submitted to
>>>>>> linux-arm-kernel mailing list[1].
>>>>>> I borrowed some ideas from gdb's analyzer[2], especially a loop of
>>>>>> instruction decoding as well as stop of decoding at exiting a basic
>>>>>> block, but implemented my own simplified one because gdb version
>>>>>> seems to do a bit more than what we expect here.
>>>>>> Anyhow, since it is somewhat heuristic (and may not be maintainable
>>>>>> for a long term), could you review it from a broader viewpoint of
>>>>>> toolchain, please?
>>>>>>
>>>>> My main issue with this is that we cannot rely on the frame layout
>>>>> generated by the compiler and there's little point in asking for
>>>>> commitment here. Therefore, the heuristics will need updating as and
>>>>> when we identify new frames that we can't handle. That's pretty fragile
>>>>> and puts us on the back foot when faced with newer compilers. This
>>>>> might be sustainable if we don't expect to encounter much variation,
>>>>> but even that would require some sort of "buy-in" from the various
>>>>> toolchain communities.
>>>>>
>>>>> GCC already has an option (-fstack-usage) to determine the stack usage
>>>>> on a per-function basis and produce a report at build time. Why can't
>>>>> we use that to provide the information we need, rather than attempt to
>>>>> compute it at runtime based on your analyser?
>>>>>
>>>>> If -fstack-usage is not sufficient, understanding why might allow us to
>>>>> propose a better option.
>>>>
>>>>
>>>> Can you not use the dwarf frame unwind data? That's always sufficient
>>>> to recover the CFA (canonical frame address - the value in SP when
>>>> executing the first instruction in a function). It seems to me it's
>>>> unlikely you're going to need something that's an exceedingly high
>>>> performance operation.
>>>
>>>
>>> Thank you for your comment.
>>> Yeah, but we need some utility routines to handle unwind
>>> data(.debug_frame).
>>> In fact, some guy has already attempted to merge (part of) libunwind into
>>> the kernel[1], but it was rejected by the kernel community (including
>>> Linus if I correctly remember). It seems that they thought the code was
>>> still buggy.
>>
>>
>> The ARC guys seem to have sneaked something in for their architecture:
>>
>>
>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arc/kernel/unwind.c
>>
>> so it might not be impossible if we don't require all the bells and
>> whistles of libunwind.
>
>
> Thanks. I didn't notice this code.
>
>>> That is one of reasons that I wanted to implement my own analyzer.
>>
>>
>> I still don't understand why you can't use fstack-usage. Can you please
>> tell me why that doesn't work? Am I missing something?
>
>
> I don't know how gcc calculates the usage here, but I guess it would be more
> robust than my analyzer.

-fstack-usage does not work when there are VLAs or alloca's. So there
is no way to figure that part out without analysis of the actual
assembly code. I still think dwarf unwind info is the way forward.

Thanks,
Andrew

>
> The issues, that come up to my mind, are
> - -fstack-usage generates a separate output file, *.su and so we have to
>   manage them to be incorporated in the kernel binary.
>   This implies that (common) kernel makefiles might have to be a bit
>   changed.
> - more worse, what if kernel module case? We will have no way to let the
>   kernel know the stack usage without adding an extra step at loading.
>
> If we need to put some information about stack usage in the kernel, that
> should not be much different from dwarf frame data (.eh_frame).
>
> -Takahiro AKASHI
>
>
>> Will
>>
>
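
To make the VLA/alloca limitation concrete: each line of a *.su file carries a qualifier field, and a function marked "dynamic" without "bounded" only has the statically known part of its frame reported. Below is a minimal sketch of how a build step might flag such functions. It assumes the tab-separated layout described in the GCC documentation for -fstack-usage (function identifier, size in bytes, qualifiers); the helper names and the sample entries are hypothetical and not taken from any kernel build.

```python
from typing import NamedTuple


class StackUsage(NamedTuple):
    function: str          # e.g. "file.c:line:col:name" as emitted by GCC
    size: int              # bytes; only the bounded part for unbounded frames
    qualifiers: frozenset  # subset of {"static", "dynamic", "bounded"}


def parse_su_line(line: str) -> StackUsage:
    """Parse one line of a *.su file (assumed to be tab-separated)."""
    name, size, quals = line.rstrip("\n").split("\t")
    return StackUsage(name, int(size), frozenset(quals.split(",")))


def is_unbounded(entry: StackUsage) -> bool:
    """True if the reported size is not an upper bound (VLA or alloca)."""
    return "dynamic" in entry.qualifiers and "bounded" not in entry.qualifiers


if __name__ == "__main__":
    # Hypothetical sample entries, made up for illustration.
    sample = [
        "demo.c:10:5:fixed_frame\t80\tstatic",
        "demo.c:20:5:vla_frame\t48\tdynamic",
        "demo.c:30:5:push_args\t32\tdynamic,bounded",
    ]
    for entry in map(parse_su_line, sample):
        status = "size NOT an upper bound" if is_unbounded(entry) else "ok"
        print(f"{entry.function}: {entry.size} bytes ({status})")
```

A check like this can only report which functions fall outside what -fstack-usage can bound; it does not recover the real frame size for them, which is the gap the unwind-data approach is meant to cover.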