Re: [RFC][AArch64] function prologue analyzer in linux kernel

Andrew Pinski Mon, 11 Jan 2016 23:02:29 -0800

On Mon, Jan 11, 2016 at 10:11 PM, AKASHI Takahiro
<[email protected]> wrote:
> Will,
>
>
> On 01/09/2016 12:53 AM, Will Deacon wrote:
>>
>> On Fri, Jan 08, 2016 at 02:36:32PM +0900, AKASHI Takahiro wrote:
>>>
>>> On 01/07/2016 11:56 PM, Richard Earnshaw (lists) wrote:
>>>>
>>>> On 07/01/16 14:22, Will Deacon wrote:
>>>>>
>>>>> On Thu, Dec 24, 2015 at 04:57:54PM +0900, AKASHI Takahiro wrote:
>>>>>>
>>>>>> So I'd like to introduce a function prologue analyzer to determine
>>>>>> the size allocated by a function's prologue and deduct it from "Depth".
>>>>>> My implementation of this analyzer has been submitted to the
>>>>>> linux-arm-kernel mailing list[1].
>>>>>> I borrowed some ideas from gdb's analyzer[2], especially the
>>>>>> instruction-decoding loop and stopping the decode at the end of a
>>>>>> basic block, but implemented my own simplified one because the gdb
>>>>>> version seems to do a bit more than what we need here.
>>>>>> Anyhow, since it is somewhat heuristic (and may not be maintainable
>>>>>> in the long term), could you review it from the broader viewpoint of
>>>>>> the toolchain, please?
>>>>>>
>>>>> My main issue with this is that we cannot rely on the frame layout
>>>>> generated by the compiler and there's little point in asking for
>>>>> commitment here. Therefore, the heuristics will need updating as and
>>>>> when we identify new frames that we can't handle. That's pretty fragile
>>>>> and puts us on the back foot when faced with newer compilers. This
>>>>> might be sustainable if we don't expect to encounter much variation,
>>>>> but even that would require some sort of "buy-in" from the various
>>>>> toolchain communities.
>>>>>
>>>>> GCC already has an option (-fstack-usage) to determine the stack usage
>>>>> on a per-function basis and produce a report at build time. Why can't
>>>>> we use that to provide the information we need, rather than attempt to
>>>>> compute it at runtime based on your analyser?
>>>>>
>>>>> If -fstack-usage is not sufficient, understanding why might allow us to
>>>>> propose a better option.
>>>>
>>>>
>>>> Can you not use the dwarf frame unwind data?  That's always sufficient
>>>> to recover the CFA (canonical frame address - the value in SP when
>>>> executing the first instruction in a function).  It seems to me it's
>>>> unlikely you're going to need something that's an exceedingly high
>>>> performance operation.
>>>
>>>
>>> Thank you for your comment.
>>> Yeah, but we need some utility routines to handle unwind
>>> data (.debug_frame).
>>> In fact, someone has already attempted to merge (part of) libunwind into
>>> the kernel[1], but it was rejected by the kernel community (including
>>> Linus, if I remember correctly). It seems that they thought the code was
>>> still buggy.
>>
>>
>> The ARC guys seem to have sneaked something in for their architecture:
>>
>>
>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arc/kernel/unwind.c
>>
>> so it might not be impossible if we don't require all the bells and
>> whistles of libunwind.
>
>
> Thanks. I didn't notice this code.
>
>>> That is one of the reasons that I wanted to implement my own analyzer.
>>
>>
>> I still don't understand why you can't use -fstack-usage. Can you please
>> tell me why that doesn't work? Am I missing something?
>
>
> I don't know how gcc calculates the usage here, but I guess it would be more
> robust than my analyzer.


-fstack-usage does not work when there are VLAs or alloca's.  So there
is no way to figure that part out without analysis of the actual
assembly code.
I still think dwarf unwind info is the way forward.
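
To illustrate the VLA case, here is a minimal sketch (the function names
fixed_frame and vla_frame are hypothetical, not taken from the kernel or
from the patch under discussion). The first function has a frame size the
compiler can know at build time; the second allocates a variable-length
array, so part of its stack usage depends on a run-time argument. A call
to alloca() would behave similarly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: the frame size is fixed and known at compile time. */
size_t fixed_frame(void)
{
    char buf[64];

    memset(buf, 1, sizeof(buf));
    return sizeof(buf);
}

/*
 * Hypothetical example: the VLA makes the stack adjustment depend on n,
 * so a build-time report can only describe the statically known part.
 */
size_t vla_frame(size_t n)
{
    char buf[n ? n : 1];    /* a VLA must have at least one element */

    memset(buf, 1, sizeof(buf));
    return sizeof(buf);     /* evaluated at run time for a VLA */
}
```

Calling vla_frame(10) and vla_frame(4096) uses different amounts of stack
from the same code, which is why a single per-function number computed at
build time cannot cover this case.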

Thanks,
Andrew


>
> The issues that come to mind are:
> - -fstack-usage generates a separate output file (*.su), so we have to
>   arrange for those files to be incorporated into the kernel binary.
>   This implies that the (common) kernel makefiles might have to be
>   changed a bit.
> - Worse, what about kernel modules? We would have no way to let the
>   kernel know their stack usage without adding an extra step at load
>   time.
>
> If we need to put some information about stack usage in the kernel, that
> should not be much different from dwarf frame data (.eh_frame).
>
> -Takahiro AKASHI
>
>
>> Will
>>
>
