Hi,

I'm the author of ftrace support for arm64 (aarch64) Linux. As part of ftrace, we can use the "stack tracer", which reports the maximum usage of the kernel stack:
---8<---
# cat /sys/kernel/debug/tracing/stack_max_size
4088
# cat /sys/kernel/debug/tracing/stack_trace
        Depth    Size   Location    (49 entries)
        -----    ----   --------
  0)     4088      16   __local_bh_enable_ip+0x18/0xd8
  1)     4072      32   _raw_read_unlock_bh+0x38/0x48
  2)     4040      32   xs_udp_write_space+0x44/0x50
  3)     4008      32   sock_wfree+0x88/0x90
  4)     3976      32   skb_release_head_state+0x70/0xa0
    [snip]
 44)      808      32   load_elf_binary+0x29c/0x10d0
 45)      776     224   search_binary_handler+0xbc/0x208
 46)      552      96   do_execveat_common.isra.15+0x4e4/0x690
 47)      456     112   SyS_execve+0x4c/0x60
 48)      344     344   el0_svc_naked+0x24/0x28
--->8---

Here, "Depth" (and hence "Size") is determined, after scanning the stack, from the saved fp (more precisely, that value + 0x10) in each stack frame rather than from the stack pointer, which is not saved. (Please note that the arm64 kernel is always compiled with -fno-omit-frame-pointer.)

Since fp is updated only after branching into a function, and the prologue allocates not only the function's stack frame but also the callee's local variables, using this saved value of fp, or the sp of the caller function, as "Depth" is not appropriate for calculating the stack size of a function.

So I'd like to introduce a function prologue analyzer that determines the size allocated by a function's prologue and deducts it from "Depth". My implementation of this analyzer has been submitted to the linux-arm-kernel mailing list[1]. I borrowed some ideas from gdb's analyzer[2], especially the instruction decoding loop and stopping the decoding when it leaves a basic block, but I implemented my own simplified version because the gdb version seems to do a bit more than we need here.

Anyhow, since it is somewhat heuristic (and may not be maintainable in the long term), could you please review it from the broader viewpoint of the toolchain?

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/393721.html
[2] aarch64_analyze_prologue() in gdb/aarch64-tdep.c

Thanks,
-Takahiro AKASHI
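
Appendix (illustration only): to make the idea of a prologue analyzer more concrete, below is a minimal sketch in C. It is not the code posted in [1] and not gdb's analyzer; the function names are hypothetical. It assumes that the prologue allocates stack only with "stp x29, x30, [sp, #-imm]!" and "sub sp, sp, #imm", and it stops at the first instruction it does not recognize, which is a cruder stop condition than detecting the end of a basic block.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration; names and structure are
 * assumptions, not taken from the kernel patch or from gdb.
 */

/* Matches: stp x29, x30, [sp, #-imm]!  (64-bit STP, pre-indexed) */
static int is_stp_fp_lr_pre(uint32_t insn)
{
	/* Everything except the imm7 field (bits 21..15) is fixed. */
	return (insn & 0xffc07fffu) == 0xa9807bfdu;
}

/* Matches: sub sp, sp, #imm{, lsl #12}  (64-bit SUB immediate) */
static int is_sub_sp_imm(uint32_t insn)
{
	/* The shift bit (22) and imm12 (bits 21..10) are variable. */
	return (insn & 0xff8003ffu) == 0xd10003ffu;
}

/*
 * Sum the stack bytes allocated by the leading instructions of a
 * function. Decoding stops at the first unrecognized instruction.
 */
size_t prologue_stack_size(const uint32_t *insns, size_t count)
{
	size_t size = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		uint32_t insn = insns[i];

		if (is_stp_fp_lr_pre(insn)) {
			/* imm7 is signed and scaled by 8 for 64-bit STP. */
			int32_t imm7 = (int32_t)((insn >> 15) & 0x7f);

			if (imm7 & 0x40)
				imm7 -= 0x80;
			if (imm7 >= 0)
				break;	/* not a stack allocation */
			size += (size_t)(-imm7) * 8;
		} else if (is_sub_sp_imm(insn)) {
			uint32_t imm12 = (insn >> 10) & 0xfff;

			if (insn & (1u << 22))
				imm12 <<= 12;	/* optional lsl #12 */
			size += imm12;
		} else {
			break;
		}
	}
	return size;
}
```

For example, given the two instructions 0xa9bd7bfd ("stp x29, x30, [sp, #-48]!") and 0x910003fd ("mov x29, sp"), this sketch reports 48 bytes. A real analyzer would need to handle more prologue forms than the two assumed here.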