On Fri, Jan 22, 2021 at 04:18:54PM +0000, Alexandre Truong wrote:
> On arm64 in frame pointer mode (e.g. perf record --call-graph fp),
> use dwarf unwind info to check if the link register is the return
> address in order to inject it to the frame pointer stack.
> 
> Write the following application:
> 
>       int a = 10;
> 
>       void f2(void)
>       {
>               for (int i = 0; i < 1000000; i++)
>                       a *= a;
>       }
> 
>       void f1()
>       {
>               f2();
>       }
> 
>       int main (void)
>       {
>               f1();
>               return 0;
>       }
> 
> with the following compilation flags:
>       gcc -g -fno-omit-frame-pointer -fno-inline -O1
> 
> The compiler omits the frame pointer for f2 on arm. This is a problem
> with any leaf call: for example, an application with many different
> calls to malloc() would always omit the calling frame, even if it
> can be determined.
> 
>       ./perf record --call-graph fp ./a.out
>       ./perf report
> 
> currently gives the following stack:
> 
> 0xffffea52f361
> _start
> __libc_start_main
> main
> f2

reproduced on x86 as well

> +static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
> +{
> +     return callchain_param.record_mode != CALLCHAIN_FP || !sample->user_regs.regs
> +             || sample->user_regs.mask != PERF_REGS_MASK;
> +}
> +
> +static int add_entry(struct unwind_entry *entry, void *arg)
> +{
> +     struct entries *entries = arg;
> +
> +     entries->stack[entries->i++] = entry->ip;
> +     return 0;
> +}
> +
> +u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread)
> +{
> +     u64 leaf_frame;
> +     struct entries entries = {{0, 0}, 0};
> +
> +     if (get_leaf_frame_caller_enabled(sample))

the name suggests you'd want to continue if it's true
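
A minimal, untested sketch of the sense the name implies: the helper
returns true only when the lookup can run, and the caller bails out on
the negation. The types and SKETCH_REGS_MASK below are stand-ins so the
snippet compiles on its own; they are not the real perf definitions, and
passing the record mode as a parameter is only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the perf types, only so the sketch is self-contained. */
enum record_mode { CALLCHAIN_NONE, CALLCHAIN_FP, CALLCHAIN_DWARF };

struct regs_dump {
	uint64_t mask;
	uint64_t *regs;
};

struct sample {
	struct regs_dump user_regs;
};

/* Hypothetical value; the real PERF_REGS_MASK is arch-defined. */
#define SKETCH_REGS_MASK 0xffffffffULL

/*
 * Logical negation of the condition in the patch: true only when
 * frame pointer mode is in use and a full user register dump exists.
 */
static bool leaf_frame_caller_enabled(enum record_mode mode,
				      const struct sample *sample)
{
	return mode == CALLCHAIN_FP &&
	       sample->user_regs.regs != NULL &&
	       sample->user_regs.mask == SKETCH_REGS_MASK;
}
```

The call site would then read "if (!leaf_frame_caller_enabled(...)) return 0;",
which matches what the name says.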

> +             return 0;
> +
> +     unwind__get_entries(add_entry, &entries, thread, sample, 2);

I'm scratching my head over how this unwinds anything: you enabled just
the registers, not the stack, right? So the unwind code would just do
the IP -> LR + 1 shift?

thanks,
jirka

> +     leaf_frame = callchain_param.order == ORDER_CALLER ?
> +             entries.stack[0] : entries.stack[1];
> +
> +     if (leaf_frame + 1 == sample->user_regs.regs[PERF_REG_ARM64_LR])
> +             return sample->user_regs.regs[PERF_REG_ARM64_LR];
> +     return 0;
> +}

SNIP
