On Thu, 2016-04-07 at 14:56 +0530, Naveen N. Rao wrote: > On 2016/04/07 06:19PM, Balbir Singh wrote: > > > > > > On 06/04/16 22:32, Naveen N. Rao wrote: > > > > > > This patchset fixes three issues found with perf probe on ppc64le: > > > 1. 'perf test kallsyms' failure on ppc64le (reported by Michael > > > Ellerman). This was due to the symbols being fixed up during symbol > > > table load. This is fixed in patch 2 by delaying symbol fixup until > > > later. > > > 2. perf probe function offset was being calculated from the local entry > > > point (LEP), which does not match user expectation when trying to look > > > at function disassembly output (reported by Ananth N). This is fixed for > > > kallsyms in patch 1 and for symbol table in patch 2. > > I think the bit where the offset is w.r.t LEP when using a name, but w.r.t > > GEP when using function+offset can be confusing. > Thanks for your review! > > The rationale for this is actually from the end-user perspective. The > two use cases we are considering are: > 1. User just wants to probe at function entry point: > # perf probe _do_fork > > In this case, the user most definitely needs the local entry point, > without which the probe won't be hit. So, for this case, we > automatically insert the probe at the LEP. > > [We really only want to alter perf probe behavior in this case only, but > we were incorrectly changing the behavior of perf with the below > scenario as well.] > > 2. User wants to probe at a specific location. In this case, the user > most likely starts by looking at the function disassembly. For instance: > # objdump -S -d vmlinux.bak | grep -A100 \<_do_fork\>: > c0000000000b6a00 <_do_fork>: > unsigned long stack_start, > unsigned long stack_size, > int __user *parent_tidptr, > int __user *child_tidptr, > unsigned long tls) > { > c0000000000b6a00: f7 00 4c 3c addis r2,r12,247 > c0000000000b6a04: 00 86 42 38 addi r2,r2,-31232 > c0000000000b6a08: a6 02 08 7c mflr r0 > c0000000000b6a0c: d0 ff 41 fb std r26,-48(r1) > c0000000000b6a10: 26 80 90 7d mfocrf r12,8 > ...<snip>... > if (!(clone_flags & CLONE_UNTRACED)) { > c0000000000b6a54: e3 4f c7 7b rldicl. r7,r30,41,63 > c0000000000b6a58: 2c 00 82 40 bne c0000000000b6a84 ><_do_fork+0x84> > if (clone_flags & CLONE_VFORK) > c0000000000b6a5c: e3 97 c8 7b rldicl. r8,r30,50,63 > c0000000000b6a60: a0 01 82 41 beq c0000000000b6c00 ><_do_fork+0x200> > c0000000000b6a64: 20 00 20 39 li r9,32 > trace = PTRACE_EVENT_VFORK; > c0000000000b6a68: 02 00 80 3b li r28,2 > c0000000000b6a6c: 10 02 4d e9 ld r10,528(r13) > > If the user wants to probe at _do_fork+0x54, he'd do: > # perf probe _do_fork+0x54 > > With the earlier approach, we would insert the probe at _do_fork+0x5c > (0x54 from the LEP) instead, which is incorrect. > > In reality, user would probably just use debuginfo: > # perf probe -L _do_fork > <_do_fork@/root/linus/kernel/fork.c:0> > 0 long _do_fork(unsigned long clone_flags, > unsigned long stack_start, > unsigned long stack_size, > int __user *parent_tidptr, > int __user *child_tidptr, > unsigned long tls) > 6 { > struct task_struct *p; > 8 int trace = 0; > long nr; > > /* > * Determine whether and which event to report to >ptracer. When > * called from kernel_thread or CLONE_UNTRACED is >explicitly > * requested, no event is reported; otherwise, report >if the event > * for the type of forking is enabled. > */ > 17 if (!(clone_flags & CLONE_UNTRACED)) { > 18 if (clone_flags & CLONE_VFORK) > 19 trace = PTRACE_EVENT_VFORK; > 20 else if ((clone_flags & CSIGNAL) != SIGCHLD) > 21 trace = PTRACE_EVENT_CLONE; > > # perf probe _do_fork:17 > > In this case, perf chooses the right address based on DWARF. The current > patchset matches the behavior of perf without debuginfo with this.
I agree what I worry is that perf probe _do_fork sets a breakpoint after perf probe _do_fork+0x4. I am not sure if there is an easy solution to the problem. Balbir _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev