On 2017/05/04 04:03PM, Michael Ellerman wrote: > "Naveen N. Rao" <naveen.n....@linux.vnet.ibm.com> writes: > > > On 2017/04/27 08:19PM, Michael Ellerman wrote: > >> "Naveen N. Rao" <naveen.n....@linux.vnet.ibm.com> writes: > >> > >> > It is actually safe to probe system_call() in entry_64.S, but only till > >> > .Lsyscall_exit. To allow this, convert .Lsyscall_exit to a non-local > >> > symbol __system_call() and blacklist that symbol, rather than > >> > system_call(). > >> > >> I'm not sure I like this. The reason we made it a local symbol in the > >> first place is because it made backtraces look odd: > >> > >> commit 4c3b21686111e0ac6018469dacbc5549f9915cf8 > >> Author: Michael Ellerman <m...@ellerman.id.au> > >> AuthorDate: Fri Dec 5 21:16:59 2014 +1100 > >> > >> powerpc/kernel: Make syscall_exit a local label > >> > >> Currently when we back trace something that is in a syscall we see > >> something like this: > >> > >> [c000000000000000] [c000000000000000] SyS_read+0x6c/0x110 > >> [c000000000000000] [c000000000000000] syscall_exit+0x0/0x98 > >> > >> Although it's entirely correct, seeing syscall_exit at the bottom can be > >> confusing - we were exiting from a syscall and then called SyS_read() ? > >> > >> If we instead change syscall_exit to be a local label we get something > >> more intuitive: > >> > >> [c0000001fa46fde0] [c00000000026719c] SyS_read+0x6c/0x110 > >> [c0000001fa46fe30] [c000000000009264] system_call+0x38/0xd0 > >> > >> ie. we were handling a system call, and it was SyS_read(). > >> > >> > >> I think you know that, although you didn't mention it in the change log, > >> because you've called the new symbol __system_call. But that is not a > >> great name either because that's not what it does. > > > > Yes, you're right. I used __system_call since I felt that it won't cause > > confusion like syscall_exit did. I agree it's not a great name, but we > > need _some_ label other than system_call if we want to allow probing at > > this point. > > > > Also, if I'm reading this right, there is no other place to probe if we > > want to capture all system call entries. > > > > So, I felt this would be good to have. > > > >> > >> > diff --git a/arch/powerpc/kernel/entry_64.S > >> > b/arch/powerpc/kernel/entry_64.S > >> > index 380361c0bb6a..e030ce34dd66 100644 > >> > --- a/arch/powerpc/kernel/entry_64.S > >> > +++ b/arch/powerpc/kernel/entry_64.S > >> > @@ -176,7 +176,7 @@ system_call: /* label this so stack > >> > traces look sane */ > >> > mtctr r12 > >> > bctrl /* Call handler */ > >> > > >> > -.Lsyscall_exit: > >> > +__system_call: > >> > std r3,RESULT(r1) > >> > CURRENT_THREAD_INFO(r12, r1) > >> > >> Why can't we kprobe the std and the rotate to current thread info? > >> > >> Is the real no-probe point just here, prior to the clearing of MSR_RI ? > >> > >> ld r8,_MSR(r1) > >> #ifdef CONFIG_PPC_BOOK3S > >> /* No MSR:RI on BookE */ > > > > We can probe at all those places, just not once MSR_RI is unset. So, the > > no-probe point is just *after* the mtmsrd. > > > > However, for kprobe blacklisting, the granularity is at a function level > > (or ASM labels). As such, we will have to blacklist all of > > syscall_exit/__system_call. > > Would this work? > > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S > index 767ef6d68c9e..8d0fa4a2262a 100644 > --- a/arch/powerpc/kernel/entry_64.S > +++ b/arch/powerpc/kernel/entry_64.S > @@ -207,6 +207,7 @@ system_call: /* label this so stack > traces look sane */ > mtmsrd r11,1 > #endif /* CONFIG_PPC_BOOK3E */ > > +syscall_exit: > ld r9,TI_FLAGS(r12) > li r11,-MAX_ERRNO > andi. > r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
Ah, nice. I previously incorrectly assumed that syscall_exit was not desirable throughout this function. Your earlier patch was only about what label showed up while _inside_ a syscall. So, syscall_exit post handling of a syscall is fine. This patch looks fine to me. I will test with this change and get back. Thanks, Naveen