On 12/03/2014 08:16 PM, William Cohen wrote:
> On 12/03/2014 05:54 PM, David Long wrote:
>> On 12/03/14 09:54, William Cohen wrote:
>>> On 12/01/2014 04:37 AM, Masami Hiramatsu wrote:
>>>> (2014/11/29 1:01), Steve Capper wrote:
>>>>> On 27 November 2014 at 06:07, Masami Hiramatsu
>>>>> <masami.hiramatsu...@hitachi.com> wrote:
>>>>>> (2014/11/27 3:59), Steve Capper wrote:
>>>>>>> The crash is extremely easy to reproduce.
>>>>>>>
>>>>>>> I've not observed any missed events on a kprobe on an arm64 system
>>>>>>> that's still alive.
>>>>>>> My (limited!) understanding is that this suggests there could be a
>>>>>>> problem with how missed events from a recursive call to memcpy are
>>>>>>> being handled.
>>>>>>
>>>>>> I think so too. BTW, could you bisect that? :)
>>>>>>
>>>>>
>>>>> I can't bisect, but the following functions look suspicious to me
>>>>> (again I'm new to kprobes...):
>>>>> kprobes_save_local_irqflag
>>>>> kprobes_restore_local_irqflag
>>>>>
>>>>> I think these are breaking somehow when nested (i.e. from a recursive 
>>>>> probe).
>>>>
>>>> Agreed. On x86, prev_kprobe has old_flags and saved_flags; here it must
>>>> at least have saved_irqflag, which should be saved and restored in
>>>> save/restore_previous_kprobe().
>>>>
>>>> What about adding this?
>>>>
>>>>   struct prev_kprobe {
>>>>       struct kprobe *kp;
>>>>       unsigned int status;
>>>> +    unsigned long saved_irqflag;
>>>>   };
>>>>
>>>> and
>>>>
>>>>   static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
>>>>   {
>>>>       kcb->prev_kprobe.kp = kprobe_running();
>>>>       kcb->prev_kprobe.status = kcb->kprobe_status;
>>>> +    kcb->prev_kprobe.saved_irqflag = kcb->saved_irqflag;
>>>>   }
>>>>
>>>>   static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
>>>>   {
>>>>       __this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
>>>>       kcb->kprobe_status = kcb->prev_kprobe.status;
>>>> +    kcb->saved_irqflag = kcb->prev_kprobe.saved_irqflag;
>>>>   }
>>>>
>>>>
>>>
>>> I have noticed that with the aarch64 kprobe patches and a recent kernel I
>>> can get the machine stuck, printing out an endless stream of
>>>
>>> [187694.855843] Unexpected kernel single-step exception at EL1
>>> [187694.861385] Unexpected kernel single-step exception at EL1
>>> [187694.866926] Unexpected kernel single-step exception at EL1
>>> [187694.872467] Unexpected kernel single-step exception at EL1
>>> [187694.878009] Unexpected kernel single-step exception at EL1
>>> [187694.883550] Unexpected kernel single-step exception at EL1
>>>
>>> I can reproduce this pretty easily on my machine with functioncallcount.stp 
>>> from 
>>> https://sourceware.org/systemtap/examples/profiling/functioncallcount.stp 
>>> and the following steps:
>>>
>>> # stap -p4 -k -m mm_probes -w functioncallcount.stp "*@mm/*.c" -c "sleep 1"
>>> # staprun mm_probes.ko -c "sleep 1"
>>>
>>> -Will
>>
>> I did a fresh checkout and build of systemtap and tried the above.  I'm not 
>> yet seeing this problem.  It does remind me of the problem we saw before 
>> debug exception handling in entry.S was patched in v3.18-rc1, but you say 
>> you are using recent kernel sources.
>>
> 
> Hi Dave,
> 
> I saw this problem with a 3.18.0-rc5 based kernel.  Today I built a kernel 
> based on 3.18.0-0.rc6.git0.1.x1 with the patches and I didn't see the 
> unexpected kernel single-step exception.  Perhaps some problem function was 
> being probed in the 3.18.0-rc5 kernel but not in the 3.18.0-rc6 kernel, or 
> perhaps the config differs between the two kernels.  It seemed wiser to 
> mention it.
> 

I saw this problem with the 3.18.0-rc6 kernel today. Note that this kernel did 
not have the saved_irqflag patch Masami suggested above.  The problem is 
intermittent and does not occur every time.  The systemtap test that triggers 
it installs a lot of probe points, which could be exposing a problem with 
nested kprobes.
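
For reference, the suspected nested-probe hazard can be modelled in user 
space.  This is only a sketch, not kernel code: the struct and function names 
mirror the ones quoted above, while nested_hit(), the with_fix parameter, the 
global stand-in for the per-CPU current_kprobe, and the status and flag values 
are all made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; these are NOT the real kernel definitions. */
struct kprobe { const char *name; };

struct prev_kprobe {
    struct kprobe *kp;
    unsigned int status;
    unsigned long saved_irqflag;   /* field proposed in the thread */
};

struct kprobe_ctlblk {
    unsigned int kprobe_status;
    unsigned long saved_irqflag;
    struct prev_kprobe prev_kprobe;
};

/* Hypothetical stand-in for the per-CPU current_kprobe variable. */
static struct kprobe *current_kprobe;

static void save_previous_kprobe(struct kprobe_ctlblk *kcb, bool with_fix)
{
    kcb->prev_kprobe.kp = current_kprobe;
    kcb->prev_kprobe.status = kcb->kprobe_status;
    if (with_fix)   /* the "+" line from the suggested patch */
        kcb->prev_kprobe.saved_irqflag = kcb->saved_irqflag;
}

static void restore_previous_kprobe(struct kprobe_ctlblk *kcb, bool with_fix)
{
    current_kprobe = kcb->prev_kprobe.kp;
    kcb->kprobe_status = kcb->prev_kprobe.status;
    if (with_fix)   /* the "+" line from the suggested patch */
        kcb->saved_irqflag = kcb->prev_kprobe.saved_irqflag;
}

/*
 * Model an outer probe that has stashed outer_flags, interrupted by a
 * nested probe that stashes nested_flags in the same control block.
 * Returns the value the outer probe would see in saved_irqflag once
 * the nested probe has finished.
 */
static unsigned long nested_hit(unsigned long outer_flags,
                                unsigned long nested_flags, bool with_fix)
{
    struct kprobe outer = { "outer" }, inner = { "inner" };
    struct kprobe_ctlblk kcb = { 0 };

    current_kprobe = &outer;
    kcb.kprobe_status = 1;               /* arbitrary "outer active" value */
    kcb.saved_irqflag = outer_flags;     /* outer probe saves its flags */

    save_previous_kprobe(&kcb, with_fix);
    current_kprobe = &inner;
    kcb.kprobe_status = 2;               /* arbitrary "nested active" value */
    kcb.saved_irqflag = nested_flags;    /* nested probe overwrites the slot */
    restore_previous_kprobe(&kcb, with_fix);

    return kcb.saved_irqflag;
}
```

In this model, nested_hit(0x80, 0x0, false) hands the outer probe the nested 
probe's flags, whereas nested_hit(0x80, 0x0, true) gives it back its own.  
Whether this is what actually leads to the single-step exception loop is 
still unconfirmed.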

-Will