On 12/01/2014 04:37 AM, Masami Hiramatsu wrote:
> (2014/11/29 1:01), Steve Capper wrote:
>> On 27 November 2014 at 06:07, Masami Hiramatsu
>> <masami.hiramatsu...@hitachi.com> wrote:
>>> (2014/11/27 3:59), Steve Capper wrote:
>>>> The crash is extremely easy to reproduce.
>>>>
>>>> I've not observed any missed events on a kprobe on an arm64 system
>>>> that's still alive.
>>>> My (limited!) understanding is that this suggests there could be a
>>>> problem with how missed events from a recursive call to memcpy are
>>>> being handled.
>>>
>>> I think so too. BTW, could you bisect that? :)
>>>
>>
>> I can't bisect, but the following functions look suspicious to me
>> (again I'm new to kprobes...):
>> kprobes_save_local_irqflag
>> kprobes_restore_local_irqflag
>>
>> I think these are breaking somehow when nested (i.e. from a recursive probe).
>
> Agreed. On x86, prev_kprobe has old_flags and saved_flags, this
> at least must have saved_irqflag and save/restore it in
> save/restore_previous_kprobe().
>
> What about adding this?
>
> struct prev_kprobe {
>         struct kprobe *kp;
>         unsigned int status;
> +       unsigned long saved_irqflag;
> };
>
> and
>
> static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
> {
>         kcb->prev_kprobe.kp = kprobe_running();
>         kcb->prev_kprobe.status = kcb->kprobe_status;
> +       kcb->prev_kprobe.saved_irqflag = kcb->saved_irqflag;
> }
>
> static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
> {
>         __this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
>         kcb->kprobe_status = kcb->prev_kprobe.status;
> +       kcb->saved_irqflag = kcb->prev_kprobe.saved_irqflag;
> }
>
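
A minimal user-space sketch of the nesting problem discussed above, not the
actual arm64 kprobes code. It assumes the irqflag helpers simply stash the
interrupted PSTATE in the per-CPU control block and mask IRQs, which may not
match the patches exactly. The helper names, the reduced structs and
irqs_left_masked() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define PSR_I_BIT 0x80UL	/* IRQ mask bit in the modelled PSTATE */

/* Reduced stand-ins for the kernel structures (assumption: only the
 * fields needed to show the nesting behaviour are kept). */
struct prev_kprobe {
	unsigned int status;
	unsigned long saved_irqflag;	/* the field proposed in the thread */
};

struct kprobe_ctlblk {
	unsigned int kprobe_status;
	unsigned long saved_irqflag;
	struct prev_kprobe prev_kprobe;
};

/* Model of kprobes_save_local_irqflag(): remember PSTATE, mask IRQs. */
static void save_local_irqflag(struct kprobe_ctlblk *kcb, unsigned long *pstate)
{
	kcb->saved_irqflag = *pstate;
	*pstate |= PSR_I_BIT;
}

/* Model of kprobes_restore_local_irqflag(): put the I bit back. */
static void restore_local_irqflag(struct kprobe_ctlblk *kcb, unsigned long *pstate)
{
	if (kcb->saved_irqflag & PSR_I_BIT)
		*pstate |= PSR_I_BIT;
	else
		*pstate &= ~PSR_I_BIT;
}

/* Run an outer probe hit with IRQs enabled, interrupted by a nested
 * probe hit. Returns true if the outer context ends up with IRQs
 * still masked, i.e. the state was corrupted by the nesting. */
bool irqs_left_masked(bool with_fix)
{
	struct kprobe_ctlblk kcb = {0};
	unsigned long outer = 0;	/* outer context: IRQs enabled */
	unsigned long nested;

	save_local_irqflag(&kcb, &outer);

	/* The nested probe fires while the outer one is being handled,
	 * so it sees IRQs already masked. */
	nested = outer;
	assert(nested & PSR_I_BIT);

	/* save_previous_kprobe() */
	kcb.prev_kprobe.status = kcb.kprobe_status;
	if (with_fix)
		kcb.prev_kprobe.saved_irqflag = kcb.saved_irqflag;

	save_local_irqflag(&kcb, &nested);	/* overwrites kcb.saved_irqflag */
	restore_local_irqflag(&kcb, &nested);

	/* restore_previous_kprobe() */
	kcb.kprobe_status = kcb.prev_kprobe.status;
	if (with_fix)
		kcb.saved_irqflag = kcb.prev_kprobe.saved_irqflag;

	restore_local_irqflag(&kcb, &outer);
	return (outer & PSR_I_BIT) != 0;
}
```

In this model irqs_left_masked(false) reports that the outer context returns
with IRQs masked, because the nested hit overwrote the single saved_irqflag
slot. irqs_left_masked(true) reports the I bit cleared again. That would be
consistent with the BUG_ON(irqs_disabled()) in the crash report, though the
model says nothing about the single-step exceptions below.
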
I have noticed that with the aarch64 kprobe patches and a recent kernel I can
get the machine stuck, printing endless lines of:

[187694.855843] Unexpected kernel single-step exception at EL1
[187694.861385] Unexpected kernel single-step exception at EL1
[187694.866926] Unexpected kernel single-step exception at EL1
[187694.872467] Unexpected kernel single-step exception at EL1
[187694.878009] Unexpected kernel single-step exception at EL1
[187694.883550] Unexpected kernel single-step exception at EL1

I can reproduce this pretty easily on my machine with functioncallcount.stp
from https://sourceware.org/systemtap/examples/profiling/functioncallcount.stp
and the following steps:

# stap -p4 -k -m mm_probes -w functioncallcount.stp "*@mm/*.c" -c "sleep 1"
# staprun mm_probes.ko -c "sleep 1"

-Will
>
>
>> That would explain why the state of play of the interrupts is in an
>> unexpected state in the crash I reported:
>> "The point of failure in the panic was:
>> fs/buffer.c:1257
>>
>> static inline void check_irqs_on(void)
>> {
>> #ifdef irqs_disabled
>>         BUG_ON(irqs_disabled());
>> #endif
>> }
>> "
>>
>> This is all new to me so I'm still at the head-scratching stage.
>
> Ah, I see.
>
> Thank you,
>
>>
>> David,
>> Does the above make sense to you? Have you managed to reproduce the crash I
>> get?
>>
>> Cheers,
>> --
>> Steve
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>>
>
>