On 19/08/16 13:03, Christophe Leroy wrote:
>
>
> On 17/08/2016 at 17:27, Holger Brunck wrote:
>> On 16/08/16 19:27, christophe leroy wrote:
>>>
>>>
>>> On 15/08/2016 at 18:19, Dave Hansen wrote:
>>>> On 08/15/2016 07:35 AM, Holger Brunck wrote:
>>>>> I tried this but unfortunately the error only occurs while remote
>>>>> debugging. Locally with gdb everything works fine. BTW we
>>>>> double-checked with a 85xx ppc target which is also 32-bit and it
>>>>> ends up with the same behaviour.
>>>>>
>>>>> I was also investigating where I have to move the line in the
>>>>> struct task_struct and it turns out to be like this (diff to 4.7
>>>>> kernel):
>>>>>
>>>>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>>>>> index 253538f..4868874 100644
>>>>> --- a/include/linux/sched.h
>>>>> +++ b/include/linux/sched.h
>>>>> @@ -1655,7 +1655,9 @@ struct task_struct {
>>>>>      struct signal_struct *signal;
>>>>>      struct sighand_struct *sighand;
>>>>>
>>>>> +    // struct thread_struct thread; // until here everything is fine
>>>>>      sigset_t blocked, real_blocked;
>>>>> +    struct thread_struct thread; // from here it's broken
>>>>>      sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
>>>>>      struct sigpending pending;
>>>>
>>>> Wow, thanks for all the debugging here!
>>>>
>>>> So, we know it has to do with signals, thread_info, and probably only
>>>> affects 32-bit powerpc. Seems awfully weird. Have you checked with any
>>>> of the 64-bit powerpc guys to see if they have any ideas?
>>>>
>>>> I went grepping around for a bit.
>>>>
>>>> Where is the task_struct stored? Is it on-stack on ppc32 or something?
>>>> The thread_info is, I assume, but I see some THREAD_INFO vs.
>>>> THREAD (thread struct) math happening in here, which confuses me:
>>>>
>>>>     .globl ret_from_debug_exc
>>>> ret_from_debug_exc:
>>>>     mfspr r9,SPRN_SPRG_THREAD
>>>>     lwz r10,SAVED_KSP_LIMIT(r1)
>>>>     stw r10,KSP_LIMIT(r9)
>>>>     lwz r9,THREAD_INFO-THREAD(r9)
>>>>     CURRENT_THREAD_INFO(r10, r1)
>>>>     lwz r10,TI_PREEMPT(r10)
>>>>     stw r10,TI_PREEMPT(r9)
>>>>     RESTORE_xSRR(SRR0,SRR1);
>>>>     RESTORE_xSRR(CSRR0,CSRR1);
>>>>     RESTORE_MMU_REGS;
>>>>     RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
>>>>
>>>> But, I'm really at a loss to explain this. It still seems like a deeply
>>>> ppc-specific issue. We can obviously work around it with an #ifdef for
>>>> your platform, but that's awfully hackish and hides the real bug,
>>>> whatever it is.
>>>>
>>>> My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
>>>> I don't see any references to 'blocked' or 'real_blocked' in assembly
>>>> though. You could add a bunch of padding instead of moving the
>>>> thread_struct and see if that does anything, but that's really a stab in
>>>> the dark.
>>>>
>>>
>>> Just to let you know, I'm not sure it is the same issue, but I also get
>>> my 8xx target stuck when I try to use gdbserver.
>>>
>>> If I debug a very small app, it gets stuck quickly after the app has
>>> stopped: indeed, the console seems ok but as soon as I try to execute
>>> something simple, like a ps or top, it gets stuck. The target still
>>> responds to pings, but nothing else.
>>>
>>> If I debug a big app, it gets stuck soon after the start of debug: I set
>>> a breakpoint at main(), do a 'continue', break at main(), do some
>>> steps with 'next' then it gets stuck.
>>>
>>> I have tried moving the struct thread_struct thread but it has no impact.
>>>
>>
>> That sounds a bit different to what I see. Is your program also
>> multi-threaded?
>>
>> Maybe you could try with the program I use to reproduce the error:
>>
>> --- snip -----
>> #include <pthread.h>
>> #include <stdio.h>
>> #include <unistd.h>
>>
>> void * th_1_func()
>> {
>>     while (1) {
>>         sleep(2);
>>         printf("Hello from thread function 1)\n");
>>     }
>> }
>>
>> int main() {
>>     int err;
>>     pthread_t th_1, th_2, th_3;
>>
>>     err = pthread_create(&th_1, NULL, th_1_func, NULL);
>>     if (err != 0)
>>         printf("pthread_create\n");
>>     err = pthread_create(&th_2, NULL, th_1_func, NULL);
>>     if (err != 0)
>>         printf("pthread_create\n");
>>     err = pthread_create(&th_3, NULL, th_1_func, NULL);
>>     if (err != 0)
>>         printf("pthread_create\n");
>>     while(1) {}
>>     return 0;
>> }
>> --- snap ---
>>
>> Then copy it to your target and start it with the gdbserver. If you let
>> it run from your host with gdb and try to stop it, e.g. in the sleep
>> call, and then try to single step it you might see the error. But as I
>> said in this thread the behaviour might be different depending on your
>> kernel configuration as I encountered different behaviour when enabling
>> FTRACE or SCHED_STAT.
>>
>> Best regards
>> Holger
>>
>
> Hi
>
> I just tried it on an 885 and on an 8323, it works properly on both targets.
>
> You can see below the debug options that are active on my 8323 target.
>

Thanks for trying it. Could you completely disable FTRACE? It also works on
my side when I have FTRACE enabled.

Best regards
Holger