Re: debug problems on ppc 83xx target due to changed struct task_struct

Holger Brunck Wed, 17 Aug 2016 08:00:39 -0700

On 15/08/16 18:19, Dave Hansen wrote:
> On 08/15/2016 07:35 AM, Holger Brunck wrote:
>> I tried this but unfortunately the error only occurs while remote debugging.
>> Locally with gdb everything works fine. BTW we double-checked with a 85xx ppc
>> target which is also 32-bit and it ends up with the same behaviour.
>>
>> I was also investigating where I have to move the line in the
>> struct task_struct and it turns out to be like this (diff to 4.7 kernel):
>>
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 253538f..4868874 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1655,7 +1655,9 @@ struct task_struct {
>>         struct signal_struct *signal;
>>         struct sighand_struct *sighand;
>>
>> +       // struct thread_struct thread;   // until here everything is fine
>>         sigset_t blocked, real_blocked;
>> +       struct thread_struct thread;      // from here it's broken
>>         sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
>>         struct sigpending pending;
> 
> Wow, thanks for all the debugging here!
> 
> So, we know it has to do with signals, thread_info, and probably only
> affects 32-bit powerpc.  Seems awfully weird.  Have you checked with any
> of the 64-bit powerpc guys to see if they have any ideas?
> 
> I went grepping around for a bit.
> 
> Where is the task_struct stored?  Is it on-stack on ppc32 or something?
>  The thread_info is, I assume, but I see some THREAD_INFO vs. THREAD
> (thread struct) math happening in here, which confuses me:
> 
>         .globl  ret_from_debug_exc
> ret_from_debug_exc:
>         mfspr   r9,SPRN_SPRG_THREAD
>         lwz     r10,SAVED_KSP_LIMIT(r1)
>         stw     r10,KSP_LIMIT(r9)
>         lwz     r9,THREAD_INFO-THREAD(r9)
>         CURRENT_THREAD_INFO(r10, r1)
>         lwz     r10,TI_PREEMPT(r10)
>         stw     r10,TI_PREEMPT(r9)
>         RESTORE_xSRR(SRR0,SRR1);
>         RESTORE_xSRR(CSRR0,CSRR1);
>         RESTORE_MMU_REGS;
>         RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)
>


Yes, but that code is in arch/powerpc/kernel/head_booke.h and, IIUC, my
architecture uses arch/powerpc/kernel/head_32.S.

Some small updates from my side: I was able to simplify the setup and no longer
need my (quite complex) application. I now have a small C program which starts
three threads that simply run a while loop and do printouts every now and then,
and I can still reproduce the error. So all we need is threads, a gdbserver
session, ppc32, and a single step while the threads are already running.
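
A minimal sketch of what such a reproducer could look like, assuming plain
pthreads. This is not the original test program: the names worker,
run_workers and struct worker_arg, the MAX_WORKERS limit and the delay are
all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define MAX_WORKERS 16 /* arbitrary limit for this sketch */

/* Hypothetical per-thread parameters (not from the original program). */
struct worker_arg {
    int id;                   /* thread index, used in the printout */
    unsigned long iterations; /* 0 means loop forever */
    long delay_ms;            /* pause between printouts, below 1000 */
};

static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    struct timespec delay = { 0, arg->delay_ms * 1000000L };

    for (unsigned long i = 0; arg->iterations == 0 || i < arg->iterations; i++) {
        printf("thread %d: iteration %lu\n", arg->id, i);
        fflush(stdout);
        nanosleep(&delay, NULL);
    }
    return NULL;
}

/* Start nthreads workers, wait for them, and return how many were joined. */
static int run_workers(int nthreads, unsigned long iterations, long delay_ms)
{
    pthread_t tid[MAX_WORKERS];
    struct worker_arg args[MAX_WORKERS];
    int started = 0, joined = 0;

    assert(nthreads > 0 && nthreads <= MAX_WORKERS);
    assert(delay_ms >= 0 && delay_ms < 1000);

    for (int i = 0; i < nthreads; i++) {
        args[i] = (struct worker_arg){ i, iterations, delay_ms };
        if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        if (pthread_join(tid[i], NULL) == 0)
            joined++;
    return joined;
}
```

Calling run_workers(3, 0, 500) from main() would give three threads that print
forever, which is the situation described above: attach gdbserver, let the
threads run, then single step.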

I added some debug prints to the common kernel signal code and to the
powerpc-specific code, and enabled the debug trace in gdbserver. What I see is
that gdbserver sends a SIGSTOP to the kernel for each thread and waits for a
response. The kernel receives all the signals but, in the error case, only
responds to some of them. This matches my "ps" output, where I see that some
threads are not in the state pthread_stop, and then gdbserver gets suspended.

I think the interesting part is the function handle_signal32 in
arch/powerpc/kernel/signal_32.c. For the threads that are stopped successfully,
this function is called once. If the kernel receives two SIGSTOPs before the
function is called, we end up in the error case. So my question is: does anyone
know whether this function should handle several pending signals at once, if
present, or whether it is called once per signal?
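
As background for that question, here is a small userspace sketch of one
generic POSIX property: a standard (non-realtime) signal that is raised
several times while blocked is only pending once. It says nothing about what
handle_signal32 itself does. SIGUSR1 is used as a stand-in because SIGSTOP
cannot be caught, and the names on_usr1 and count_deliveries are made up for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t handler_calls;

static void on_usr1(int sig)
{
    (void)sig;
    handler_calls++;
}

/*
 * Raise SIGUSR1 `times` times while it is blocked, then restore the old
 * mask. Returns how often the handler ran.
 */
static int count_deliveries(int times)
{
    struct sigaction sa = { 0 };
    sigset_t block, old;
    int rc;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    handler_calls = 0;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old);

    for (int i = 0; i < times; i++)
        raise(SIGUSR1); /* stays pending while blocked */

    /* Restoring the mask delivers whatever is pending. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return handler_calls;
}
```

With SIGUSR1 unblocked in the caller, count_deliveries(2) reports a single
handler run, so two identical standard signals sent to the same thread before
it handles them are seen as one.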

> But, I'm really at a loss to explain this.  It still seems like a deeply
> ppc-specific issue.  We can obviously work around it with an #ifdef for
> your platform, but that's awfully hackish and hides the real bug,
> whatever it is.
> 

What I could do is reuse the define CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and,
if it is not set, keep the thread_struct at its old position. I know this would
only mask the error and I guess it is not acceptable for mainline, but as my
time is limited I could live with such an OOT patch for my board.
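
A rough sketch of that idea, using stand-in types so it compiles outside the
kernel. The *_mock names and thread_precedes_blocked are invented for this
illustration, only a few of the real members are shown, and the "old position"
here is simply a spot before `blocked`, which is what the diff above found to
work. It is not a patch against sched.h.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types; the real ones live in the kernel headers. */
struct thread_struct_mock { unsigned long regs[8]; };
typedef struct { unsigned long sig[2]; } sigset_mock_t;

struct task_struct_mock {
    void *signal;
    void *sighand;
#ifndef CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT
    /* Option not set: keep thread before `blocked`. */
    struct thread_struct_mock thread;
#endif
    sigset_mock_t blocked, real_blocked;
    sigset_mock_t saved_sigmask;
#ifdef CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT
    /* Option set: thread is the last member of the struct. */
    struct thread_struct_mock thread;
#endif
};

#ifdef CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT
/* In this mock, nothing may follow `thread` when the option is set. */
static_assert(offsetof(struct task_struct_mock, thread) +
              sizeof(struct thread_struct_mock) ==
              sizeof(struct task_struct_mock),
              "thread must be the last member");
#endif

/* Returns 1 if `thread` is laid out before `blocked`, 0 otherwise. */
static int thread_precedes_blocked(void)
{
    return offsetof(struct task_struct_mock, thread) <
           offsetof(struct task_struct_mock, blocked);
}
```

Built without the define, thread_precedes_blocked() returns 1, which
corresponds to the layout that works on my board.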

Best regards
Holger
