On Wed, 28 Nov 2007 13:28:48 +0100, Arnd Bergmann wrote:
> On Wednesday 28 November 2007, Jan Kratochvil wrote:
> > Please be aware DABR works fine if the same code runs just 1 (always) or
> > 2 (sometimes) threads.  It starts failing with too many threads running:
> > 
> > $ ./dabr-lost
> > TID 32725: DABR 0x1001279f NIP 0xfecf41c
> > TID 32726: DABR 0x1001279f NIP 0xfecf41c
> > TID 32725: hitting the variable
> > variable found = -1, caught TID = 32725
> > TID 32726: hitting the variable
> > variable found = -1, caught TID = 32726
> > The kernel bug did not get reproduced - increase THREADS.
> > 
> > As I did not find any code in that kernel touching DABRX its value should 
> > not
> > be dependent on the number of threads running.
> > 
> 
> Right, this is a different problem from the one reported by Uli.
> From what I can tell, your problem is that you set the DABR only
> in one thread, so the other threads don't see it. DABR is saved
> in the thread_struct, so setting it in one thread doesn't have
> an impact on any other thread.

It even prints out above:
        TID 32725: DABR 0x1001279f NIP 0xfecf41c
        TID 32726: DABR 0x1001279f NIP 0xfecf41c

that it wrote DABR in both the threads and it has also successfully read it
back from each thread specifically (according to its thread-specific TID).

for (threadi = 0; threadi < THREADS; threadi++)
    {
      pid_t tid = thread[threadi];

      setup (tid);
...
    }
static void setup (pid_t tid)
{
...
  l = ptrace (PTRACE_SET_DEBUGREG, tid, NULL, (void *) dabr);
...
}

Also if I would not set DABR specifically for each thread it would not work in
90% of cases for `THREADS == 2'.  And it would not work for `THREADS == 4' if
they are busylooping (therefore not in a syscall).
        TID 596: DABR 0x100127a7 NIP 0x10000dbc
        TID 597: DABR 0x100127a7 NIP 0x10000db0
        TID 598: DABR 0x100127a7 NIP 0x10000dac
        TID 599: DABR 0x100127a7 NIP 0x10000dbc
        TID 596: hitting the variable
        variable found = -1, caught TID = 596
        TID 599: hitting the variable
        variable found = -1, caught TID = 599
        TID 597: hitting the variable
        variable found = -1, caught TID = 597
        TID 598: hitting the variable
        variable found = -1, caught TID = 598
        The kernel bug got workarounded by WORKAROUND_SET_DABR_IN_SYSCALL.

(I found out now WORKAROUND_SET_DABR_IN_SYSCALL only reduces the probability of
the failure, it is not a 100% workaround of the problem in the testcase.)


There is some tricky kernel code around it but I did not try to debug it:

struct task_struct *__switch_to(struct task_struct *prev,
        struct task_struct *new)
{
...
        if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr)) {
                set_dabr(new->thread.dabr);
                __get_cpu_var(current_dabr) = new->thread.dabr;
        }
...
}



Regards,
Jan
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Reply via email to