On 02/26, Jiri Slaby wrote:
>
> On 10. 01. 19, 18:52, Andrei Vagin wrote:
> > --- a/kernel/exit.c
> > +++ b/kernel/exit.c
> > @@ -558,12 +558,14 @@ static struct task_struct *find_alive_thread(struct task_struct *p)
> >     return NULL;
> >  }
> >
> > -static struct task_struct *find_child_reaper(struct task_struct *father)
> > +static struct task_struct *find_child_reaper(struct task_struct *father,
> > +                                           struct list_head *dead)
> >     __releases(&tasklist_lock)
> >     __acquires(&tasklist_lock)
> >  {
> >     struct pid_namespace *pid_ns = task_active_pid_ns(father);
> >     struct task_struct *reaper = pid_ns->child_reaper;
> > +   struct task_struct *p, *n;
> >
> >     if (likely(reaper != father))
> >             return reaper;
> > @@ -579,6 +581,12 @@ static struct task_struct *find_child_reaper(struct task_struct *father)
> >             panic("Attempted to kill init! exitcode=0x%08x\n",
> >                     father->signal->group_exit_code ?: father->exit_code);
> >     }
> > +
> > +   list_for_each_entry_safe(p, n, dead, ptrace_entry) {
> > +           list_del_init(&p->ptrace_entry);
> > +           release_task(p);
> > +   }
> > +
>
> Hi,
>
> from our (SUSE) QA we received a report that this patch causes a
> performance decline in libmicro pthread_* benchmark as reported in:
> https://bugzilla.suse.com/show_bug.cgi?id=1126762

Access Denied

> I tried myself from the repo:
> https://github.com/redhat-performance/libMicro
>
> I ran
> pthread_create -B 8 -C 200 -S
>
> and with the patch applied:
> # STATISTICS       usecs/call (raw)          usecs/call (outliers removed)
> #                   mean     41.36539                39.21347
>
> Without:
> #                   mean     23.38611                17.29311

can't reproduce, I see the same numbers with or without this patch.
However, I did "./bin/pthread_create -B 8 -C 200 -S" under KVM.

> The benchmark seems to create 8 (-B above) pthreads, does lock/unlock in
> them and then the threads exit. The benchmark reaps the threads via
> pthread_join. This all happens 200 times (-C above).

Given that this test-case doesn't use CLONE_NEWPID, I fail to understand how
this patch can make any noticeable difference performance-wise...

with this patch forget_original_parent() just passes the additional argument
to find_child_reaper(), nothing else.

The extra list_for_each_entry_safe/release_task loop can't happen here
(find_child_reaper() returns early unless the exiting task is the reaper
itself), and even if it could it shouldn't cause any performance regression
either.

> Any idea how to restore the performance close to the previous state?

maybe you can try perf to find out where this difference comes from?

Oleg.
