Re: init's children list is long and slows reaping children.

2007-04-11 Thread Bill Davidsen
Eric W. Biederman wrote: Bill Davidsen <[EMAIL PROTECTED]> writes: As long as the original parent is preserved for getppid(). There are programs out there which communicate between the parent and child with signals, and if the original parent dies, it is undesirable to have the child getppid() and

Re: init's children list is long and slows reaping children.

2007-04-11 Thread Oleg Nesterov
On 04/11, Bill Davidsen wrote: > > Oleg Nesterov wrote: > >On 04/10, Eric W. Biederman wrote: > > > >>I'm trying to remember what the story is now. There is a nasty > >>race somewhere with reparenting, a threaded parent setting SIGCHLD to > >>SIG_IGN, and non-default signals that results in a zomb

Re: init's children list is long and slows reaping children.

2007-04-11 Thread Eric W. Biederman
Bill Davidsen <[EMAIL PROTECTED]> writes: > As long as the original parent is preserved for getppid(). There are programs > out there which communicate between the parent and child with signals, and if > the original parent dies, it is undesirable to have the child getppid() and start > sending signa
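The concern above can be illustrated from user space. The following is a minimal sketch, assuming the child saved the value of getppid() immediately after fork(); the helper names parent_is_original() and notify_original_parent() are hypothetical and do not come from the thread. The check is inherently racy (the parent may die between the comparison and the kill()), so it narrows the window rather than closing it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the process that forked us is still our
 * parent, 0 if we have been reparented. `original_parent` is the value the
 * child saved from getppid() right after fork(). */
int parent_is_original(pid_t original_parent)
{
    return getppid() == original_parent;
}

/* Hypothetical helper: send `sig` to the parent only if it is still the
 * original one. Returns 0 on success, -1 if the parent changed or kill()
 * failed. A window remains between the check and the kill(). */
int notify_original_parent(pid_t original_parent, int sig)
{
    assert(sig >= 0);
    if (!parent_is_original(original_parent))
        return -1;              /* reparented: do not signal a stranger */
    return kill(original_parent, sig);
}
```

A child would typically call notify_original_parent(saved_ppid, SIGUSR1) wherever it previously called kill(getppid(), SIGUSR1).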

Re: init's children list is long and slows reaping children.

2007-04-11 Thread Bill Davidsen
Oleg Nesterov wrote: On 04/10, Eric W. Biederman wrote: I'm trying to remember what the story is now. There is a nasty race somewhere with reparenting, a threaded parent setting SIGCHLD to SIG_IGN, and non-default signals that results in a zombie that no one can wait for and reap. It requires

Re: init's children list is long and slows reaping children.

2007-04-11 Thread Nick Piggin
Jeff Garzik wrote: Linus Torvalds wrote: On Fri, 6 Apr 2007, Jeff Garzik wrote: I would rather change the implementation under the hood to start per-CPU threads on demand, similar to a thread-pool implementation. Boxes with $BigNum CPUs probably won't ever use half of those threads. The

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Oeser
On Tuesday 10 April 2007, Jeff Garzik wrote: > Thus, rather than forcing authors to make their code more complex, we > should find another solution. What about sth. like the "pre-forking" concept? So just have a thread creator thread, which checks the amount of unused threads and keeps them with

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Oeser
On Tuesday 10 April 2007, Jeff Garzik wrote: > That's why I feel thread creation -- cheap under Linux -- is quite > appropriate for many of these situations. Maybe that (thread creation) can be done at open(), socket-creation, service request, syscall or whatever event triggers a driver/subsyste

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Davide Libenzi
On Tue, 10 Apr 2007, Bill Davidsen wrote: > Davide Libenzi wrote: > > On Mon, 9 Apr 2007, Linus Torvalds wrote: > > > > > On Mon, 9 Apr 2007, Kyle Moffett wrote: > > > > Maybe "struct posix_process" is more descriptive? "struct > > > > process_posix"? > > > > "Ugly POSIX process semantics data"

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Matt Mackall
On Tue, Apr 10, 2007 at 03:05:56AM -0400, Jeff Garzik wrote: > Andrew Morton wrote: > >: root 3 0.0 0.0 0 0 ? S 18:51 0:00 > >[watchdog/0] > > > >That's the softlockup detector. Confusingly named to look like a, err, > >watchdog. Could probably use keventd. > > I

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Oleg Nesterov
On 04/10, Eric W. Biederman wrote: > I'm trying to remember what the story is now. There is a nasty > race somewhere with reparenting, a threaded parent setting SIGCHLD to > SIG_IGN, and non-default signals that results in a zombie that no one > can wait for and reap. It requires being reparente

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > * Eric W. Biederman <[EMAIL PROTECTED]> wrote: > >> > so ... is anyone pursuing this? This would allow us to make >> > sys_wait4() faster and more scalable: no tasklist_lock bouncing for >> > example. >> >> which part? > > all of it :) Everything you me

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Molnar
* Eric W. Biederman <[EMAIL PROTECTED]> wrote: > > so ... is anyone pursuing this? This would allow us to make > > sys_wait4() faster and more scalable: no tasklist_lock bouncing for > > example. > > which part? all of it :) Everything you mentioned makes sense quite a bit. The thread signal

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > * Eric W. Biederman <[EMAIL PROTECTED]> wrote: > >> > on a second thought: the p->children list is needed for the whole >> > child/parent task tree, which is needed for sys_getppid(). >> >> Yes, something Oleg said made me realize that. >> >> As long as

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Bill Davidsen
Davide Libenzi wrote: On Mon, 9 Apr 2007, Linus Torvalds wrote: On Mon, 9 Apr 2007, Kyle Moffett wrote: Maybe "struct posix_process" is more descriptive? "struct process_posix"? "Ugly POSIX process semantics data" seems simple enough to stick in a struct name. "struct uglyposix_process"? Gu

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Molnar
* Eric W. Biederman <[EMAIL PROTECTED]> wrote: > > on a second thought: the p->children list is needed for the whole > > child/parent task tree, which is needed for sys_getppid(). > > Yes, something Oleg said made me realize that. > > As long as the reparent isn't too complex it isn't required

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes: > No! That is why I suggest (long ago, in fact) to move ->children into > ->signal_struct. When sub-thread forks, we set ->parent = group_leader. > We don't need forget_original_parent() until the last thread exits. This > also simplifies do_wait(). > > Ho

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > * Eric W. Biederman <[EMAIL PROTECTED]> wrote: > >> Ingo Molnar <[EMAIL PROTECTED]> writes: >> >> > no. Two _completely separate_ lists. >> > >> > i.e. a to-be-reaped task will still be on the main list _too_. The >> > main list is for all the PID semant

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Oleg Nesterov
On 04/10, Ingo Molnar wrote: > > * Eric W. Biederman <[EMAIL PROTECTED]> wrote: > > > Ingo Molnar <[EMAIL PROTECTED]> writes: > > > > > no. Two _completely separate_ lists. > > > > > > i.e. a to-be-reaped task will still be on the main list _too_. The > > > main list is for all the PID semantic

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Molnar
* Eric W. Biederman <[EMAIL PROTECTED]> wrote: > Ingo Molnar <[EMAIL PROTECTED]> writes: > > > no. Two _completely separate_ lists. > > > > i.e. a to-be-reaped task will still be on the main list _too_. The > > main list is for all the PID semantics rules. The reap-list is just > > for wait4()

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Robin Holt
On Mon, Apr 09, 2007 at 06:48:54PM -0600, Eric W. Biederman wrote: > Andrew Morton <[EMAIL PROTECTED]> writes: > > > I suspect there are quite a few kernel threads which don't really need to > > be threads at all: the code would quite happily work if it was changed to > > use keventd, via schedule

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Jeff Garzik
Ingo Molnar wrote: * Russell King <[EMAIL PROTECTED]> wrote: One per PC card socket to avoid the sysfs locking crappyness that would otherwise deadlock, and to convert from the old unreadable state machine implementation to a much more readable linearly coded implementation. Could probably

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Ingo Molnar
* Russell King <[EMAIL PROTECTED]> wrote: > One per PC card socket to avoid the sysfs locking crappyness that > would otherwise deadlock, and to convert from the old unreadable state > machine implementation to a much more readable linearly coded > implementation. > > Could probably be elimin

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Jeff Garzik
Andrew Morton wrote: Well that obviously would be a dumb way to use keventd. One would need to do schedule_work(), kick off the reset then do schedule_delayed_work() to wait (or poll) for its termination. Far too complex. See what Russell wrote, for instance. When you are in a kernel thread,

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 04:33:57 -0400 Jeff Garzik <[EMAIL PROTECTED]> wrote: > Andrew Morton wrote: > > On Tue, 10 Apr 2007 03:05:56 -0400 Jeff Garzik <[EMAIL PROTECTED]> wrote: > > > >> My main > >> worry with keventd is that we might get stuck behind an unrelated > >> process for an undefined le

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Jeff Garzik
Andrew Morton wrote: On Tue, 10 Apr 2007 03:05:56 -0400 Jeff Garzik <[EMAIL PROTECTED]> wrote: My main worry with keventd is that we might get stuck behind an unrelated process for an undefined length of time. I don't think it has ever been demonstrated that keventd latency is excessive, or

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Jeff Garzik
Russell King wrote: Could probably be eliminated if we had some mechanism to spawn a helper thread to do some task as required which didn't block other helper threads until it completes. kthread_run() should do that for you. Creates a new thread with kthread_create(), and wakes it up immediat

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Russell King
On Mon, Apr 09, 2007 at 07:30:56PM -0700, Andrew Morton wrote: > : root 319 0.0 0.0 0 0 ? S 18:51 0:00 [pccardd] > > hm. One per PC card socket to avoid the sysfs locking crappyness that would otherwise deadlock, and to convert from the old unreadable state machine im

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Andrew Morton
On Tue, 10 Apr 2007 03:05:56 -0400 Jeff Garzik <[EMAIL PROTECTED]> wrote: > My main > worry with keventd is that we might get stuck behind an unrelated > process for an undefined length of time. I don't think it has ever been demonstrated that keventd latency is excessive, or a problem. I gues

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Jeff Garzik
Torsten Kaiser wrote: One thread per port, not per device. 796 ? S 0:00 \_ [scsi_eh_0] 797 ? S 0:00 \_ [scsi_eh_1] 798 ? S 0:00 \_ [scsi_eh_2] 819 ? S 0:00 \_ [scsi_eh_3] 820 ? S 0:00 \_ [scsi_eh_4] 824 ? S 0:00

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Jeff Garzik
Linus Torvalds wrote: On Mon, 9 Apr 2007, Andrew Morton wrote: 10 ? S< 0:00 [khelper] That one's needed to parent the call_usermodehelper() apps. I don't think it does anything else. We used to use keventd for this but that had some problem which I forget. I think it was one

Re: init's children list is long and slows reaping children.

2007-04-10 Thread Jeff Garzik
Andrew Morton wrote: : root 3 0.0 0.0 0 0 ? S 18:51 0:00 [watchdog/0] That's the softlockup detector. Confusingly named to look like a, err, watchdog. Could probably use keventd. I would think this would run into the keventd "problem", where $N processes can l

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Jeff Garzik
Eric W. Biederman wrote: At 10 kernel threads per cpu there may be a little bloat but it isn't out of control. It is mostly that we are observing the kernel as NR_CPUS approaches infinity. 4096 isn't infinity yet but it's easily a 1000 fold bigger than most people are used to :) I disagree t

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Torsten Kaiser
On 4/10/07, Andrew Morton <[EMAIL PROTECTED]> wrote: : root 299 0.0 0.0 0 0 ? S 18:51 0:00 [scsi_eh_0] : root 300 0.0 0.0 0 0 ? S 18:51 0:00 [scsi_eh_1] : root 305 0.0 0.0 0 0 ? S 18:51 0:00 [scsi_eh_2] : root

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Dave Jones
On Tue, Apr 10, 2007 at 09:07:54AM +0400, Alexey Dobriyan wrote: > On Mon, Apr 09, 2007 at 07:30:56PM -0700, Andrew Morton wrote: > > On Mon, 9 Apr 2007 21:59:12 -0400 Dave Jones <[EMAIL PROTECTED]> wrote: > > [possible topic for KS2007] > > > > 164 ? S< 0:00 [cqueue/0] > > >

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Alexey Dobriyan
On Mon, Apr 09, 2007 at 07:30:56PM -0700, Andrew Morton wrote: > On Mon, 9 Apr 2007 21:59:12 -0400 Dave Jones <[EMAIL PROTECTED]> wrote: [possible topic for KS2007] > > 164 ? S< 0:00 [cqueue/0] > > 165 ? S< 0:00 [cqueue/1] > > > > I'm not even sure wth these are. > > Me

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Linus Torvalds
On Mon, 9 Apr 2007, Andrew Morton wrote: > > >10 ? S< 0:00 [khelper] > > That one's needed to parent the call_usermodehelper() apps. I don't think > it does anything else. We used to use keventd for this but that had some > problem which I forget. I think it was one of a long

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Andrew Morton
On Mon, 9 Apr 2007 21:59:12 -0400 Dave Jones <[EMAIL PROTECTED]> wrote: > On Mon, Apr 09, 2007 at 05:23:39PM -0700, Andrew Morton wrote: > > > I suspect there are quite a few kernel threads which don't really need to > > be threads at all: the code would quite happily work if it was changed to

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Dave Jones
On Mon, Apr 09, 2007 at 05:23:39PM -0700, Andrew Morton wrote: > I suspect there are quite a few kernel threads which don't really need to > be threads at all: the code would quite happily work if it was changed to > use keventd, via schedule_work() and friends. But kernel threads are > somew

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Andrew Morton
On Mon, 09 Apr 2007 18:48:54 -0600 [EMAIL PROTECTED] (Eric W. Biederman) wrote: > Andrew Morton <[EMAIL PROTECTED]> writes: > > > I suspect there are quite a few kernel threads which don't really need to > > be threads at all: the code would quite happily work if it was changed to > > use keventd

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Eric W. Biederman
Andrew Morton <[EMAIL PROTECTED]> writes: > I suspect there are quite a few kernel threads which don't really need to > be threads at all: the code would quite happily work if it was changed to > use keventd, via schedule_work() and friends. But kernel threads are > somewhat easier to code for. >

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Andrew Morton
On Fri, 06 Apr 2007 18:38:40 -0400 Jeff Garzik <[EMAIL PROTECTED]> wrote: > Robin Holt wrote: > > We have been testing a new larger configuration and we are seeing a very > > large scan time of init's tsk->children list. In the cases we are seeing, > > there are numerous kernel processes created

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Davide Libenzi
On Mon, 9 Apr 2007, Linus Torvalds wrote: > On Mon, 9 Apr 2007, Kyle Moffett wrote: > > > > Maybe "struct posix_process" is more descriptive? "struct process_posix"? > > "Ugly POSIX process semantics data" seems simple enough to stick in a struct > > name. "struct uglyposix_process"? > > Guys,

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Eric W. Biederman
Kyle Moffett <[EMAIL PROTECTED]> writes: > Maybe "struct posix_process" is more descriptive? "struct process_posix"? > "Ugly POSIX process semantics data" seems simple enough to stick in a struct > name. "struct uglyposix_process"? Nack. Linux internally doesn't have processes it has tasks wi

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Linus Torvalds
On Mon, 9 Apr 2007, Kyle Moffett wrote: > > Maybe "struct posix_process" is more descriptive? "struct process_posix"? > "Ugly POSIX process semantics data" seems simple enough to stick in a struct > name. "struct uglyposix_process"? Guys, you didn't read my message. It's *not* about "process

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Kyle Moffett
On Apr 09, 2007, at 14:09:51, Bill Davidsen wrote: Ingo Molnar wrote: * Linus Torvalds <[EMAIL PROTECTED]> wrote: On Fri, 6 Apr 2007, Davide Libenzi wrote: or lets just face it and name it what it is: process_struct ;-) That'd be fine too! Wonder if Linus would swallow a rename patch like th

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Bill Davidsen
Ingo Molnar wrote: * Linus Torvalds <[EMAIL PROTECTED]> wrote: On Fri, 6 Apr 2007, Davide Libenzi wrote: or lets just face it and name it what it is: process_struct ;-) That'd be fine too! Wonder if Linus would swallow a rename patch like that... I don't really see the point. It's not even *t

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Chris Snook
Eric W. Biederman wrote: Linus Torvalds <[EMAIL PROTECTED]> writes: I'm not sure anybody would really be unhappy with pptr pointing to some magic and special task that has pid 0 (which makes it clear to everybody that the parent is something special), and that has SIGCHLD set to SIG_IGN (which

Re: init's children list is long and slows reaping children.

2007-04-09 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > On Fri, 6 Apr 2007, Davide Libenzi wrote: > > > > > > or lets just face it and name it what it is: process_struct ;-) > > > > That'd be fine too! Wonder if Linus would swallow a rename patch like > > that... > > I don't really see the point. It's

Re: init's children list is long and slows reaping children.

2007-04-08 Thread Oleg Nesterov
On 04/07, Eric W. Biederman wrote: > > Oleg Nesterov <[EMAIL PROTECTED]> writes: > > > On 04/06, Oleg Nesterov wrote: > >> > >> @@ -275,10 +275,7 @@ static void reparent_to_init(void) > >>remove_parent(current); > >>current->parent = child_reaper(current); > >>current->real_parent = chi

Re: init's children list is long and slows reaping children.

2007-04-07 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes: > On 04/06, Oleg Nesterov wrote: >> >> Perhaps, >> >> --- t/kernel/exit.c~ 2007-04-06 23:31:31.0 +0400 >> +++ t/kernel/exit.c 2007-04-06 23:31:57.0 +0400 >> @@ -275,10 +275,7 @@ static void reparent_to_init(void) >> remove_parent(cu

Re: init's children list is long and slows reaping children.

2007-04-07 Thread Oleg Nesterov
On 04/06, Oleg Nesterov wrote: > > Perhaps, > > --- t/kernel/exit.c~ 2007-04-06 23:31:31.0 +0400 > +++ t/kernel/exit.c 2007-04-06 23:31:57.0 +0400 > @@ -275,10 +275,7 @@ static void reparent_to_init(void) > remove_parent(current); > current->parent = child_reaper(cu

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Jeff Garzik
Linus Torvalds wrote: On Fri, 6 Apr 2007, Jeff Garzik wrote: I would rather change the implementation under the hood to start per-CPU threads on demand, similar to a thread-pool implementation. Boxes with $BigNum CPUs probably won't ever use half of those threads. The counter-argument is tha

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Davide Libenzi
On Fri, 6 Apr 2007, Linus Torvalds wrote: > > > On Fri, 6 Apr 2007, Davide Libenzi wrote: > > > On Fri, 6 Apr 2007, Linus Torvalds wrote: > > > > > > I don't really see the point. It's not even *true*. A "process" includes > > > more than the shared signal-handling - it would include files an

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Linus Torvalds
On Fri, 6 Apr 2007, Jeff Garzik wrote: > > I would rather change the implementation under the hood to start per-CPU > threads on demand, similar to a thread-pool implementation. > > Boxes with $BigNum CPUs probably won't ever use half of those threads. The counter-argument is that boxes with $

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Linus Torvalds
On Fri, 6 Apr 2007, Davide Libenzi wrote: > On Fri, 6 Apr 2007, Linus Torvalds wrote: > > > > I don't really see the point. It's not even *true*. A "process" includes > > more than the shared signal-handling - it would include files and fs etc > > too. > > > > So it's actually *more* correct

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Jeff Garzik
Robin Holt wrote: We have been testing a new larger configuration and we are seeing a very large scan time of init's tsk->children list. In the cases we are seeing, there are numerous kernel processes created for each cpu (ie: events/0 ... events/, xfslogd/0 ... xfslogd/). These are all on the

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Davide Libenzi
On Fri, 6 Apr 2007, Linus Torvalds wrote: > On Fri, 6 Apr 2007, Davide Libenzi wrote: > > > > > > or lets just face it and name it what it is: process_struct ;-) > > > > That'd be fine too! Wonder if Linus would swallow a rename patch like > > that... > > I don't really see the point. It's not

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Linus Torvalds
On Fri, 6 Apr 2007, Davide Libenzi wrote: > > > > or lets just face it and name it what it is: process_struct ;-) > > That'd be fine too! Wonder if Linus would swallow a rename patch like > that... I don't really see the point. It's not even *true*. A "process" includes more than the shared

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Davide Libenzi
On Fri, 6 Apr 2007, Ingo Molnar wrote: > > * Davide Libenzi wrote: > > > > > Ohhh, the "signal" struct! Funny name for something that nowadays > > > > has probably no more than a 5% affinity with signal-related tasks > > > > :/ > > > > > > Hmm. I wonder if we should just rename it the struc

Re: init's children list is long and slows reaping children.

2007-04-06 Thread H. Peter Anvin
Jeremy Fitzhardinge wrote: Eric W. Biederman wrote: I'm guessing the issue is nash just calls wait and doesn't check the returned pid value, assuming it is the only child it forked returning. Which is valid except when you are running as pid == 1. Hm, that's always a bug; a process can alwa

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Jeremy Fitzhardinge
Eric W. Biederman wrote: > I'm guessing the issue is nash just calls wait and doesn't check the > returned pid value, assuming it is the only child it forked returning. > Which is valid except when you are running as pid == 1. > Hm, that's always a bug; a process can always have children it doe
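The point about wait() can be made concrete with a small user-space sketch. This is not nash's code; wait_for_child() is a hypothetical helper written under the assumption that the caller knows the pid of the child it actually cares about. It keeps reaping until that pid shows up and discards any other child, which for pid 1 may be a reparented orphan or an exiting kernel thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: block until `target` has been reaped.
 * Any other child that exits in the meantime is reaped and ignored.
 * Returns 0 and stores the wait status in *status, or -1 on error
 * (for example ECHILD when there are no children left). */
int wait_for_child(pid_t target, int *status)
{
    assert(status != NULL);
    for (;;) {
        int st;
        pid_t pid = wait(&st);
        if (pid == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (pid == target) {
            *status = st;
            return 0;
        }
        /* Some other child exited; it is now reaped. Keep waiting. */
    }
}
```

Calling waitpid(target, &st, 0) would avoid the loop, but a process running as init generally still has to reap the other children at some point, which is why the sketch uses plain wait().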

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Ingo Molnar
* Oleg Nesterov <[EMAIL PROTECTED]> wrote: > Probably it is I who missed something :) > > But why can't we do both changes? I think it is just ugly to use init > to reap the kernel thread. Ok, wait4() can find zombie quickly if we > do the ->children split. But /sbin/init could be swapped out,

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Oleg Nesterov
On 04/06, Ingo Molnar wrote: > > * Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > > I'd almost prefer to just not add kernel threads to any parent > > > process list *at*all*. > > > > Yes sure, I didn't argue with that. However, "->exit_state = -1" does > > matter, we can't detach process unle

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes: > On 04/06, Oleg Nesterov wrote: >> >> --- t/kernel/exit.c~ 2007-04-06 23:31:31.0 +0400 >> +++ t/kernel/exit.c 2007-04-06 23:31:57.0 +0400 >> @@ -275,10 +275,7 @@ static void reparent_to_init(void) >> remove_parent(current); >> c

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Oleg Nesterov
On 04/06, Oleg Nesterov wrote: > > --- t/kernel/exit.c~ 2007-04-06 23:31:31.0 +0400 > +++ t/kernel/exit.c 2007-04-06 23:31:57.0 +0400 > @@ -275,10 +275,7 @@ static void reparent_to_init(void) > remove_parent(current); > current->parent = child_reaper(current); >

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Ingo Molnar
* Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > I'd almost prefer to just not add kernel threads to any parent > > process list *at*all*. > > Yes sure, I didn't argue with that. However, "->exit_state = -1" does > matter, we can't detach process unless we make it auto-reap. > Of course, we al

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Christoph Hellwig <[EMAIL PROTECTED]> writes: > As all kernel thread (1) should be converted to kthread anyway for > proper containers support and general "let's get rid of a crappy API' > cleanups I think that's enough. It would be nice to have SGI helping > to convert more drivers over to the p

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Oleg Nesterov
On 04/06, Linus Torvalds wrote: > > On Fri, 6 Apr 2007, Oleg Nesterov wrote: > > > > Oops. I misread stop_machine(), it does kernel_thread(), not > > kthread_create(). > > So "stopmachine" threads are all re-parented to init when the caller exits. > > I think it makes sense to set ->exit_state =

Re: init's children list is long and slows reaping children.

2007-04-06 Thread H. Peter Anvin
Eric W. Biederman wrote: Are you saying waitpid() (wait4) *with a pid specified* can return another pid? That definitely sounds like a bug. No. For the full context look back a couple of messages. I'm guessing the issue is nash just calls wait and doesn't check the returned pid value, assum

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Ingo Molnar
* Davide Libenzi wrote: > > > Ohhh, the "signal" struct! Funny name for something that nowadays > > > has probably no more than a 5% affinity with signal-related tasks > > > :/ > > > > Hmm. I wonder if we should just rename it the struct thread_group, > > or struct task_group. Those seem s

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Davide Libenzi
On Fri, 6 Apr 2007, Eric W. Biederman wrote: > Davide Libenzi writes: > > > On Fri, 6 Apr 2007, Oleg Nesterov wrote: > > > >> Sure. It would be nice to move ->children into signal_struct at first. > >> Except this change breaks (in fact fixes) ->pdeath_signal behaviour. > > > > Ohhh, the "signal

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
"H. Peter Anvin" <[EMAIL PROTECTED]> writes: > Eric W. Biederman wrote: >> >> Oleg is coming from a different case where it was found that exiting kernel >> threads were causing problems for nash when nash was run as init in an >> initramfs. While I think that case is likely a user space bug beca

Re: init's children list is long and slows reaping children.

2007-04-06 Thread H. Peter Anvin
Eric W. Biederman wrote: Oleg is coming from a different case where it was found that exiting kernel threads were causing problems for nash when nash was run as init in an initramfs. While I think that case is likely a user space bug because nash should check the pid from waitpid before assumin

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Davide Libenzi writes: > On Fri, 6 Apr 2007, Oleg Nesterov wrote: > >> Sure. It would be nice to move ->children into signal_struct at first. >> Except this change breaks (in fact fixes) ->pdeath_signal behaviour. > > Ohhh, the "signal" struct! Funny name for something that nowadays has > probab

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > no. Two _completely separate_ lists. > > i.e. a to-be-reaped task will still be on the main list _too_. The main > list is for all the PID semantics rules. The reap-list is just for > wait4() processing. The two would be completely separate. And what pr

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Davide Libenzi
On Fri, 6 Apr 2007, Oleg Nesterov wrote: > Sure. It would be nice to move ->children into signal_struct at first. > Except this change breaks (in fact fixes) ->pdeath_signal behaviour. Ohhh, the "signal" struct! Funny name for something that nowadays has probably no more than a 5% affinity with

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Roland Dreier <[EMAIL PROTECTED]> writes: > > no. Two _completely separate_ lists. > > > > i.e. a to-be-reaped task will still be on the main list _too_. The main > > list is for all the PID semantics rules. The reap-list is just for > > wait4() processing. The two would be completely sepa

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Christoph Hellwig
On Thu, Apr 05, 2007 at 06:29:16PM -0700, Linus Torvalds wrote: > > The support angel on my shoulder says we should just put all the kernel > > threads under a kthread subtree to shorten init's child list and minimize > > impact. > > A number are already there, of course, since they use the kthrea

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes: > On 04/06, Eric W. Biederman wrote: >> >> Thinking about it I do agree with Linus that two lists sounds like the >> right solution because it ensures we always have O(1) time when >> waiting for a zombie. > > Well. I bet this will be painful, and will ugl

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Roland Dreier
> no. Two _completely separate_ lists. > > i.e. a to-be-reaped task will still be on the main list _too_. The main > list is for all the PID semantics rules. The reap-list is just for > wait4() processing. The two would be completely separate. I guess this means we add another list head to
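The two-list idea quoted above can be modelled in user space. The following is a toy sketch, not kernel code: struct toy_task, its fields, and the toy_* functions are all hypothetical, and the list helpers only imitate the style of the kernel's list_head. Each child stays on the parent's children list for its whole life and is additionally linked on a separate zombies list when it exits, so the reaper takes the head of that list instead of scanning every child.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *h) { h->prev = h->next = h; }
static int list_empty(const struct list_node *h) { return h->next == h; }
static void list_add_tail(struct list_node *n, struct list_node *h)
{
    n->prev = h->prev; n->next = h;
    h->prev->next = n; h->prev = n;
}
static void list_del(struct list_node *n)
{
    n->prev->next = n->next; n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Hypothetical task model with two independent list memberships. */
struct toy_task {
    int pid;
    int exited;
    struct list_node sibling;   /* link in parent's `children` (always) */
    struct list_node reap;      /* link in parent's `zombies` (after exit) */
    struct list_node children;  /* head: every child, for parent/child semantics */
    struct list_node zombies;   /* head: exited children only, for wait */
};

void toy_task_init(struct toy_task *t, int pid)
{
    t->pid = pid; t->exited = 0;
    list_init(&t->sibling); list_init(&t->reap);
    list_init(&t->children); list_init(&t->zombies);
}

void toy_add_child(struct toy_task *parent, struct toy_task *child)
{
    list_add_tail(&child->sibling, &parent->children);
}

/* The child exits: it stays on `children` and also joins `zombies`. */
void toy_exit(struct toy_task *parent, struct toy_task *child)
{
    assert(!child->exited);
    child->exited = 1;
    list_add_tail(&child->reap, &parent->zombies);
}

/* Reap one exited child in constant time, or return NULL if none exited. */
struct toy_task *toy_reap_one(struct toy_task *parent)
{
    struct toy_task *z;
    if (list_empty(&parent->zombies))
        return NULL;
    z = (struct toy_task *)((char *)parent->zombies.next
                            - offsetof(struct toy_task, reap));
    list_del(&z->reap);
    list_del(&z->sibling);
    return z;
}
```

The cost is the one raised in the message above: an extra list link per task. The sketch also ignores locking, ptrace, and thread groups, all of which the real do_wait() has to handle.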

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Linus Torvalds <[EMAIL PROTECTED]> writes: > On Fri, 6 Apr 2007, Oleg Nesterov wrote: >> >> Oops. I misread stop_machine(), it does kernel_thread(), not >> kthread_create(). >> So "stopmachine" threads are all re-parented to init when the caller exits. >> I think it makes sense to set ->exit_sta

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Ingo Molnar
* Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > Thinking about it I do agree with Linus that two lists sounds like > > the right solution because it ensures we always have O(1) time when > > waiting for a zombie. > > Well. I bet this will be painful, and will uglify the code even more. > > do_

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > putting the freshly reaped tasks at the 'head' of the list is just a > fancy (and incomplete) way of splitting the list up into two lists, and > i'd advocate a clean split. Just like we have split the ptrace_list

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > I'd almost prefer to just not add kernel threads to any parent process > list *at*all*. i think part of the problem is the legacy that the list is artificially unified: tasks that 'will possibly exit' are on the same list as tasks that 'have alrea

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Oleg Nesterov
On 04/06, Eric W. Biederman wrote: > > Thinking about it I do agree with Linus that two lists sounds like the > right solution because it ensures we always have O(1) time when > waiting for a zombie. Well. I bet this will be painful, and will uglify the code even more. do_wait() has to iterate ov

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Linus Torvalds
On Fri, 6 Apr 2007, Oleg Nesterov wrote: > > Oops. I misread stop_machine(), it does kernel_thread(), not kthread_create(). > So "stopmachine" threads are all re-parented to init when the caller exits. > I think it makes sense to set ->exit_state = -1 in stopmachine(), regardless > of any other c

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Oleg Nesterov
On 04/06, Eric W. Biederman wrote: > > Oleg Nesterov <[EMAIL PROTECTED]> writes: > > >> At first glance your patch looks reasonable. > >> > >> Unfortunately it only applies to the rare thread that calls daemonize, > >> and not also to kernel/kthread/kthread() which means it will miss many of > >>

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Robin Holt
On Fri, Apr 06, 2007 at 09:38:24AM -0600, Eric W. Biederman wrote: > How hard is tasklist_lock hit on these systems? The major hold-off we are seeing is from tasks reaping children, especially tasks with very large children lists. > How hard is the pid hash hit on these systems? In the little bi

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes: >> At first glance your patch looks reasonable. >> >> Unfortunately it only applies to the rare thread that calls daemonize, >> and not also to kernel/kthread/kthread() which means it will miss many of >> our current kernel threads. > > Note that a thread

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Robin Holt <[EMAIL PROTECTED]> writes: >> So I think we have some options once we get the kernel threads out >> of the way. Getting the kernel threads out of the way would seem >> to be the first priority. > > I think both avenues would probably be the right way to proceed. > Getting kthreads to

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Robin Holt
> So I think we have some options once we get the kernel threads out > of the way. Getting the kernel threads out of the way would seem > to be the first priority. I think both avenues would probably be the right way to proceed. Getting kthreads to not be parented by init would be an opportunity

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Oleg Nesterov
On 04/06, Eric W. Biederman wrote: > > Oleg Nesterov <[EMAIL PROTECTED]> writes: > > > Robin Holt wrote: > >> > >> wait_task_zombie() is taking many seconds to get through the list. > >> For the case of a modprobe, stop_machine creates one thread per cpu > >> (remember big number). All are parente

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes: > Robin Holt wrote: >> >> wait_task_zombie() is taking many seconds to get through the list. >> For the case of a modprobe, stop_machine creates one thread per cpu >> (remember big number). All are parented to init and their exit will >> cause wait_task_zo

Re: init's children list is long and slows reaping children.

2007-04-06 Thread Oleg Nesterov
Robin Holt wrote: > > wait_task_zombie() is taking many seconds to get through the list. > For the case of a modprobe, stop_machine creates one thread per cpu > (remember big number). All are parented to init and their exit will > cause wait_task_zombie to scan multiple times most of the way throug

Re: init's children list is long and slows reaping children.

2007-04-05 Thread Eric W. Biederman
Linus Torvalds <[EMAIL PROTECTED]> writes: > On Thu, 5 Apr 2007, Chris Snook wrote: > >> Linus Torvalds wrote: >> >> > Another thing we could do is to just make sure that kernel threads simply >> > don't end up as children of init. That whole thing is silly, they're really >> > not children of the

Re: init's children list is long and slows reaping children.

2007-04-05 Thread Linus Torvalds
On Thu, 5 Apr 2007, Chris Snook wrote: > Linus Torvalds wrote: > > > Another thing we could do is to just make sure that kernel threads simply > > don't end up as children of init. That whole thing is silly, they're really > > not children of the user-space init anyway. Comments? > > Does anyon

Re: init's children list is long and slows reaping children.

2007-04-05 Thread Chris Snook
Chris Snook wrote: Linus Torvalds wrote: On Thu, 5 Apr 2007, Robin Holt wrote: For testing, Jack Steiner created the following patch. All it does is move tasks which are transitioning to the zombie state from where they are in the children list to the head of the list. In this way, they will

Re: init's children list is long and slows reaping children.

2007-04-05 Thread Chris Snook
Linus Torvalds wrote: On Thu, 5 Apr 2007, Robin Holt wrote: For testing, Jack Steiner created the following patch. All it does is move tasks which are transitioning to the zombie state from where they are in the children list to the head of the list. In this way, they will be the first found

Re: init's children list is long and slows reaping children.

2007-04-05 Thread Linus Torvalds
On Thu, 5 Apr 2007, Robin Holt wrote: > > For testing, Jack Steiner created the following patch. All it does > is move tasks which are transitioning to the zombie state from where > they are in the children list to the head of the list. In this way, > they will be the first found and reaping d

init's children list is long and slows reaping children.

2007-04-05 Thread Robin Holt
We have been testing a new larger configuration and we are seeing a very large scan time of init's tsk->children list. In the cases we are seeing, there are numerous kernel processes created for each cpu (ie: events/0 ... events/, xfslogd/0 ... xfslogd/). These are all on the list ahead of the p
