Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-05 Thread Shailabh Nagar
Chris Sturtivant wrote: > Shailabh Nagar wrote: > >> So here's the sequence of pids being used/hashed etc. Please let >> me know if my assumptions are correct ? >> >> 1. Same listener thread opens 2 sockets >> >> On sockfd1, does a bind() using >> sockaddr_nl.nl_pid = my_pid1 >> On sockfd2, do

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-05 Thread Chris Sturtivant
Shailabh Nagar wrote: So here's the sequence of pids being used/hashed etc. Please let me know if my assumptions are correct ? 1. Same listener thread opens 2 sockets On sockfd1, does a bind() using sockaddr_nl.nl_pid = my_pid1 On sockfd2, does a bind() using sockaddr_nl.nl_pid

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-05 Thread Shailabh Nagar
Jay Lan wrote: Shailabh Nagar wrote: Yes. If no one registers to listen on a particular CPU, data from tasks exiting on that cpu is not sent out at all. Shailabh also wrote: During task exit, kernel goes through each registered listener (small list) and decides which one needs to get thi

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-05 Thread Jay Lan
Shailabh Nagar wrote: > Yes. If no one registers to listen on a particular CPU, data from tasks > exiting on that cpu is not sent out at all. Shailabh also wrote: > During task exit, kernel goes through each registered listener (small > list) and decides which > one needs to get this exit data

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-05 Thread Shailabh Nagar
jamal wrote: > Shailabh, > > On Tue, 2006-04-07 at 12:37 -0400, Shailabh Nagar wrote: > [..] > >>Here's a strawman for the problem we're trying to solve: get >>notification of the close of a NETLINK_GENERIC socket that had >>been used to register interest for some cpus within taskstats. >> >> Fro

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-04 Thread Paul Jackson
pj wrote: > writes the code gets to Never mind that last incomplete post - I hit Send when I meant to hit Cancel. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-04 Thread Paul Jackson
Andrew wrote: > OK, so we're passing in an ASCII string. Fair enough, I think. Paul would > know better. Not sure if I know better - just got stronger opinions. I like the ASCII here - but this is one of those "he who writes the code gets to -- I won't rest till it's the be

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-04 Thread Paul Jackson
Shailabh wrote: > Perhaps I should use the other ascii format for specifying cpumasks > since it's more amenable > to specifying an upper bound for the length of the ascii string and is > more compact ? Eh - basically - I don't have a strong opinion either way. I have a slight esthetic prefe

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-04 Thread jamal
Shailabh, On Tue, 2006-04-07 at 12:37 -0400, Shailabh Nagar wrote: [..] > Here's a strawman for the problem we're trying to solve: get > notification of the close of a NETLINK_GENERIC socket that had > been used to register interest for some cpus within taskstats. > > From looking at the netlink

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-04 Thread Shailabh Nagar
Shailabh Nagar wrote: jamal wrote: On Mon, 2006-03-07 at 18:01 -0700, Andrew Morton wrote: On Mon, 03 Jul 2006 20:54:37 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: What happens when a listener exits without doing deregistration (or if the listener attempts to register another cpumask w

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-04 Thread Shailabh Nagar
jamal wrote: On Mon, 2006-03-07 at 18:01 -0700, Andrew Morton wrote: On Mon, 03 Jul 2006 20:54:37 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: What happens when a listener exits without doing deregistration (or if the listener attempts to register another cpumask while a current registrat

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-04 Thread jamal
On Mon, 2006-03-07 at 18:01 -0700, Andrew Morton wrote: > On Mon, 03 Jul 2006 20:54:37 -0400 > Shailabh Nagar <[EMAIL PROTECTED]> wrote: > > > > What happens when a listener exits without doing deregistration > > > (or if the listener attempts to register another cpumask while a current > > > regi

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Andrew Morton
On Mon, 03 Jul 2006 20:54:37 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: > > What happens when a listener exits without doing deregistration > > (or if the listener attempts to register another cpumask while a current > > registration is still active). > > > ( Jamal, your thoughts on this prob

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Shailabh Nagar
Shailabh Nagar wrote: Andrew Morton wrote: On Fri, 30 Jun 2006 23:37:10 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: Set aside the implementation details and ask "what is a good design"? A kernel-wide constant, whether determined at build-time or by a /proc poke isn't a nice design.

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Andrew Morton
On Mon, 03 Jul 2006 20:13:36 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: > >>+ if (!s) > >>+ return -ENOMEM; > >>+ s->pid = pid; > >>+ INIT_LIST_HEAD(&s->list); > >>+ > >>+ down_write(sem); > >>+

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Shailabh Nagar
Andrew Morton wrote: On Mon, 03 Jul 2006 17:11:59 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: static inline void taskstats_exit_alloc(struct taskstats **ptidstats) { *ptidstats = NULL; - if (taskstats_has_listeners()) + if (!list_empty(&get_cpu_var(listener_list)))

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Shailabh Nagar
Paul Jackson wrote: Shailabh wrote: I don't know if there are buffer overflow issues in passing a string I don't know if this comment applies to "the standard netlink way of passing it up using NLA_STRING", but the way I deal with buffer length issues in the cpuset code is to insist t

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Andrew Morton
On Mon, 03 Jul 2006 17:11:59 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: > >>So the strawman is: > >>Listener bind()s to genetlink using its real pid. > >>Sends a separate "registration" message with cpumask to listen to. > >>Kernel stores (real) pid and cpumask. > >>During task exit, kernel
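
The strawman quoted above (register a pid plus a cpumask, then at task exit walk a small list of listeners and pick the ones that want data from the exiting CPU) can be sketched roughly as follows. This is a user-space illustration only, not the code from the patch: `struct listener`, `struct registry`, `registry_add`, `registry_match`, and the 64-CPU bitmask are all hypothetical names and simplifications, and a flat table stands in for whatever list structure the kernel side actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LISTENERS 8   /* hypothetical bound; the thread only says "small list" */

/* Hypothetical record: one entry per registered listener. */
struct listener {
    uint32_t pid;      /* real pid the listener used in bind() */
    uint64_t cpumask;  /* bit n set => wants exit data from CPU n (64 CPUs max here) */
};

struct registry {
    struct listener entries[MAX_LISTENERS];
    size_t count;
};

/* Store (pid, cpumask); returns 0 on success, -1 if the table is full. */
int registry_add(struct registry *r, uint32_t pid, uint64_t cpumask)
{
    if (r->count == MAX_LISTENERS)
        return -1;
    r->entries[r->count].pid = pid;
    r->entries[r->count].cpumask = cpumask;
    r->count++;
    return 0;
}

/*
 * Called when a task exits on `cpu`: walk the list and collect the pids of
 * listeners whose mask covers that CPU. Returns how many pids were written.
 * If nobody registered for this CPU the result is 0 and nothing is sent.
 */
size_t registry_match(const struct registry *r, unsigned cpu,
                      uint32_t *out, size_t out_len)
{
    size_t n = 0;

    assert(cpu < 64);  /* limit of this sketch's bitmask, not of the kernel */
    for (size_t i = 0; i < r->count && n < out_len; i++) {
        if (r->entries[i].cpumask & (UINT64_C(1) << cpu))
            out[n++] = r->entries[i].pid;
    }
    return n;
}
```

With two listeners registered for masks 0x3 and 0xC, an exit on CPU 1 would match only the first, and an exit on CPU 5 would match neither.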

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Shailabh Nagar
Andrew Morton wrote: On Fri, 30 Jun 2006 23:37:10 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: Set aside the implementation details and ask "what is a good design"? A kernel-wide constant, whether determined at build-time or by a /proc poke isn't a nice design. Can we permit userspace

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Paul Jackson
Shailabh wrote: > I don't know if there are buffer overflow > issues in passing a string I don't know if this comment applies to "the standard netlink way of passing it up using NLA_STRING", but the way I deal with buffer length issues in the cpuset code is to insist that the user code express th

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Paul Jackson
Shailabh wrote: > Yes. If no one registers to listen on a particular CPU, data from tasks > exiting on that cpu is not sent out at all. Excellent. > So I chose to use the "cpulist" ascii format that has been helpfully > provided in include/linux/cpumask.h (by whom I wonder :-) Excellent. --
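
For readers unfamiliar with the "cpulist" ASCII format mentioned above: it is a comma-separated list of decimal CPU numbers and inclusive ranges, such as "0-3,8". The sketch below is a stand-alone user-space parser for that format, offered as an illustration rather than as the kernel's own implementation; `parse_cpulist` and the 64-CPU limit are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a cpulist string such as "0-3,8" into a 64-bit mask.
 * Returns 0 on success, -1 on malformed input or a CPU number >= 64.
 * (Hypothetical helper; the 64-CPU cap is a limit of this sketch only.)
 */
int parse_cpulist(const char *s, uint64_t *mask)
{
    uint64_t m = 0;

    assert(s != NULL && mask != NULL);
    if (*s == '\0')
        return -1;                      /* empty list rejected in this sketch */

    while (*s) {
        char *end;
        unsigned long lo, hi;

        if (*s < '0' || *s > '9')
            return -1;                  /* every item must start with a digit */
        lo = strtoul(s, &end, 10);
        hi = lo;
        s = end;

        if (*s == '-') {                /* optional "-hi" part of a range */
            s++;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
            s = end;
        }
        if (lo > hi || hi >= 64)
            return -1;

        for (unsigned long c = lo; c <= hi; c++)
            m |= UINT64_C(1) << c;

        if (*s == ',') {
            s++;
            if (*s == '\0')
                return -1;              /* trailing comma */
        } else if (*s != '\0') {
            return -1;                  /* unexpected character */
        }
    }
    *mask = m;
    return 0;
}
```

Because every item is a bounded decimal number or range, a receiver can reject oversized or malformed strings up front, which is relevant to the buffer-length concerns raised elsewhere in the thread.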

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-03 Thread Shailabh Nagar
Paul Jackson wrote: Shailabh wrote: Sends a separate "registration" message with cpumask to listen to. Kernel stores (real) pid and cpumask. Question: Ah - good. So this means that I could configure a system with a fork/exit intensive, performance critical job on some dedi

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-07-02 Thread Paul Jackson
Shailabh wrote: > Sends a separate "registration" message with cpumask to listen to. > Kernel stores (real) pid and cpumask. Question: Ah - good. So this means that I could configure a system with a fork/exit intensive, performance critical job on some dedicated CPUs, and be able to c

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Andrew Morton
On Fri, 30 Jun 2006 23:37:10 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: > >Set aside the implementation details and ask "what is a good design"? > > > >A kernel-wide constant, whether determined at build-time or by a /proc poke > >isn't a nice design. > > > >Can we permit userspace to send in

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Shailabh Nagar
Andrew Morton wrote: On Fri, 30 Jun 2006 22:20:23 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: If we're going to abuse nl_pid then how about we design things so that nl_pid is treated as two 16-bit words - one word is the start CPU and the other word is the end cpu? Or, if a 65536-CPU l

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Andrew Morton
On Fri, 30 Jun 2006 22:20:23 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: > >If we're going to abuse nl_pid then how about we design things so that > >nl_pid is treated as two 16-bit words - one word is the start CPU and the > >other word is the end cpu? > > > >Or, if a 65536-CPU limit is too s
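
The suggestion above, treating the 32-bit nl_pid as two 16-bit words holding a start CPU and an end CPU, amounts to simple bit packing. A minimal sketch follows; the helper names are hypothetical, and putting the start CPU in the high half is an assumption, since the message does not say which word holds which value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack an inclusive CPU range into one 32-bit id.
 * Assumed layout: high 16 bits = start CPU, low 16 bits = end CPU.
 */
uint32_t cpu_range_pack(uint16_t start, uint16_t end)
{
    assert(start <= end);
    return ((uint32_t)start << 16) | end;
}

uint16_t cpu_range_start(uint32_t id)
{
    return (uint16_t)(id >> 16);
}

uint16_t cpu_range_end(uint32_t id)
{
    return (uint16_t)(id & 0xFFFF);
}

/* Nonzero if `cpu` falls inside the range encoded in `id`. */
int cpu_in_range(uint32_t id, unsigned cpu)
{
    return cpu >= cpu_range_start(id) && cpu <= cpu_range_end(id);
}
```

Each 16-bit word can name CPUs 0 through 65535, which is where the 65536-CPU limit mentioned in the message comes from. The encoding can only express one contiguous range, unlike a cpumask or cpulist.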

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Shailabh Nagar
Andrew Morton wrote: Shailabh Nagar <[EMAIL PROTECTED]> wrote: +/* + * Per-task exit data sent from the kernel to user space + * is tagged by an id based on grouping of cpus. + * + * If userspace specifies a non-zero P as the nl_pid field of + * the sockaddr_nl structure while binding to a n

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Andrew Morton
Shailabh Nagar <[EMAIL PROTECTED]> wrote: > > Based on previous discussions, the above solutions can be expanded/modified > to: > > a) allow userspace to listen to a group of cpus instead of all. Multiple > collection daemons can distribute the load as you pointed out. Doing > collection > by cp

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Andrew Morton
Shailabh Nagar <[EMAIL PROTECTED]> wrote: > > +/* > + * Per-task exit data sent from the kernel to user space > + * is tagged by an id based on grouping of cpus. > + * > + * If userspace specifies a non-zero P as the nl_pid field of > + * the sockaddr_nl structure while binding to a netlink socket,

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread jamal
On Fri, 2006-30-06 at 15:10 -0400, Shailabh Nagar wrote: > > Also to get feedback on this kind of usage of the nl_pid field, the > approach etc. > It does not look unreasonable. I think you may have issues when you have multiple such sockets opened within a single process. But do some testing

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Shailabh Nagar
Shailabh Nagar wrote: > Shailabh Nagar wrote: > > > Index: linux-2.6.17-mm3equiv/kernel/taskstats.c > === > --- linux-2.6.17-mm3equiv.orig/kernel/taskstats.c 2006-06-30 > 11:57:14.0 -0400 > +++ linux-2.6.17-mm3equiv/ke

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Shailabh Nagar
Shailabh Nagar wrote: > Andrew, > > Based on previous discussions, the above solutions can be expanded/modified > to: > > a) allow userspace to listen to a group of cpus instead of all. Multiple > collection daemons can distribute the load as you pointed out. Doing > collection > by cpu groups r

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread Shailabh Nagar
Andrew Morton wrote: > On Thu, 29 Jun 2006 09:44:08 -0700 > Paul Jackson <[EMAIL PROTECTED]> wrote: > > >>>You're probably correct on that model. However, it all depends on the actual >>>workload. Are people who actually have large-CPU (>256) systems actually >>>running fork()-heavy things like web

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-30 Thread jamal
On Thu, 2006-29-06 at 23:01 -0400, Shailabh Nagar wrote: > jamal wrote: > > > > > >>As long as the user is willing to pay the price in terms of memory, > >> > >> > > > >You may wanna draw a line to the upper limit - maybe even allocate slab > >space. > > > > > Didn't quite understand...cou

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Shailabh Nagar
jamal wrote: On Thu, 2006-29-06 at 21:11 -0400, Shailabh Nagar wrote: Andrew Morton wrote: Shailabh Nagar <[EMAIL PROTECTED]> wrote: [..] So if we can detect the silly sustained-high-exit-rate scenario then it seems to me quite legitimate to do some aggressive data reducti

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Paul Jackson
Andrew wrote: > Nah. Stick it in the same cacheline as tasklist_lock (I'm amazed that > we've continued to get away with a global lock for that). Yes - a bit amazing. But no sense compounding the problem now. We shouldn't be adding global locks/modifiable data in the fork/exit code path if we c

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Andrew Morton
On Thu, 29 Jun 2006 19:25:26 -0700 Paul Jackson <[EMAIL PROTECTED]> wrote: > Andrew wrote: > > Like, a single message which says "20,000 sub-millisecond-runtime tasks > > exited in the past second" or something. > > System wide accumulation of such data in the exit() code path still > risks being

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Paul Jackson
Andrew wrote: > Like, a single message which says "20,000 sub-millisecond-runtime tasks > exited in the past second" or something. System wide accumulation of such data in the exit() code path still risks being a bottleneck, just a bit later on. I'm more inclined now to look for ways to disable c

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread jamal
On Thu, 2006-29-06 at 21:11 -0400, Shailabh Nagar wrote: > Andrew Morton wrote: > > >Shailabh Nagar <[EMAIL PROTECTED]> wrote: [..] > >So if we can detect the silly sustained-high-exit-rate scenario then it > >seems to me quite legitimate to do some aggressive data reduction on that. > >Like, a s

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Shailabh Nagar
Andrew Morton wrote: Shailabh Nagar <[EMAIL PROTECTED]> wrote: The rates (or upper bounds) that are being discussed here, as of now, are 1000 exits/sec/CPU for 1024 CPU systems. That would be roughly 1M exits/system * 248Bytes/message = 248 MB/sec. I think it's worth differentiating

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Andrew Morton
Shailabh Nagar <[EMAIL PROTECTED]> wrote: > > The rates (or upper bounds) that are being discussed here, as of now, > are 1000 exits/sec/CPU for > 1024 CPU systems. That would be roughly 1M exits/system * > 248Bytes/message = 248 MB/sec. I think it's worth differentiating between burst rates an
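
The bandwidth estimate quoted above is a straight multiplication, and it can be checked with a few lines of C. `exit_data_rate` is a hypothetical helper written for this illustration. Using the exact CPU count, 1000 exits/sec/CPU x 1024 CPUs x 248 bytes comes to 253,952,000 bytes/sec; the thread's 248 MB/sec figure rounds 1,024,000 exits/sec down to 1M.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sustained exit-data rate in bytes per second, assuming every exit
 * produces exactly one message of bytes_per_msg bytes.
 */
uint64_t exit_data_rate(uint64_t exits_per_sec_per_cpu, uint64_t ncpus,
                        uint64_t bytes_per_msg)
{
    assert(ncpus > 0);
    return exits_per_sec_per_cpu * ncpus * bytes_per_msg;
}
```

Calling `exit_data_rate(1000, 1024, 248)` gives the system-wide figure; dividing by the number of collection daemons gives a rough per-listener load if the CPUs are split evenly among them.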

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Shailabh Nagar
jamal wrote: On Thu, 2006-29-06 at 16:01 -0400, Shailabh Nagar wrote: Jamal, any thoughts on the flow control capabilities of netlink that apply here? Usage of the connection is to supply statistics data to userspace. If you want reliable delivery, then you can't just depend on as

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread jamal
On Thu, 2006-29-06 at 18:13 -0400, Shailabh Nagar wrote: > > And now I remember why I didn't go down that path earlier. Relayfs is one-way > kernel->user and lacks the ability to send query commands from user space > that we need. Either we would need to send commands up through a separate > int

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread jamal
On Thu, 2006-29-06 at 16:01 -0400, Shailabh Nagar wrote: > > Jamal, > any thoughts on the flow control capabilities of netlink that apply here > ? Usage of the connection is to supply statistics data to userspace. > If you want reliable delivery, then you can't just depend on async events from

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Shailabh Nagar
Andrew Morton wrote: >>Yup...the per-cpu, high speed requirements are up relayfs' alley, unless >>Jamal or netlink folks >>are planning something (or can shed light on) how large flows can be >>managed over netlink. I suspect >>this discussion has happened before :-) > > > yeah. And now I rem

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Paul Jackson
Shailabh wrote: > How much memory do these 1024 CPU machines have From: http://www.hpcwire.com/hpc/653963.html (May 12, 2006) SGI has already shipped more than a dozen SGI systems with over a terabyte of memory and about a hundred systems of half a terabyte or larger. But the n

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Shailabh Nagar
Andrew Morton wrote: On Thu, 29 Jun 2006 15:10:31 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: I agree, and I'm viewing this as blocking the taskstats merge. Because if this _is_ a problem then it's a big one because fixing it will be intrusive, and might well involve userspace-visible

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Andrew Morton
On Thu, 29 Jun 2006 15:43:41 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: > >Could be so. But we need to understand how significant the impact of this > >will be in practice. > > > >We could find, once this is deployed in real production environments on > >large machines that the data loss is

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Shailabh Nagar
Andrew Morton wrote: On Thu, 29 Jun 2006 15:10:31 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: I agree, and I'm viewing this as blocking the taskstats merge. Because if this _is_ a problem then it's a big one because fixing it will be intrusive, and might well involve userspace-visible

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Andrew Morton
On Thu, 29 Jun 2006 15:10:31 -0400 Shailabh Nagar <[EMAIL PROTECTED]> wrote: > >I agree, and I'm viewing this as blocking the taskstats merge. Because if > >this _is_ a problem then it's a big one because fixing it will be > >intrusive, and might well involve userspace-visible changes. > > > >

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Paul Jackson
Shailabh wrote: > First off, just a reminder that this is inherently a netlink flow > control issue...which was being exacerbated earlier by taskstats' > decision to send per-tgid data (no longer the case). > > But I'd like to know what's our target here ? How many messages > per second do we want t

Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

2006-06-29 Thread Shailabh Nagar
Andrew Morton wrote: On Thu, 29 Jun 2006 09:44:08 -0700 Paul Jackson <[EMAIL PROTECTED]> wrote: You're probably correct on that model. However, it all depends on the actual workload. Are people who actually have large-CPU (>256) systems actually running fork()-heavy things like webservers o