Jamal Hadi Salim wrote:
On Tue, 2006-11-07 at 16:57 -0700, Randy.Dunlap wrote:
so make it a patch to Documentation/networking/...
I was going to when it got in better shape. Good suggestion, I will do
this soon and put it there as a patch.
I have some doc corrections, Jamal. Do I send
n.
This ensures that even if the oom-killer is called, no other task's
exit is blocked as it can still acquire another down_read.
Thanks to Andrew Morton & Herbert Xu for pointing out the oom
related pitfalls, and to Chandra Seetharaman for suggesting this
fix instead of using som
Herbert Xu wrote:
> On Tue, Jul 11, 2006 at 03:57:31AM -0700, Andrew Morton wrote:
>
> down_write(&listeners->sem);
> list_for_each_entry_safe(s, tmp, &listeners->list, list) {
>- ret = genlmsg_unicast(skb, s->pid);
>+ skb_next = NULL;
>+
icast() frees
up the skb passed to it, regardless of status of the send, reuse is bad.
Thanks to Chandra Seetharaman for discovering this bug.
Signed-Off-By: Shailabh Nagar <[EMAIL PROTECTED]>
Signed-Off-By: Chandra Seetharaman <[EMAIL PROTECTED]>
kernel/taskstats.c | 13 +
Andrew Morton wrote:
> Thomas Graf <[EMAIL PROTECTED]> wrote:
>
>>* Shailabh Nagar <[EMAIL PROTECTED]> 2006-07-06 07:37
>>
>>>@@ -37,9 +45,26 @@ static struct nla_policy taskstats_cmd_g
>>> __read_mostly = {
>>> [TASKSTATS_CMD_ATTR_PID]
On Thu, 2006-07-06 at 14:08 +0200, Thomas Graf wrote:
> * Shailabh Nagar <[EMAIL PROTECTED]> 2006-07-06 07:37
> > @@ -37,9 +45,26 @@ static struct nla_policy taskstats_cmd_g
> > __read_mostly = {
> > [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_U32 },
> > [T
, its taskstats data is unicast to each listener
interested in that cpu.
Thanks to Andrew Morton for pointing out the various scalability and
general concerns of previous attempts and for suggesting this design.
Signed-Off-By: Shailabh Nagar <[EMAIL PROTECTED]>
---
Addresses various comme
On Thu, 2006-07-06 at 02:56 -0700, Andrew Morton wrote:
> On Thu, 06 Jul 2006 05:28:35 -0400
> Shailabh Nagar <[EMAIL PROTECTED]> wrote:
>
> > On systems with a large number of cpus, with even a modest rate of
> > tasks exiting per cpu, the volume of taskstats data
, its taskstats data is unicast to each listener
interested in that cpu.
Thanks to Andrew Morton for pointing out the various scalability and
general concerns of previous attempts and for suggesting this design.
Signed-Off-By: Shailabh Nagar <[EMAIL PROTECTED]>
---
Fixes comments by akpm:
Chris Sturtivant wrote:
> Shailabh Nagar wrote:
>
>> So here's the sequence of pids being used/hashed etc. Please let
>> me know if my assumptions are correct ?
>>
>> 1. Same listener thread opens 2 sockets
>>
>> On sockfd1, does a bind() using
>
Jay Lan wrote:
Shailabh Nagar wrote:
Yes. If no one registers to listen on a particular CPU, data from tasks
exiting on that cpu is not sent out at all.
Shailabh also wrote:
During task exit, kernel goes through each registered listener (small
list) and decides which one needs to get
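
A rough userspace model of the per-cpu listener selection described in this message, for illustration only. The struct layout, the 64-bit mask, and the name select_listeners() are assumptions made for this sketch; none of them are taken from the taskstats patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one registered listener: the pid it registered
 * with and the set of cpus it asked to hear about (bit N = cpu N). */
struct listener {
	uint32_t pid;
	uint64_t cpumask;
};

/*
 * Walk the (small) listener list for a task exiting on `cpu` and collect
 * the pids whose mask includes that cpu. Returns the number of pids
 * written to out_pids, which must have room for n entries. A return of 0
 * means nobody registered for this cpu, so nothing would be sent.
 */
static size_t select_listeners(const struct listener *ls, size_t n,
			       unsigned int cpu, uint32_t *out_pids)
{
	size_t hits = 0;

	assert(ls != NULL && out_pids != NULL);
	if (cpu >= 64)		/* outside the range this sketch's mask covers */
		return 0;

	for (size_t i = 0; i < n; i++) {
		if (ls[i].cpumask & (UINT64_C(1) << cpu))
			out_pids[hits++] = ls[i].pid;
	}
	return hits;
}
```

With two listeners registered for cpus {0,1} and {2}, an exit on cpu 1 selects only the first listener, and an exit on cpu 5 selects none.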
jamal wrote:
> Shailabh,
>
> On Tue, 2006-04-07 at 12:37 -0400, Shailabh Nagar wrote:
> [..]
>
>>Here's a strawman for the problem we're trying to solve: get
>>notification of the close of a NETLINK_GENERIC socket that had
>>been used to regis
Shailabh Nagar wrote:
jamal wrote:
On Mon, 2006-03-07 at 18:01 -0700, Andrew Morton wrote:
On Mon, 03 Jul 2006 20:54:37 -0400
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
What happens when a listener exits without doing deregistration
(or if the listener attempts to register another c
jamal wrote:
On Mon, 2006-03-07 at 18:01 -0700, Andrew Morton wrote:
On Mon, 03 Jul 2006 20:54:37 -0400
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
What happens when a listener exits without doing deregistration
(or if the listener attempts to register another cpumask while a c
Shailabh Nagar wrote:
Andrew Morton wrote:
On Fri, 30 Jun 2006 23:37:10 -0400
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
Set aside the implementation details and ask "what is a good design"?
A kernel-wide constant, whether determined at build-time or by a /proc poke
isn
Andrew Morton wrote:
On Mon, 03 Jul 2006 17:11:59 -0400
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
static inline void taskstats_exit_alloc(struct taskstats **ptidstats)
{
*ptidstats = NULL;
- if (taskstats_has_listeners())
+ if (!list_empty(&get_cpu_var(lis
Paul Jackson wrote:
Shailabh wrote:
I don't know if there are buffer overflow
issues in passing a string
I don't know if this comment applies to "the standard netlink way of
passing it up using NLA_STRING", but the way I deal with buffer length
issues in the cpuset code is to insist t
Andrew Morton wrote:
On Fri, 30 Jun 2006 23:37:10 -0400
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
Set aside the implementation details and ask "what is a good design"?
A kernel-wide constant, whether determined at build-time or by a /proc poke
isn't a nice design.
Ca
Paul Jackson wrote:
Shailabh wrote:
Sends a separate "registration" message with cpumask to listen to.
Kernel stores (real) pid and cpumask.
Question:
=
Ah - good.
So this means that I could configure a system with a fork/exit
intensive, performance critical job on some dedi
Andrew Morton wrote:
On Fri, 30 Jun 2006 22:20:23 -0400
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
If we're going to abuse nl_pid then how about we design things so that
nl_pid is treated as two 16-bit words - one word is the start CPU and the
other word is the end cpu?
Or, if
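
A minimal sketch of the "two 16-bit words" idea floated above, assuming the start cpu sits in the upper half of a 32-bit nl_pid-sized value and the end cpu in the lower half. The thread does not say which half holds which, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an inclusive cpu range into one 32-bit id (assumed layout:
 * start cpu in bits 31..16, end cpu in bits 15..0). */
static inline uint32_t cpu_range_pack(uint16_t start_cpu, uint16_t end_cpu)
{
	assert(start_cpu <= end_cpu);
	return ((uint32_t)start_cpu << 16) | end_cpu;
}

static inline uint16_t cpu_range_start(uint32_t id)
{
	return (uint16_t)(id >> 16);
}

static inline uint16_t cpu_range_end(uint32_t id)
{
	return (uint16_t)(id & 0xffff);
}

/* Nonzero if `cpu` falls inside the range encoded in `id`. */
static inline int cpu_range_contains(uint32_t id, unsigned int cpu)
{
	return cpu >= cpu_range_start(id) && cpu <= cpu_range_end(id);
}
```

Under this layout a listener interested in cpus 2 through 5 would use the value 0x00020005. The encoding caps cpu numbers at 65535 and can only express one contiguous range per socket.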
Andrew Morton wrote:
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
+/*
+ * Per-task exit data sent from the kernel to user space
+ * is tagged by an id based on grouping of cpus.
+ *
+ * If userspace specifies a non-zero P as the nl_pid field of
+ * the sockaddr_nl structure while bindin
Shailabh Nagar wrote:
> Shailabh Nagar wrote:
>
>
> Index: linux-2.6.17-mm3equiv/kernel/taskstats.c
> ===
> --- linux-2.6.17-mm3equiv.orig/kernel/taskstats.c 2006-06-30
> 11:57:14.0 -0400
> +
Shailabh Nagar wrote:
> Andrew,
>
> Based on previous discussions, the above solutions can be expanded/modified
> to:
>
> a) allow userspace to listen to a group of cpus instead of all. Multiple
> collection daemons can distribute the load as you pointed out. Doing
> co
Andrew Morton wrote:
> On Thu, 29 Jun 2006 09:44:08 -0700
> Paul Jackson <[EMAIL PROTECTED]> wrote:
>
>
>>>You're probably correct on that model. However, it all depends on the actual
>>>workload. Are people who actually have large-CPU (>256) systems actually
>>>running fork()-heavy things like web
jamal wrote:
On Thu, 2006-29-06 at 21:11 -0400, Shailabh Nagar wrote:
Andrew Morton wrote:
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
[..]
So if we can detect the silly sustained-high-exit-rate scenario then it
seems to me quite legitimate to do some aggressiv
Andrew Morton wrote:
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
The rates (or upper bounds) that are being discussed here, as of now,
are 1000 exits/sec/CPU for 1024 CPU systems. That would be roughly
1M exits/sec per system * 248 bytes/message = 248 MB/sec.
I think it
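
The bandwidth estimate quoted above can be checked with a few lines of C. This is only a worked version of the arithmetic in the message; the function name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Aggregate exit-data volume in bytes per second for a system with
 * `cpus` cpus, each seeing `exits_per_sec_per_cpu` task exits, where
 * every exit produces one message of `bytes_per_msg` bytes. */
static uint64_t exit_data_bytes_per_sec(uint64_t cpus,
					uint64_t exits_per_sec_per_cpu,
					uint64_t bytes_per_msg)
{
	return cpus * exits_per_sec_per_cpu * bytes_per_msg;
}
```

For 1024 cpus, 1000 exits/sec/cpu and 248-byte messages this gives 253,952,000 bytes/sec, about 254 MB/sec. The 248 MB/sec figure in the message comes from rounding 1024 * 1000 down to 1M exits/sec.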
jamal wrote:
On Thu, 2006-29-06 at 16:01 -0400, Shailabh Nagar wrote:
Jamal,
any thoughts on the flow control capabilities of netlink that apply here
? Usage of the connection is to supply statistics data to userspace.
if you want reliable delivery, then you can't just depend on
Andrew Morton wrote:
>>Yup...the per-cpu, high speed requirements are up relayfs' alley, unless
>>Jamal or netlink folks are planning something (or can shed light on) how
>>large flows can be managed over netlink. I suspect this discussion has
>>happened before :-)
>
>
> yeah.
And now I rem
Andrew Morton wrote:
On Thu, 29 Jun 2006 15:10:31 -0400
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
I agree, and I'm viewing this as blocking the taskstats merge. Because if
this _is_ a problem then it's a big one because fixing it will be
intrusive, and might well involve u
Andrew Morton wrote:
On Thu, 29 Jun 2006 09:44:08 -0700
Paul Jackson <[EMAIL PROTECTED]> wrote:
You're probably correct on that model. However, it all depends on the actual
workload. Are people who actually have large-CPU (>256) systems actually
running fork()-heavy things like webservers o
jamal wrote:
> Folks,
>
> Attached is a document that should help people wishing to use generic
> netlink interface. It is a WIP so a lot more to go if i see interest.
> The doc has been around for a while, i spent part of yesterday and this
> morning cleaning it up. If you have sent me comments b
jamal wrote:
> On Mon, 2006-19-06 at 11:13 -0400, James Morris wrote:
>
>
>>It seems that TIPC is multiplexing all of its commands through
>>TIPC_GENL_CMD.
>
>
>
> TIPC is a deviation; they had the 100 ioctls and therefore did a direct
> one-to-one mapping.
>
>
>>I wonder, if this is how
genetlink-utils.patch
Two utilities for simplifying usage of NETLINK_GENERIC
interface.
Signed-off-by: Balbir Singh <[EMAIL PROTECTED]>
Signed-off-by: Shailabh Nagar <[EMAIL PROTECTED]>
include/net/genetlink.h | 20
1 files changed, 20 insertions(+)
Index:
Balbir Singh wrote:
Hi, Andrew
On Wed, Mar 29, 2006 at 09:04:06PM -0800, Andrew Morton wrote:
Shailabh Nagar <[EMAIL PROTECTED]> wrote:
delayacct-genetlink.patch
Create a generic netlink interface (NETLINK_GENERIC family),
called "taskstats", for getting delay and c
delayacct-genetlink.patch
Create a generic netlink interface (NETLINK_GENERIC family),
called "taskstats", for getting delay and cpu statistics of
tasks and thread groups during their lifetime and when they exit.
Signed-off-by: Shailabh Nagar <[EMAIL PROTECTED]>
Signed-off-
Matt Helsley wrote:
On Mon, 2006-03-13 at 19:56 -0500, Shailabh Nagar wrote:
Comments addressed (all in response to Jamal)
- Eliminated TASKSTATS_CMD_LISTEN and TASKSTATS_CMD_IGNORE
The enums for these are still in the patch. See below.
+/*
+ * Commands sent from userspace
jamal wrote:
On Mon, 2006-13-03 at 18:33 -0800, Matt Helsley wrote:
On Mon, 2006-03-13 at 19:56 -0500, Shailabh Nagar wrote:
Jamal, was your Mon, 13 Mar 2006 21:29:09 -0500 reply:
Note, you are still not following the standard scheme of doing things.
Example: using command
is versioned and the command interface easily extensible to
facilitate reuse.
If reuse is not deemed useful enough, the naming, placement of functions
and config options will be modified to make this an interface for delay
accounting alone.
Signed-off-by: Shailabh Nagar <[EMAIL PROTECTED]>
S
events generated due
to task exit.
4. The taskstats and taskstats_reply structures are now 64 bit aligned.
5. Family id is dynamically generated.
Please let us know if we missed something out.
Thanks,
Balbir
Signed-off-by: Shailabh Nagar <[EMAIL PROTECTED]>
Signed-off-by: Balbir Singh
jamal wrote:
On Mon, 2006-06-03 at 12:00 -0500, Shailabh Nagar wrote:
My design was to have the listener get both responses (what I call
replies in the code) as well as events (data sent on exit of pid)
I think i may not be doing justice explaining this, so let me be more
Jamal,
Pls keep lkml and lse-tech on cc since some of this affects the usage
of delay accounting.
jamal wrote:
Hi Shailabh,
Apologies for taking a week to respond ..
On Mon, 2006-27-02 at 15:26 -0500, Shailabh Nagar wrote:
jamal wrote:
Yes, the current intent is to allow
jamal wrote:
+
+/*
+ * Commands sent from userspace
+ * Not versioned. New commands should only be inserted at the enum's end
+ */
+
+enum {
+ TASKSTATS_CMD_UNSPEC, /* Reserved */
+ TASKSTATS_CMD_NONE, /* Not a valid cmd to send
+
jamal wrote:
On Mon, 2006-27-02 at 03:31 -0500, Shailabh Nagar wrote:
+#define TASKSTATS_LISTEN_GROUP 0x1
You do multicast to this group - does this mean there could be multiple
listeners subscribed for this event?
Yes, the current intent is to allow multiple listeners to receive
and interface easily extensible to
facilitate reuse.
If reuse is not deemed useful enough, the naming, placement of functions
and config options will be modified to make this an interface for delay
accounting alone.
Signed-off-by: Shailabh Nagar <[EMAIL PROTECTED]>
Signed-off-by: Balb