On Mon, Jan 21, 2013 at 09:18:38AM -0800, John Fastabend wrote:
> On 01/21/2013 01:57 AM, Li Zefan wrote:
> > On 2013/1/21 17:27, Daniel Wagner wrote:
> >> On 21.01.2013 10:01, Li Zefan wrote:
> >>> On 2013/1/21 16:50, Daniel Wagner wrote:
> >>>> Hi Li,
> >>>>
> >>>> On 21.01.2013 07:08, Li Zefan wrote:
> >>>>> I'm not a network developer, so correct me if I'm wrong.
> >>>>>
> >>>>> Since commit 7955490f732c2b8
> >>>>> ("net: netprio_cgroup: rework update socket logic"), 
> >>>>> sock->sk->sk_cgrp_prioidx
> >>>>> is set when the socket is created, and won't be updated unless the task 
> >>>>> is
> >>>>> moved to another cgroup.
> >>>>>
> >>>>> Now the problem is, a socket can be _shared_ by multiple processes 
> >>>>> (fork, SCM_RIGHT).
> >>>>> If we place those processes in different cgroups, and each cgroup has
> >>>>> different configs, but all of the processes will send data via this 
> >>>>> socket
> >>>>> with the same network priority.
> >>>>
> >>>> Wouldn't that be addressed by 48a87cc26c13b68f6cce4e9d769fcb17a6b3e4b8
> >>>>
> >>>>      net: netprio: fd passed in SCM_RIGHTS datagram not set correctly
> >>>>
> >>>>       A socket fd passed in a SCM_RIGHTS datagram was not getting
> >>>>       updated with the new tasks cgrp prioidx. This leaves IO on
> >>>>       the socket tagged with the old tasks priority.
> >>>>
> >>>>       To fix this add a check in the scm recvmsg path to update the
> >>>>       sock cgrp prioidx with the new tasks value.
> >>>>
> >>>> As I read this this should work for net_prio.
> >>>>
> >>>
> >>> But after process A passed the socket fd to B, both A and B can use the
> >>> same socket to send data, right? Then if A and B were placed in different
> >>> cgroups with differnt configs, A's config won't take effect anymore.
> >>>
> >>> Am I missing something?
> >>
> >> I don't know. I guess at one point the socket resources are shared and then
> >> one configuration is taking preference. As you can see I am far away of
> >> being
> >> an expert in this field. Hopefully someone who understands this bits
> >> can chip in.
> >>
> >> BTW, isn't this a similar to what should happen with the block io cgroup?
> >> What is the behavior with a fd writing to a file in the scenario you
> >> describe above?
> >>
> > 
> > It forbids task moving in this case:
> > 
> > /*
> >   * We cannot support shared io contexts, as we have no mean to support
> >   * two tasks with the same ioc in two different groups without major rework
> >   * of the main cic data structures.  For now we allow a task to change
> >   * its cgroup only if it's the only owner of its ioc.
> >   */
> > static int blkcg_can_attach(struct cgroup *cgrp, struct cgroup_taskset 
> > *tset)
> > 
> 
> OK, I guess we should do something similar in the netprio, netcls
> cgroups and
> yes document it as you noted in your last comment.

Here is my attempt to add such a check. I really don't know if this is the
correct way to do so. To test this I have written a test program, which
seems to test the right thing. Please have a look and let me know if
it is correct: http://www.monom.org/misc/scm_rights.c

And here a dirty first version of the patch:


>From 49a78d907eaf31c16673025e7e3b4844e419e416 Mon Sep 17 00:00:00 2001
From: Daniel Wagner <daniel.wag...@bmw-carit.de>
Date: Tue, 22 Jan 2013 11:08:22 +0100
Subject: [PATCH] net: net_prio: Block attach if a socket is shared

---
 net/core/netprio_cgroup.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 847c02b..de4e6c5 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -274,9 +274,39 @@ static struct cftype ss_files[] = {
        { }     /* terminate */
 };
 
+static int check_cnt(const void *v, struct file *file, unsigned n)
+{
+       unsigned *flag = (unsigned *)v;
+       int err;
+
+       struct socket *sock = sock_from_file(file, &err);
+       if (sock && file_count(file) > 1)
+               *flag = 1;
+
+       return 0;
+}
+
+static int cgrp_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
+{
+       struct task_struct *p;
+       unsigned flag = 0;
+
+       cgroup_taskset_for_each(p, cgrp, tset) {
+               task_lock(p);
+               iterate_fd(p->files, 0, check_cnt, &flag);
+               task_unlock(p);
+       }
+
+       if (flag != 0)
+               return -EINVAL;
+
+       return 0;
+}
+
 struct cgroup_subsys net_prio_subsys = {
        .name           = "net_prio",
        .create         = cgrp_create,
+       .can_attach     = cgrp_can_attach,
        .destroy        = cgrp_destroy,
        .attach         = net_prio_attach,
        .subsys_id      = net_prio_subsys_id,
-- 
1.8.0.rc0



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to