On Tue, Aug 29, 2017 at 09:38:16PM -0600, David Ahern wrote: > On 8/29/17 8:58 PM, Alexei Starovoitov wrote: > > On Tue, Aug 29, 2017 at 07:03:43PM -0600, David Ahern wrote: > >> On 8/28/17 10:11 PM, Alexei Starovoitov wrote: > >>> > >>> Agree on the above, but you're mixing semantics of the new recurse > >>> flag and implementation of it. Ex: we don't have to copy this flag > >>> from prog->attr into cgroup. So this reset or non-reset discussion > >>> only makes sense in the context of your current implementation. > >>> We can implement the logic differently. Like don't copy that flag > >>> at all and at attach time walk parent->parent->parent and see > >>> what programs are attached. All of them should have prog->attr & > >>> recurse_bit set > >>> In such implementation detach from 'b' is a nop from reset/non-reset > >>> point of view. When socket creation in 'c' is invoked the program > >>> 'c' is called first then the code keeps walking parents until root > >>> invoking 'a' along the way. > >> > >> So you are suggesting there is no recursive flag per cgroup? How do you > >> know you need to walk cgroups? How do you know when to stop running > >> programs? > > > > you're talking about implementation, right? > > My 'proposed' implemenation of walking from cgroup all the way to the root > > is just an example. It's not efficient. More below... > > > >>> I'm not saying it will be an efficient implementation. The point > >>> is to discuss UAPI independent of implementation. > >>> > >>>> ### > >>>> > >>>> Also, let's agree on this intention. Based on the new ground rule, I > >>>> want to point out this example: > >>>> > >>>> If 'a' gets a program installed with no recurse flag set, ONLY processes > >>>> in 'a' have the 'a' program run. Processes in groups 'b', 'c' and 'd' > >>>> all stop at cgroup 'b' program. > >>> > >>> I'm proposing that such situation should not be allowed to happen. > >>> In a->b->c->d cgroup scenario if override+recurse prog attached to 'b' > >>> then only the same override+recurse can be attached to c, d, a. > >>> So at detach time there can be gaps (like only 'b' and 'd' have > >>> override+recurse progs), but walking up until root from any point > >>> will guarantee that only override+recurse programs are seen. > >>> > >> > >> That seems very limiting to me. Seems like you are suggesting the entire > >> cgroup tree is recursive or non-recursive, but never a mix. > > > > Entire cgroup subtree. Yes. It's the simplest uapi I could think of. > > So 10 email exchanges later you agree on the UAPI I implemented in this > patch: user opts in to recursive behavior via a new flag at attach time, > and once a recursive program is installed at some point in the cgroup > tree it applies to all descendant cgroups. > > So all of these exchanges weren't about the UAPI, but your disagreement > in my implementation. The only user visible change here is only programs > marked recursive are run versus going back to the first cgroup marked > non-recursive.
you cannot be serious. Your implemention is not at all what i'm proposing above as a simplest uapi. Should we all go back and re-read from the beginning? > > Easy to understand and argue about and I think it's solving your use case. > > It's also easily extendable. New combination and features won't break > > the users. It feels you're in rush to get this stuff for this merge > > window, therefore I want to agree on something that is simple, > > non-controversial and extensible. > > I am in no-rush, but this does not to fall by the wayside like the net > namespace specification. It's more than that! I think you only looking at it from vrf perspective whereas cgroup-bpf became a corner stone feature and enabler for tcp-bpf which in turn became a stepping stone for bpf_sk_redirect. So no, I'm not going to let something half baked that touches the core idea of cgroup-bpf slide in. Tejun and Andy also need to take a look, so yes it will take long time for everyone to agree on this core uapi. > Given the pending release of 4.13 net-next will close which gives a 2+ > week window to work on v3 before the next merge window. Plenty of time > for me to work it into the many other things on my plate. As I proposed several emails ago, please repost patches 2 and 3 that I already acked and we can continue discussing this patch without the agitation.