Alun Evans <a...@badgerous.net> writes:

> Hi Eric,
>
>
> On Tue, 19 Feb 2019, Eric W. Biederman <ebied...@xmission.com> wrote:
>>
>> David Howells <dhowe...@redhat.com> writes:
>>
>> > Provide a system call to open a socket inside of a container, using that
>> > container's network namespace.  This allows netlink to be used to manage
>> > the container.
>> >
>> >     fd = container_socket(int container_fd,
>> >                           int domain, int type, int protocol);
>> >
>>
>> Nacked-by: "Eric W. Biederman" <ebied...@xmission.com>
>>
>> Use a namespace file descriptor if you need this.  So far we have not
>> added this system call as it is just a performance optimization.  And it
>> has been too niche to matter.
>>
>> If this that has changed we can add this separately from everything else
>> you are doing here.
>
> I think I've found the niche.
>
>
> I'm trying to use network namespaces from Go.

Yes.  Go sucks for this.

> Since setns is thread
> specific, I'm forced to use this pattern:
>
>     runtime.LockOSThread()
>     defer runtime.UnlockOSThread()
>     …
>     err = netns.Set(newns)
>
>
> This is only safe recently:
> https://github.com/vishvananda/netns/issues/17#issuecomment-367325770
>
> - but is still less than ideal performance wise, as it locks out other
>   socket operations.
>
> The socketat() / socketns() would be ideal:
>
> https://lwn.net/Articles/406684/
> https://lwn.net/Articles/407495/
> https://lkml.org/lkml/2011/10/3/220
>
>
> One thing that is interesting, the LockOSThread works pretty well for
> receiving, since I can wrap it around the socket()/bind()/listen() at
> startup. Then accept() can run outside of the lock.
>
> It's creating new outbound tcp connections via socket()/connect() pairs
> that is the issue.

As I understand it you should be able to write socketat in go something
like:

    runtime.LockOSThread()
    err = netns.Set(newns);
    fd = socket(...);
    err = netns.Set(defaultns);
    runtime.UnlockOSThread()

I have no real objections to a kernel system call doing that.  It has
just never risen to the level where it was necessary to optimize
userspace yet.

Eric