Re: [lxc-devel] namespaces and lxc
Hello,
I read your posts, thanks a lot for the good and detailed info and examples.

>When a new user namespace is created, the task populating it starts
>as userid -1, nobody.

I don't understand something: why is nobody userid -1?
On Fedora 18 we have:
cat /etc/passwd | grep nobody
nobody:x:99:99:Nobody:/:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin

Now it seems to me that the userid of nobody is 99 here, according to this doc
about the /etc/passwd format:
http://www.cyberciti.biz/faq/understanding-etcpasswd-file-format/

regards,
Andy

On Fri, Apr 19, 2013 at 5:54 PM, Andy Johnson wrote:
> Hello,
> Thanks a lot for your very detailed answer and quick response!
>
> Best,
> Andy
>
> On Fri, Apr 19, 2013 at 5:18 PM, Serge Hallyn wrote:
>
>> Quoting Andy Johnson (johnson...@gmail.com):
>> > Hello,
>> >
>> > Question about namespaces and lxc:
>> >
>> > I see that there is a tool named lxc-unshare, which is (according to
>> > https://help.ubuntu.com/12.04/serverguide/lxc.html) for
>> > testing and in fact calls the clone() syscall (via lxc_clone())
>> > and not the unshare() syscall.
>>
>> lxc-unshare will be deprecated soon, as there is an 'unshare' command
>> in util-linux.
>>
>> > While looking in the code for namespaces usage, I saw that in
>> > lxc_attach_to_ns() there is a call to setns(). But I am not sure
>> > whether this is used.
>>
>> clone and unshare create new namespaces. setns() attaches to an
>> existing namespace.
>>
>> > Usage of cgroups in lxc is known.
>> >
>> > Regarding namespaces: does lxc support all six namespaces? Are there
>> > examples of *.conf files/links for using namespaces?
>>
>> All namespaces are used. uts, pid, ipc and mounts are always unshared.
>> netns is not unshared if you don't specify any 'lxc.network.type' in
>> your .conf. user is not unshared if you don't list any lxc.id_map
>> entries. Both are described in the lxc.conf(5) man page.
>>
>> > Is there support for the user namespace?
>>
>> Very basic support - for creating a mapped user namespace when starting
>> as the root user - is there. More advanced support for user namespace
>> is in the works. In particular we want unprivileged users to be able
>> to create and start containers in user namespaces, but there is work
>> left to be done.
>>
>> http://s3hh.wordpress.com/2012/10/31/full-ubuntu-container-confined-in-a-user-namespace/
>> http://s3hh.wordpress.com/2013/03/07/experimenting-with-user-namespaces/
>> http://s3hh.wordpress.com/2013/02/12/user-namespaces-lxc-meeting/
>>
>> The last link in particular leads to some discussion of where we want
>> to go and what's left to do.
>>
>> -serge
>>

--
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis & visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel
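Serge's rule for which namespaces get unshared can be written down as a small flag computation. This is only a sketch: container_clone_flags() and its two parameters are hypothetical names, not lxc functions, and it assumes the CLONE_NEW* constants from the kernel's <linux/sched.h> are available.

```c
#include <linux/sched.h>   /* CLONE_NEW* namespace flags */

/*
 * Hypothetical helper, not lxc code: build a clone() flag set that
 * follows the rules described in the reply above.
 */
unsigned long container_clone_flags(int has_network_type, int has_id_map)
{
    /* uts, pid, ipc and mount namespaces are always unshared */
    unsigned long flags = CLONE_NEWUTS | CLONE_NEWPID |
                          CLONE_NEWIPC | CLONE_NEWNS;

    /* netns only when the config has an lxc.network.type entry */
    if (has_network_type)
        flags |= CLONE_NEWNET;

    /* userns only when the config lists lxc.id_map entries */
    if (has_id_map)
        flags |= CLONE_NEWUSER;

    return flags;
}
```

The resulting value is what one would pass to clone() or unshare(); attaching to an already existing namespace goes through setns() instead and needs no such flag set.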
[lxc-devel] make lxc_af_unix_open() safely return error on long pathnames
Signed-off-by: Dwight Engen
---
 src/lxc/af_unix.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/lxc/af_unix.c b/src/lxc/af_unix.c
index eff13d4..45fe128 100644
--- a/src/lxc/af_unix.c
+++ b/src/lxc/af_unix.c
@@ -36,6 +36,7 @@ lxc_log_define(lxc_af_unix, lxc);
 int lxc_af_unix_open(const char *path, int type, int flags)
 {
 	int fd;
+	size_t len;
 	struct sockaddr_un addr;
 
 	if (flags & O_TRUNC)
@@ -52,8 +53,16 @@ int lxc_af_unix_open(const char *path, int type, int flags)
 
 	addr.sun_family = AF_UNIX;
 	/* copy entire buffer in case of abstract socket */
-	memcpy(addr.sun_path, path,
-	       path[0]?strlen(path):sizeof(addr.sun_path));
+	len = sizeof(addr.sun_path);
+	if (path[0]) {
+		len = strlen(path);
+		if (len >= sizeof(addr.sun_path)) {
+			close(fd);
+			errno = ENAMETOOLONG;
+			return -1;
+		}
+	}
+	memcpy(addr.sun_path, path, len);
 
 	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr))) {
 		int tmp = errno;
@@ -61,7 +70,7 @@ int lxc_af_unix_open(const char *path, int type, int flags)
 		errno = tmp;
 		return -1;
 	}
-	
+
 	if (type == SOCK_STREAM && listen(fd, 100)) {
 		int tmp = errno;
 		close(fd);
@@ -76,7 +85,7 @@ int lxc_af_unix_close(int fd)
 {
 	struct sockaddr_un addr;
 	socklen_t addrlen = sizeof(addr);
-	
+
 	if (!getsockname(fd, (struct sockaddr *)&addr, &addrlen) &&
 	    addr.sun_path[0])
 		unlink(addr.sun_path);
-- 
1.8.1.4
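The length check from the patch can be tried in isolation. The sketch below is not lxc code: fill_unix_addr() is a hypothetical helper that handles only filesystem paths (the abstract-socket case, where path[0] is '\0', is left out), and it fails with ENAMETOOLONG in the same situation the patch guards against.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper, not part of lxc: fill a sockaddr_un from a
 * filesystem path, refusing paths that do not fit in sun_path
 * together with their terminating NUL.
 * Returns 0 on success, -1 with errno = ENAMETOOLONG otherwise.
 */
int fill_unix_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len = strlen(path);

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;

    if (len >= sizeof(addr->sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* the buffer was zeroed above, so the copy stays NUL-terminated */
    memcpy(addr->sun_path, path, len);
    return 0;
}
```

Using `>=` rather than `>` keeps one byte free for the terminating NUL, which is what lets a later unlink(addr.sun_path) treat the field as a C string.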
Re: [lxc-devel] namespaces and lxc
Quoting Andy Johnson (johnson...@gmail.com):
> Hello,
> I read your posts, thanks a lot for the good and detailed info and examples.
>
> >When a new user namespace is created, the task populating it starts
> >as userid -1, nobody.
>
> I don't understand something: why is nobody userid -1?
> On Fedora 18 we have:
> cat /etc/passwd | grep nobody
> nobody:x:99:99:Nobody:/:/sbin/nologin
> nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
>
> Now it seems to me that the userid of nobody is 99 here, according to this doc
> about the /etc/passwd format:
> http://www.cyberciti.biz/faq/understanding-etcpasswd-file-format/

The name doesn't matter. The number matters. The number is known in
the kernel. The name is purely up to userspace.
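To illustrate that the kernel deals only in numbers: the sketch below asks userspace (getpwuid(), which consults /etc/passwd or whatever name service is configured) for a name, and falls back to printing the bare number when userspace has no entry. uid_label() is a hypothetical helper name used only for this example.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: write the passwd name for uid into buf, or the
 * decimal number if userspace has no entry for it.
 * Returns 1 if a name was found, 0 if only the number is available.
 */
int uid_label(uid_t uid, char *buf, size_t buflen)
{
    struct passwd *pw = getpwuid(uid);

    if (pw) {
        snprintf(buf, buflen, "%s", pw->pw_name);
        return 1;
    }

    /* no name in userspace; the number is still a perfectly valid uid */
    snprintf(buf, buflen, "%lu", (unsigned long)uid);
    return 0;
}
```

Two distributions can map the name "nobody" to different numbers (99 and 65534 in the Fedora output quoted above) without the kernel noticing, because the kernel never sees the name.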
Re: [lxc-devel] make lxc_af_unix_open() safely return error on long pathnames
Quoting Dwight Engen (dwight.en...@oracle.com):
> Signed-off-by: Dwight Engen

Thanks.

Acked-by: Serge E. Hallyn

> ---
>  src/lxc/af_unix.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/lxc/af_unix.c b/src/lxc/af_unix.c
> index eff13d4..45fe128 100644
> --- a/src/lxc/af_unix.c
> +++ b/src/lxc/af_unix.c
> @@ -36,6 +36,7 @@ lxc_log_define(lxc_af_unix, lxc);
>  int lxc_af_unix_open(const char *path, int type, int flags)
>  {
>  	int fd;
> +	size_t len;
>  	struct sockaddr_un addr;
>
>  	if (flags & O_TRUNC)
> @@ -52,8 +53,16 @@ int lxc_af_unix_open(const char *path, int type, int flags)
>
>  	addr.sun_family = AF_UNIX;
>  	/* copy entire buffer in case of abstract socket */
> -	memcpy(addr.sun_path, path,
> -	       path[0]?strlen(path):sizeof(addr.sun_path));
> +	len = sizeof(addr.sun_path);
> +	if (path[0]) {
> +		len = strlen(path);
> +		if (len >= sizeof(addr.sun_path)) {
> +			close(fd);
> +			errno = ENAMETOOLONG;
> +			return -1;
> +		}
> +	}
> +	memcpy(addr.sun_path, path, len);
>
>  	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr))) {
>  		int tmp = errno;
> @@ -61,7 +70,7 @@ int lxc_af_unix_open(const char *path, int type, int flags)
>  		errno = tmp;
>  		return -1;
>  	}
> -	
> +
>  	if (type == SOCK_STREAM && listen(fd, 100)) {
>  		int tmp = errno;
>  		close(fd);
> @@ -76,7 +85,7 @@ int lxc_af_unix_close(int fd)
>  {
>  	struct sockaddr_un addr;
>  	socklen_t addrlen = sizeof(addr);
> -	
> +
>  	if (!getsockname(fd, (struct sockaddr *)&addr, &addrlen) &&
>  	    addr.sun_path[0])
>  		unlink(addr.sun_path);
> -- 
> 1.8.1.4
Re: [lxc-devel] lxc-start from git hangs in init
Quoting Kevin Wilson (wkev...@gmail.com):
> Hello,
> I just want to add, following a different thread I read here, that:
>
> lxc-execute -n CN -f /etc/lxc/lxc.conf -- ps -ef
> seems ok, it gives:
> UID        PID  PPID  C STIME TTY          TIME CMD
> root         1     0  0 05:39 pts/1    00:00:00 /usr/local/libexec/lxc/lxc-init
> root         2     1  0 05:39 pts/1    00:00:00 ps -ef

Actually no - lxc-execute hanging is independent of your lxc-start
hanging on plymouthd.

Please tell us the exact host version, and the exact commands you used
to create and start the container. Start the container with

	sudo lxc-start -n CN -l info -o o1

and send us o1.
[lxc-devel] [PATCH] goto correct cleanup label to ensure fd is closed
Signed-off-by: Dwight Engen
---
 src/lxc/start.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/lxc/start.c b/src/lxc/start.c
index aefccd6..0a0cc40 100644
--- a/src/lxc/start.c
+++ b/src/lxc/start.c
@@ -434,10 +434,10 @@ struct lxc_handler *lxc_init(const char *name, struct lxc_conf *conf, const char
 		goto out_close_maincmd_fd;
 	}
 
-	/* Begin the set the state to STARTING*/
+	/* Begin by setting the state to STARTING */
 	if (lxc_set_state(name, handler, STARTING)) {
 		ERROR("failed to set state '%s'", lxc_state2str(STARTING));
-		goto out_free_name;
+		goto out_close_maincmd_fd;
 	}
 
 	/* Start of environment variable setup for hooks */
-- 
1.8.1.4
Re: [lxc-devel] [PATCH] goto correct cleanup label to ensure fd is closed
Quoting Dwight Engen (dwight.en...@oracle.com):
> Signed-off-by: Dwight Engen

Acked-by: Serge E. Hallyn

> ---
>  src/lxc/start.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/lxc/start.c b/src/lxc/start.c
> index aefccd6..0a0cc40 100644
> --- a/src/lxc/start.c
> +++ b/src/lxc/start.c
> @@ -434,10 +434,10 @@ struct lxc_handler *lxc_init(const char *name, struct lxc_conf *conf, const char
>  		goto out_close_maincmd_fd;
>  	}
>
> -	/* Begin the set the state to STARTING*/
> +	/* Begin by setting the state to STARTING */
>  	if (lxc_set_state(name, handler, STARTING)) {
>  		ERROR("failed to set state '%s'", lxc_state2str(STARTING));
> -		goto out_free_name;
> +		goto out_close_maincmd_fd;
>  	}
>
>  	/* Start of environment variable setup for hooks */
> -- 
> 1.8.1.4
Re: [lxc-devel] [RFC PATCH v2] allow multiple monitor clients
[...]
> > > So here is what I'm proposing: when lxc-monitor starts, it
> > > attempts to start lxc-monitord. lxc-monitord creates a fifo and a
> > > socket on lxcpath.
> >
> > Thanks, Dwight. Looks awesome. Some comments below, but I'm only
> > not adding an ack bc you say you want to make some changes first
> > anyway.
> >
> > When you send the next version I'll run it through my testsuite (and
> > presumably ack).

Serge, this one should be testable so feel free to hit it with your
testsuite if you want, but I'm still sending it as RFC due to the
questions raised below.

Caglar, I've tested this pretty thoroughly with your python parallel
start script doing 40 containers parallel 4 ways (and put into a loop
running for a few minutes) and some other parallel startup cases I made.
If you want to test it yourself too, that'd be great.

> > ...
> > > +static void lxc_monitord_delete(struct lxc_monitor *mon, const char *lxcpath)
> > > +{
> > > +	int i;
> > > +
> > > +	lxc_mainloop_del_handler(&mon->descr, mon->listenfd);
> > > +	close(mon->listenfd);
> >
> > The ordering here might need to change to ensure we don't get any
> > clients hanging between the two steps.
>
> Hmm, good point. Have to think about this, there might be a race the
> other way with still having the fd in the epoll set also.

I think the way I have it is the right order. It looks to me from the
epoll_ctl manpage that we'll get EBADF if fd isn't valid at the time of
EPOLL_CTL_DEL.

I have a couple of new questions too :( In the current code there is the
following flow (only when starting a container as daemon through the api):

lxcapi_start()
  wait_on_daemonized_start()
    lxcapi_wait()
      lxc_wait()
        lxc_monitor_open()

This is racy with multiple starters all trying to bind the socket address
and is the source of Caglar's original problem. It looks to me like it
is also racy with respect to the child container setting the state, but
there is a timeout safety on the wait side so we don't hang.
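The EBADF behaviour mentioned above is easy to check outside lxc. The sketch below is only an illustration under Linux: epoll_del_result() is a hypothetical helper, and a pipe read end stands in for the listen fd.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: register a pipe read end with epoll, then remove
 * it either before or after closing it.
 * Returns 0 if EPOLL_CTL_DEL succeeded, the errno it failed with, or -1
 * if the setup itself failed.
 */
int epoll_del_result(int close_first)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int pipefd[2], epfd, fd, ret = 0;

    if (pipe(pipefd) < 0)
        return -1;
    fd = pipefd[0];

    epfd = epoll_create1(0);
    if (epfd < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        ret = -1;
        close(fd);
        goto out;
    }

    if (close_first)
        close(fd);              /* fd number is no longer valid */

    if (epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL) < 0)
        ret = errno;            /* expected to be EBADF after close */

    if (!close_first)
        close(fd);              /* delete-then-close ordering */
out:
    close(pipefd[1]);
    close(epfd);
    return ret;
}
```

Calling it with close_first = 0 models the order used in lxc_monitord_delete() (delete the handler, then close); close_first = 1 models the reversed order, where the delete fails with EBADF.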
In the change I'm proposing, I put in a call to lxc_monitord_spawn() in
only this daemon case which should mean there is always a place to
deliver the status and no races between parent and child (because of
sync'ing on the pipe), but has the drawback of having monitord always
get started when the container is started daemonized. This is the only
flow I found that used lxc_wait internally.

Stephane, you might want to comment on this as it looks like you added
the wait_on_daemonized_start() stuff, I'd be happy if my analysis missed
something :) At least monitord will go away after startup, but it'd be
nice not to have to start it at all. Any thoughts?

The other question is: I do getsockopt SO_PEERCRED and check against the
effective uid which I believe is correct in case the caller is setuid.
What I'm wondering is why the routines in af_unix.c are passing/checking
against the real?

--

Signed-off-by: Dwight Engen
---
 .gitignore             |   1 +
 src/lxc/Makefile.am    |   2 +
 src/lxc/lxc_console.c  |   4 +-
 src/lxc/lxc_monitor.c  |   2 +
 src/lxc/lxc_monitord.c | 377 +
 src/lxc/lxccontainer.c |   6 +-
 src/lxc/mainloop.c     |   7 +-
 src/lxc/mainloop.h     |   7 +-
 src/lxc/monitor.c      | 192 ++---
 src/lxc/monitor.h      |  10 +-
 src/lxc/start.c        |   4 +-
 src/lxc/utils.h        |  26
 12 files changed, 574 insertions(+), 64 deletions(-)
 create mode 100644 src/lxc/lxc_monitord.c

diff --git a/.gitignore b/.gitignore
index 905a2dc..c614a75 100644
--- a/.gitignore
+++ b/.gitignore
@@ -52,6 +52,7 @@ src/lxc/lxc-info
 src/lxc/lxc-init
 src/lxc/lxc-kill
 src/lxc/lxc-monitor
+src/lxc/lxc-monitord
 src/lxc/lxc-netstat
 src/lxc/lxc-ps
 src/lxc/lxc-restart
diff --git a/src/lxc/Makefile.am b/src/lxc/Makefile.am
index ebeca466..1fa0fa8 100644
--- a/src/lxc/Makefile.am
+++ b/src/lxc/Makefile.am
@@ -150,6 +150,7 @@ bin_PROGRAMS = \
 	lxc-start \
 	lxc-execute \
 	lxc-monitor \
+	lxc-monitord \
 	lxc-wait \
 	lxc-console \
 	lxc-freeze \
@@ -181,6 +182,7 @@ lxc_freeze_SOURCES = lxc_freeze.c
 lxc_info_SOURCES = lxc_info.c
 lxc_init_SOURCES = lxc_init.c
 lxc_monitor_SOURCES = lxc_monitor.c
+lxc_monitord_SOURCES = lxc_monitord.c
 lxc_restart_SOURCES = lxc_restart.c
 lxc_start_SOURCES = lxc_start.c
 lxc_stop_SOURCES = lxc_stop.c
diff --git a/src/lxc/lxc_console.c b/src/lxc/lxc_console.c
index 643c442..f6659f6 100644
--- a/src/lxc/lxc_console.c
+++ b/src/lxc/lxc_console.c
@@ -241,7 +241,7 @@ Type to exit the console, \
 		goto out_mainloop_open;
 	}
 
-	err = lxc_mainloop(&descr);
+	err = lxc_mainloop(&descr, -1);
 	if (err) {
 		ERROR("mainloop returned an error");
 		goto out_mainloop_open;
@@ -255,7 +255,7 @@ out_mainloop_open:
 out:
 	/* Restore previous terminal parameter */
 	tcsetattr(0, TCSAFLUSH, &oldtios);
-	
+
 	/* Retu
Re: [lxc-devel] [PATCH 1/1] lxc-create: add zfs support
Quoting Stéphane Graber (stgra...@ubuntu.com):
> I'm not a big user of lxc-clone (yet) but I think as we redesign that
> part of the code, consistency across backend should be a primary goal
> even if that causes some slight changes in behaviour from previous
> implementations.

Ok it finally dawned on me that if we want consistency, then we can't do
it the new way. LVM-backed containers are the primary counterexample.
To have $lxc_path/$lxc_name be LVM-backed would require the LV to be
mounted at host boot.

So what I'm going to do is implement a few more backing stores and a
tiny temporary toy lxc-clone C program for easier testing, with zfs for
now switched to only its rootfs being a separate zfs unit.

We can't decide to switch all backing stores to the other way, but if we
decide to later support zfs and btrfs having $lxcpath/$lxcname be
separate units/subvolumes, it shouldn't be a big deal to special case
that and support both. But I'd rather not complicate what I'm doing
with that now.

-serge
Re: [lxc-devel] [PATCH 2/2] Support stopping containers concurrently
Hey Stéphane,

On Wed, Apr 17, 2013 at 6:06 PM, Serge Hallyn wrote:

> Quoting S.Çağlar Onur (cag...@10ur.org):
> > From: "S.Çağlar Onur"
> >
> > Trying to stop multiple containers concurrently ends up with "cgroup is
> > not mounted" errors as multiple threads corrupts the shared variables.
> > Fix that stack corruption and start to use getmntent_r to support
> > stopping multiple containers concurrently.
> >
> > Signed-off-by: S.Çağlar Onur
> > ---
> >  src/lxc/cgroup.c  | 152 +++--
> >  src/lxc/freezer.c |  18 +--
> >  src/lxc/state.c   |  15 --
> >  3 files changed, 126 insertions(+), 59 deletions(-)
> >
> > diff --git a/src/lxc/cgroup.c b/src/lxc/cgroup.c
> > index 368214f..0739477 100644
> > --- a/src/lxc/cgroup.c
> > +++ b/src/lxc/cgroup.c
> > @@ -54,6 +54,11 @@ lxc_log_define(lxc_cgroup, lxc);
> >
> >  #define MTAB "/proc/mounts"
> >
> > +/* In the case of a bind mount, there could be two long pathnames in the
> > + * mntent plus options so use large enough buffer size
> > + */
> > +#define LARGE_MAXPATHLEN 4 * MAXPATHLEN
> > +
> >  /* Check if a mount is a cgroup hierarchy for any subsystem.
> >   * Return the first subsystem found (or NULL if none).
> >   */
> > @@ -100,29 +105,31 @@ static char *mount_has_subsystem(const struct mntent *mntent)
> >   */
> >  static int get_cgroup_mount(const char *subsystem, char *mnt)
> >  {
> > -	struct mntent *mntent;
> > +	struct mntent *mntent, mntent_r;
> >  	FILE *file = NULL;
> >  	int ret, err = -1;
> >
> > +	char buf[LARGE_MAXPATHLEN] = {0};
>
> Ah yes, this must be what I thought we were waiting on - a response
> from Stéphane on this.
>
> I'm still worried about this stack usage, especially in something
> which is rather commonly called. Stéphane, is this a non-issue
> for arm?

Have you had a chance to look at that? Should I keep them like this or
start to allocate them via calloc?

> -serge
>

Cheers,
--
S.Çağlar Onur