Re: [lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

2012-09-14 Thread Jäkel, Guido
Dear Christian, Stéphane and Serge,

>If we want to have a back-channel, we'd need a socket, ...
> [...]
>On the other hand, it wouldn't be too complicated to have two special files 
>lying around: ...

I'm very pleased about the discussion and the efforts to implement such a
feature, because I asked for it some time ago. On the one hand, this FIFO
approach may be useful for more than the current task. On the other hand,
it seems to need a bunch of dependencies.


I wonder whether there might be another way. I'm not familiar with these
things, but I saw in the logs that the lxc-start process is notified of some
signals involving the container's init process. In addition, something
called lxc_utmp gets events via the inotify framework.

What about using a signal (e.g. SIGUSR1) that a simple kill could send to
the container's init process? Is it possible for lxc-start to hook this?

Or might there be a way based on watching for the creation or modification
of some file by a simple touch command, which could be called by an init
script in a final phase? In my Gentoo environment, a 'touch /var/run/foo'
inside a container causes an lxc (debug) log entry like

  lxc-start 1347607452.195 DEBUG    lxc_utmp - got inotify event 256 for foo



Greetings

Guido

--
Got visibility?
Most devs has no idea what their production app looks like.
Find out how fast your code is with AppDynamics Lite.
http://ad.doubleclick.net/clk;262219671;13503038;y?
http://info.appdynamics.com/FreeJavaPerformanceDownload.html
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

2012-09-14 Thread Daniel P. Berrange
On Thu, Sep 13, 2012 at 11:26:39PM +0200, Christian Seiler wrote:
> > I like the idea but haven't looked at the implementation yet as the
> > patch is really quite large. Quickly scanning through I briefly noticed
> > that the copyright headers for the new files are wrong (refer to IBM and
> > Daniel instead of Christian).
> 
> I just copy&pasted them from the other files, most header files I saw
> contained the same copyright. Just tell me what exactly to put there and
> then I'll do that for the next version of the patch.
> 
> > I'm also wondering if we shouldn't try to keep the "protocol" a bit more
> > generic to eventually allow the container to send/receive more than just
> > its status?
> 
> If we want to have a back-channel, we'd need a socket, which makes just
> doing echo RUNNING > /dev/lxc-notify impossible, you'd need a special
> program for that. Having the template scripts dump an additional script
> or upstart job or systemd unit file or whatever in the container when
> creating it seems a lot easier than having to use a special program.

FYI, the systemd team actually want to be able to expose a full socket
from the container to the host, so that the host systemd/systemctl cmd
can directly communicate with the container's systemd. So I don't think
that /dev/lxc-notify would be useful for systemd.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

2012-09-14 Thread Christian Seiler
>> If we want to have a back-channel, we'd need a socket, which makes just
>> doing echo RUNNING > /dev/lxc-notify impossible, you'd need a special
>> program for that. Having the template scripts dump an additional script
>> or upstart job or systemd unit file or whatever in the container when
>> creating it seems a lot easier than having to use a special program.
>
> FYI, the systemd team actually want to be able to expose a full socket
> from the container to the host, so that the host systemd/systemctl cmd
> can directly communicate with the container's systemd. So I don't think
> that /dev/lxc-notify would be useful for systemd.

First of all, you have to separate two things - I mentioned systemd
here in the sense that when the system reaches default.target,
/dev/lxc-notify should be pinged so that the lxc state now changes from
BOOTING to RUNNING. What you are talking about is a systemctl on the
outside of the container affecting the inside. I wanted to solve the
first problem, where /dev/lxc-notify is useful anyway, with or without
systemd.

The use case you are describing is a bit more complicated. You want to
expose a socket outside the container that is listened to by a program
inside the container. The problem here is that if you want to
bind-mount it before the pivot_root call, this will not work since
bind() for a socket will fail if the file already exists. But as soon
as you are already in the container, if systemd actually does listen to
a socket somewhere, you'll have a hard time bind-mounting it back to
the outside. How do you bridge the mount namespace? Obviously, if the
container's filesystem is mounted on the host anyway, you don't have a
problem, since you don't need to take care about the namespace; but
what if the lxc config specifies a block device that is then only
mounted inside the container's namespace?

That being said, if we actually implement /dev/lxc-notify (or whatever
one wants to call it, perhaps /run/lxc-host-interface?) as a socket
with an extensible protocol, it would also be possible to have a
command that tells lxc to open a socket on the host and pass the fd
back through the connection. Then systemd on the inside would be in
possession of a socket that listens on the outside and that an outside
systemctl could use. So my proposal, with the modifications suggested
by Stéphane, would actually also be able to solve your use case.
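
Passing a descriptor back through a UNIX socket connection is done with an
SCM_RIGHTS control message. The following is only a sketch of that
mechanism under the assumption that both ends already share a connected
AF_UNIX socket. send_fd() and recv_fd() are hypothetical helper names, not
functions from lxc or from the patch.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough for one int, aligned for struct cmsghdr. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send descriptor `fd` over the connected UNIX socket `sock`.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';                /* at least one data byte is needed */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg = { 0 };

    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`.
 * Returns the new fd, or -1 if none was received. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg = { 0 };

    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scenario described above, the host side would call send_fd() with a
listening socket it created, and the container side would call recv_fd() and
hand the result to its init system.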

However, first I'd like to have the basic version just for status
updates (because that is a useful feature anyway, independently of the
init system) in order to keep it simple - and once that is done, one
may think about how/whether to extend this to include other use cases
that are more specialized.




Re: [lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

2012-09-14 Thread Christian Seiler
> I'm very pleased about the discussion and efforts to implement such a
> feature because I already have asked for it in former times. In the
> one hand, this fifo approach may be used for more than the current
> task. But in the other hand, it's seems to need a bunch of
> dependencies.

You mean that you have to modify the container? Yes, sure, but the
modification is rather trivial - one should just ping the notification
FIFO/socket/whatever at boot (on systems with LSB init in a rc.local or
similar script for example) to notify lxc that the container is booted.

What you seem to want is a way for lxc to detect that automatically
without any intervention from the container. I don't think that's
possible in any kind of way that is not a complete and utter hack,
since only the container itself can have any concept of whether it's
done with booting or not. Of course, if you assume that you have
sysvinit and LSB scripts, you can check whether the command line of
init in /proc is now "init [3]" and that /etc/init.d/rc 3 (or however
that is called on your distro) has finished running inside the
container and the process doesn't exist anymore. But that doesn't take
into account upstart or systemd or any other kind of init system.

If you can guarantee a certain environment, you could probably hack
something together along the above lines, but I personally don't think
that it would be a good idea for the much more general lxc code to
include some hack like this.

Christian




Re: [lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

2012-09-14 Thread Daniel P. Berrange
On Fri, Sep 14, 2012 at 12:12:57PM +0100, Christian Seiler wrote:
> >>If we want to have a back-channel, we'd need a socket, which makes just
> >>doing echo RUNNING > /dev/lxc-notify impossible, you'd need a special
> >>program for that. Having the template scripts dump an additional script
> >>or upstart job or systemd unit file or whatever in the container when
> >>creating it seems a lot easier than having to use a special program.
> >
> >FYI, the systemd team actually want to be able to expose a full socket
> >from the container to the host, so that the host systemd/systemctl cmd
> >can directly communicate with the container's systemd. So I don't think
> >that /dev/lxc-notify would be useful for systemd.
> 
> First of all, you have to separate two things - I mentioned systemd
> here in the sense that when the system reaches default.target,
> /dev/lxc-notify should be pinged so that the lxc state now changes from
> BOOTING to RUNNING. What you are talking about is a systemctl on the
> outside of the container affecting the inside. I wanted to solve the
> first problem, where /dev/lxc-notify is useful anyway, with or without
> systemd.

Actually, what I was anticipating is that it would work both ways - systemd
uses D-Bus over its control socket, which would allow both RPC
calls from the host into the container, and signals emitted by the
container upon service status change to be received by the host.

> The use case you are describing is a bit more complicated. You want to
> expose a socket outside the container that is listened to by a program
> inside the container. The problem here is that if you want to
> bind-mount it before the pivot_root call, this will not work since
> bind() for a socket will fail if the file already exists. But as soon
> as you are already in the container, if systemd actually does listen to
> a socket somewhere, you'll have a hard time bind-mounting it back to
> the outside. How do you bridge the mount namespace? Obviously, if the
> container's filesystem is mounted on the host anyway, you don't have a
> problem, since you don't need to take care about the namespace; but
> what if the lxc config specifies a block device that is then only
> mounted inside the container's namespace?

I must admit the details aren't worked out, but the rough idea was
something like the following. On the host, have a directory per
container, in which the socket is set up

   /var/lib/systemd/container/

And bind '/var/lib/systemd/containerXXX' into the container in some
location, let's say '/var/lib/systemd/self/'. The idea is that if
systemd in the container now listens on /var/lib/systemd/self/systemd.sock,
a process on the host can connect via

  /var/lib/systemd/container/systemd.sock

I'm a little fuzzy on exactly how UNIX domain socket paths interact
wrt mount namespaces though, so this idea may not actually work in
practice - I have yet to try it.

Another option is actually to ignore the filesystem and have systemd
in the host simply pass in a pre-opened file descriptor when creating
the container, which systemd in the container can just inherit and
use.

> That being said, if we actually implement /dev/lxc-notify (or whatever
> one wants to call it, perhaps /run/lxc-host-interface?) as a socket
> with an extensible protocol, it would also be possible to have a
> command that tells lxc to open a socket on the host and pass the fd
> back through the connection. Then systemd on the inside would be in
> possession of a socket that listens on the outside and that an outside
> systemctl could use. So my proposal, with the modifications suggested
> by Stéphane, would actually also be able to solve your use case.

The socket passing idea you describe is an interesting idea.

> However, first I'd like to have the basic version just for status
> updates (because that is a useful feature anyway, independently of the
> init system) in order to keep it simple - and once that is done, one
> may think about how/whether to extend this to include other use cases
> that are more specialized.

Regards,
Daniel



Re: [lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

2012-09-14 Thread Christian Seiler
> I must admit the details aren't worked out, but the rough idea was
> something like the following. On the host have a directory per
> container, in which the socket is setup
>
>    /var/lib/systemd/container/
>
> And bind '/var/lib/systemd/containerXXX' into the container in some
> location, lets say '/var/lib/systemd/self/'. The idea is that if
> systemd in the container now listens on /var/lib/systemd/self/systemd.sock
> that a process in the host can connect via
>
>   /var/lib/systemd/container/systemd.sock

This you can already do in current lxc - just add an entry in the form

lxc.mount.entry = /var/lib/systemd/containerXXX var/lib/systemd/self none bind 0 0

to the lxc config file of your container. There's no need to change any
code for that. (You have to make sure both directories exist, however.)

OTOH, for the status updates I'm proposing, it's more LXC itself having
some form of indication as to whether the container is currently really
running, just booting or in the process of shutting down - that makes
lxc-info much more useful.

> I'm a little fuzzy on exactly how UNIX domain socket paths interact
> wrt mount namespaces

As long as you can see the socket, you can connect to it. If you
bind-mount a directory, any socket you create inside the container will
also appear on the host. What you can't do is just bind-mount a socket
itself, since it already has to exist, which means that you can't bind
to it and listen after that.
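
Both points can be checked with a few lines of C: a client can connect
through any path under which the socket file is visible, while bind() on a
path that already exists fails with EADDRINUSE. This is a sketch with
hypothetical helper names (listen_unix, connect_unix), not code from lxc.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill `addr` for a filesystem socket path. Returns 0, or -1 if too long. */
static int fill_addr(struct sockaddr_un *addr, const char *path)
{
    if (strlen(path) >= sizeof(addr->sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Create a listening UNIX stream socket at `path`.
 * Returns the fd, or -1 with errno set (EADDRINUSE if `path` exists). */
int listen_unix(const char *path)
{
    struct sockaddr_un addr;
    if (fill_addr(&addr, path) < 0)
        return -1;

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 8) < 0) {
        int saved = errno;      /* keep the real error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Connect to the UNIX stream socket visible at `path`.
 * Returns the fd, or -1 with errno set. */
int connect_unix(const char *path)
{
    struct sockaddr_un addr;
    if (fill_addr(&addr, path) < 0)
        return -1;

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

With the bind-mounted directory from above, the container would call
listen_unix() on its path and the host would call connect_unix() on the
corresponding host path.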

The only tricky thing is UNIX domain sockets in the abstract namespace,
i.e. the ones starting with a 0-byte in their name: They are tied to the
network namespace, so you can *never* see an abstract UNIX socket from
another namespace (unless you manage to pass around the fd in some way).
But for sockets which are tied to a real object in the filesystem, this
restriction doesn't apply.
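
For reference, this is roughly how such an abstract address is built on
Linux. The helper name abstract_addr() is made up for this sketch. The
important details are the leading 0-byte and the explicit address length,
since the name is not NUL-terminated.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill `addr` with an abstract-namespace address for `name`.
 * Returns the length to pass to bind()/connect(), or 0 if `name` is
 * empty or does not fit. */
socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t n = strlen(name);
    if (n == 0 || n > sizeof(addr->sun_path) - 1)
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0': the leading 0-byte selects the abstract
     * namespace, so no file is created anywhere. */
    memcpy(addr->sun_path + 1, name, n);

    /* Only the bytes actually used count as part of the name. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```

The result is used as bind(fd, (struct sockaddr *)&addr, len). A process in
a different network namespace that builds the same address will simply not
find the socket.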

By the way, as a side-note for your idea for systemctl working from the
outside: If you really want to isolate your container from the host,
then you have to make sure that it can't DoS the host by filling up
/var. Filling it up is not possible if you just bind-mount a socket/FIFO,
but that doesn't work for your use-case, so you probably would want to
mount a tmpfs with a *very* small quota to /var/lib/systemd/containerXXX
(in the pre-start lxc hook for example) and then bind-mount that instead
of part of a real file system that may be filled up.
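
A hook would normally be a shell script calling mount(8), but to stay with
C, the same mount can be sketched with mount(2). The helper names and the
concrete limits are assumptions chosen for illustration. The call itself
needs root (CAP_SYS_ADMIN).

```c
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Build the tmpfs option string, e.g. "size=64k,nr_inodes=16,mode=0755".
 * Returns 0 on success, -1 if `buf` is too small. */
int tmpfs_options(char *buf, size_t len, unsigned size_kib,
                  unsigned max_inodes)
{
    int n = snprintf(buf, len, "size=%uk,nr_inodes=%u,mode=0755",
                     size_kib, max_inodes);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Mount a tiny tmpfs on `target` so that a container which can write
 * there cannot fill up the host's /var. Returns 0, or -1 on error. */
int mount_small_tmpfs(const char *target, unsigned size_kib,
                      unsigned max_inodes)
{
    char opts[96];
    if (tmpfs_options(opts, sizeof(opts), size_kib, max_inodes) < 0)
        return -1;
    return mount("tmpfs", target, "tmpfs",
                 MS_NOSUID | MS_NODEV | MS_NOEXEC, opts);
}
```

Limiting nr_inodes as well as size keeps the container from exhausting the
tmpfs with many empty files.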

Regards,
Christian




Re: [lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

2012-09-14 Thread Serge Hallyn
Quoting Christian Seiler (christ...@iwakd.de):
> This patch adds a simple notification system that allows the container to
> notify the host (in particular, the lxc-start process) that the boot process
> has been completed successfully. It also adds an additional status BOOTING
> that lxc-info may return. This allows the administrator and scripts to
> distinguish between a fully-running container and a container that is still
> in the process of booting.
> 
> If nothing is added to the configuration file, the current behavior is not
> changed, i.e. after lxc-start finishes the initialization, the container is
> immediately put into the RUNNING state. This ensures backwards
> compatibility.
> 
> If lxc.notification.type is set to 'fifo', after lxc-start initialization
> the container is initially put into the state BOOTING. Also, the FIFO
> /var/lib/lxc/%s/notification-fifo is created and bind-mounted into the
> container, by default to /dev/lxc-notify, but this can be changed via the
> lxc.notification.path configuration setting.
> 
> Inside the container one may execute 'echo RUNNING > /dev/lxc-notify' or an
> equivalent command to notify lxc-start that the container has now booted.
> Similarly, 'echo STOPPING > /dev/lxc-notify' will change the status to
> STOPPING, which may be done on shutdown. Currently, only RUNNING and
> STOPPING are allowed, other states are ignored.
> 
> This patch only provides the LXC part for the notification system, the
> counterpart inside the container has to be provided separately. The
> interface has been kept extremely simple to facilitate this.
> 
> The choice of the option lxc.notification.type, as opposed to
> lxc.notification.enabled, is deliberate in order to make this extensible. If
> at some point there is some kind of standardized system for these types of
> notifications, it will be simple to just add a new value for the
> lxc.notification.type option.
> 
> Signed-off-by: Christian Seiler 
> Cc: Serge Hallyn 

The longish thread makes me think we should accept this patch, then see
whether and how to easily extend it.
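
The state handling described in the patch summary above (only RUNNING and
STOPPING are accepted, anything else is ignored) can be pictured as follows.
This is an illustrative sketch with made-up names, not the code from the
patch's notification.c.

```c
#include <string.h>

/* Hypothetical result type for one line read from the FIFO. */
enum notify_state {
    NOTIFY_IGNORED = 0,   /* unknown or malformed message */
    NOTIFY_RUNNING,
    NOTIFY_STOPPING
};

/* Map one line written by the container (e.g. "RUNNING\n") to a state.
 * A trailing newline or CR/LF, as produced by 'echo', is tolerated. */
enum notify_state parse_notification(const char *line)
{
    size_t n = strcspn(line, "\r\n");   /* length without line ending */

    if (n == 7 && strncmp(line, "RUNNING", 7) == 0)
        return NOTIFY_RUNNING;
    if (n == 8 && strncmp(line, "STOPPING", 8) == 0)
        return NOTIFY_STOPPING;
    return NOTIFY_IGNORED;
}
```

Extending the protocol later would then mostly be a matter of adding more
cases here, which is one reason to keep the message format line-based.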

A few comments below:

> Cc: Guido Jäkel 
> ---
>  src/lxc/Makefile.am    |    1 +
>  src/lxc/conf.c         |    8 +
>  src/lxc/conf.h         |    3 +
>  src/lxc/confile.c      |   34 +
>  src/lxc/notification.c |  349
>  src/lxc/notification.h |   50 +++
>  src/lxc/start.c        |   22 +++-
>  src/lxc/start.h        |    1 +
>  src/lxc/state.c        |    1 +
>  src/lxc/state.h        |    3 +-
>  10 files changed, 468 insertions(+), 4 deletions(-)
>  create mode 100644 src/lxc/notification.c
>  create mode 100644 src/lxc/notification.h
> 
> diff --git a/src/lxc/Makefile.am b/src/lxc/Makefile.am
> index 7d86ad6..d976bf7 100644
> --- a/src/lxc/Makefile.am
> +++ b/src/lxc/Makefile.am
> @@ -32,6 +32,7 @@ liblxc_so_SOURCES = \
>   freezer.c \
>   checkpoint.c \
>   restart.c \
> + notification.h notification.c \
>   error.h error.c \
>   parse.c parse.h \
>   cgroup.c cgroup.h \
> diff --git a/src/lxc/conf.c b/src/lxc/conf.c
> index 1450ca6..422b742 100644
> --- a/src/lxc/conf.c
> +++ b/src/lxc/conf.c
> @@ -61,6 +61,7 @@
>  #include "log.h"
>  #include "lxc.h" /* for lxc_cgroup_set() */
>  #include "caps.h"   /* for lxc_caps_last_cap() */
> +#include "notification.h"
>  
>  #if HAVE_APPARMOR
>  #include 
> @@ -2253,6 +2254,11 @@ int lxc_setup(const char *name, struct lxc_conf *lxc_conf)
>   return -1;
>   }
>  
> + if (lxc_notification_mount_hook(name, lxc_conf)) {
> + ERROR("failed to init notification mechanism for container '%s'.", name);
> + return -1;
> + }
> +
>   if (setup_cgroup(name, &lxc_conf->cgroup)) {
>   ERROR("failed to setup the cgroups for '%s'", name);
>   return -1;
> @@ -2540,6 +2546,8 @@ void lxc_conf_free(struct lxc_conf *conf)
>   if (conf->aa_profile)
>   free(conf->aa_profile);
>  #endif
> + if (conf->notification_path)
> + free(conf->notification_path);
>   lxc_clear_config_caps(conf);
>   lxc_clear_cgroups(conf, "lxc.cgroup");
>   lxc_clear_hooks(conf);
> diff --git a/src/lxc/conf.h b/src/lxc/conf.h
> index dcf79fe..5ed67ec 100644
> --- a/src/lxc/conf.h
> +++ b/src/lxc/conf.h
> @@ -31,6 +31,7 @@
>  #include 
>  
>  #include  /* for lxc_handler */
> +#include  /* for notification types */
>  
>  enum {
>   LXC_NET_EMPTY,
> @@ -237,6 +238,8 @@ struct lxc_conf {
>  #endif
>   char *seccomp;  // filename with the seccomp rules
>   int maincmd_fd;
> + lxc_notification_type_t notification_type;
> + char *notification_path;
>  };
>  
>  int run_lxc_hooks(const char *name, char *hook, struct lxc_conf *conf);
> diff --git a/src/lxc/confile.c b/src/lxc/confile.c
> index 2d14e0f..f48b8c0 100644
> --- a/src/lxc/confile.c
> +++ b/src/lxc/confile.c
> @@ -53,6 +53,8 @@ static int config_ttydir(const c