On Sun, 19 Jun 2011 14:00:30 -0500 Michael Roth <mdr...@linux.vnet.ibm.com> wrote:
> On 06/17/2011 10:25 PM, Luiz Capitulino wrote: > > On Fri, 17 Jun 2011 16:25:32 -0500 > > Michael Roth<mdr...@linux.vnet.ibm.com> wrote: > > > >> On 06/17/2011 03:13 PM, Luiz Capitulino wrote: > >>> On Fri, 17 Jun 2011 14:21:31 -0500 > >>> Michael Roth<mdr...@linux.vnet.ibm.com> wrote: > >>> > >>>> On 06/16/2011 01:42 PM, Luiz Capitulino wrote: > >>>>> On Tue, 14 Jun 2011 15:06:22 -0500 > >>>>> Michael Roth<mdr...@linux.vnet.ibm.com> wrote: > >>>>> > >>>>>> This is the actual guest daemon, it listens for requests over a > >>>>>> virtio-serial/isa-serial/unix socket channel and routes them through > >>>>>> to dispatch routines, and writes the results back to the channel in > >>>>>> a manner similar to QMP. > >>>>>> > >>>>>> A shorthand invocation: > >>>>>> > >>>>>> qemu-ga -d > >>>>>> > >>>>>> Is equivalent to: > >>>>>> > >>>>>> qemu-ga -c virtio-serial -p > >>>>>> /dev/virtio-ports/org.qemu.guest_agent \ > >>>>>> -p /var/run/qemu-guest-agent.pid -d > >>>>>> > >>>>>> Signed-off-by: Michael Roth<mdr...@linux.vnet.ibm.com> > >>>>> > >>>>> Would be nice to have a more complete description, like explaining how > >>>>> to > >>>>> do a simple test. > >>>>> > >>>>> And this can't be built... > >>>>> > >>>>>> --- > >>>>>> qemu-ga.c | 631 > >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++ > >>>>>> qga/guest-agent-core.h | 4 + > >>>>>> 2 files changed, 635 insertions(+), 0 deletions(-) > >>>>>> create mode 100644 qemu-ga.c > >>>>>> > >>>>>> diff --git a/qemu-ga.c b/qemu-ga.c > >>>>>> new file mode 100644 > >>>>>> index 0000000..df08d8c > >>>>>> --- /dev/null > >>>>>> +++ b/qemu-ga.c > >>>>>> @@ -0,0 +1,631 @@ > >>>>>> +/* > >>>>>> + * QEMU Guest Agent > >>>>>> + * > >>>>>> + * Copyright IBM Corp. 2011 > >>>>>> + * > >>>>>> + * Authors: > >>>>>> + * Adam Litke<agli...@linux.vnet.ibm.com> > >>>>>> + * Michael Roth<mdr...@linux.vnet.ibm.com> > >>>>>> + * > >>>>>> + * This work is licensed under the terms of the GNU GPL, version 2 or > >>>>>> later. > >>>>>> + * See the COPYING file in the top-level directory. > >>>>>> + */ > >>>>>> +#include<stdlib.h> > >>>>>> +#include<stdio.h> > >>>>>> +#include<stdbool.h> > >>>>>> +#include<glib.h> > >>>>>> +#include<gio/gio.h> > >>>>>> +#include<getopt.h> > >>>>>> +#include<termios.h> > >>>>>> +#include<syslog.h> > >>>>>> +#include "qemu_socket.h" > >>>>>> +#include "json-streamer.h" > >>>>>> +#include "json-parser.h" > >>>>>> +#include "qint.h" > >>>>>> +#include "qjson.h" > >>>>>> +#include "qga/guest-agent-core.h" > >>>>>> +#include "qga-qmp-commands.h" > >>>>>> +#include "module.h" > >>>>>> + > >>>>>> +#define QGA_VIRTIO_PATH_DEFAULT > >>>>>> "/dev/virtio-ports/org.qemu.guest_agent" > >>>>>> +#define QGA_PIDFILE_DEFAULT "/var/run/qemu-va.pid" > >>>>>> +#define QGA_BAUDRATE_DEFAULT B38400 /* for isa-serial channels */ > >>>>>> +#define QGA_TIMEOUT_DEFAULT 30*1000 /* ms */ > >>>>>> + > >>>>>> +struct GAState { > >>>>>> + const char *proxy_path; > >>>>> > >>>>> Where is this used? > >>>>> > >>>> > >>>> Nowhere actually. Will remove. > >>>> > >>>>>> + JSONMessageParser parser; > >>>>>> + GMainLoop *main_loop; > >>>>>> + guint conn_id; > >>>>>> + GSocket *conn_sock; > >>>>>> + GIOChannel *conn_channel; > >>>>>> + guint listen_id; > >>>>>> + GSocket *listen_sock; > >>>>>> + GIOChannel *listen_channel; > >>>>>> + const char *path; > >>>>>> + const char *method; > >>>>>> + bool virtio; /* fastpath to check for virtio to deal with poll() > >>>>>> quirks */ > >>>>>> + GACommandState *command_state; > >>>>>> + GLogLevelFlags log_level; > >>>>>> + FILE *log_file; > >>>>>> + bool logging_enabled; > >>>>>> +}; > >>>>>> + > >>>>>> +static void usage(const char *cmd) > >>>>>> +{ > >>>>>> + printf( > >>>>>> +"Usage: %s -c<channel_opts>\n" > >>>>>> +"QEMU Guest Agent %s\n" > >>>>>> +"\n" > >>>>>> +" -c, --channel channel method: one of unix-connect, > >>>>>> virtio-serial, or\n" > >>>>>> +" isa-serial (virtio-serial is the default)\n" > >>>>>> +" -p, --path channel path (%s is the default for > >>>>>> virtio-serial)\n" > >>>>>> +" -l, --logfile set logfile path, logs to stderr by default\n" > >>>>>> +" -f, --pidfile specify pidfile (default is %s)\n" > >>>>>> +" -v, --verbose log extra debugging information\n" > >>>>>> +" -V, --version print version information and exit\n" > >>>>>> +" -d, --daemonize become a daemon\n" > >>>>>> +" -h, --help display this help and exit\n" > >>>>>> +"\n" > >>>>>> +"Report bugs to<mdr...@linux.vnet.ibm.com>\n" > >>>>>> + , cmd, QGA_VERSION, QGA_VIRTIO_PATH_DEFAULT, QGA_PIDFILE_DEFAULT); > >>>>>> +} > >>>>>> + > >>>>>> +static void conn_channel_close(GAState *s); > >>>>>> + > >>>>>> +static const char *ga_log_level_str(GLogLevelFlags level) > >>>>>> +{ > >>>>>> + switch (level& G_LOG_LEVEL_MASK) { > >>>>>> + case G_LOG_LEVEL_ERROR: > >>>>>> + return "error"; > >>>>>> + case G_LOG_LEVEL_CRITICAL: > >>>>>> + return "critical"; > >>>>>> + case G_LOG_LEVEL_WARNING: > >>>>>> + return "warning"; > >>>>>> + case G_LOG_LEVEL_MESSAGE: > >>>>>> + return "message"; > >>>>>> + case G_LOG_LEVEL_INFO: > >>>>>> + return "info"; > >>>>>> + case G_LOG_LEVEL_DEBUG: > >>>>>> + return "debug"; > >>>>>> + default: > >>>>>> + return "user"; > >>>>>> + } > >>>>>> +} > >>>>>> + > >>>>>> +bool ga_logging_enabled(GAState *s) > >>>>>> +{ > >>>>>> + return s->logging_enabled; > >>>>>> +} > >>>>>> + > >>>>>> +void ga_disable_logging(GAState *s) > >>>>>> +{ > >>>>>> + s->logging_enabled = false; > >>>>>> +} > >>>>>> + > >>>>>> +void ga_enable_logging(GAState *s) > >>>>>> +{ > >>>>>> + s->logging_enabled = true; > >>>>>> +} > >>>>> > >>>>> Just to check I got this right, this is needed because of the fsfreeze > >>>>> command, correct? Isn't it better to have a more descriptive name, like > >>>>> fsfrozen? > >>>>> > >>>>> First I thought this was about a log file. Then I realized this was > >>>>> probably about letting the user log in, but we don't seem to have this > >>>>> functionality so I got confused. > >>>>> > >>>> > >>>> Yup, this is currently due to fsfreeze support. From the perspective of > >>>> the fsfreeze command the explicit "is_frozen" check makes more sense, > >>>> but the reason it affects other RPCs is because because we can't log > >>>> stuff in that state. If an RPC shoots itself in the foot by writing to a > >>>> frozen filesystem we don't really care so much, and up until recently > >>>> that case was handled with a pthread_cancel timeout mechanism (was > >>>> removed for the time being, will re-implement using a child process most > >>>> likely). > >>>> > >>>> What we don't want to do is give a host a way to bypass the expectation > >>>> we set for guest owners by allowing RPCs to be logged. So that's what > >>>> the check is based on, rather than lower level details like *why* > >>>> logging is disabled. Also, I'd really like to avoid a precedence where a > >>>> single RPC can place restrictions on all the others, so the logging > >>>> aspect seemed general enough that it doesn't necessarily provide that > >>>> precedence. > >>>> > >>>> This has come up a few times without any real consensus. I can probably > >>>> go either way, but I think the check for logging is easier to set > >>>> expectations with: "if logging is important from an auditing > >>>> perspective, don't execute this if logging is disabled". beyond that, > >>>> same behavior/symantics you'd get with anything that causes a write() on > >>>> a filesystem in a frozen state: block. > >>> > >>> While I understand your arguments, I still find the current behavior > >>> confusing: you issue a guest-open-file command and get a QgaLoggingFailed > >>> error. The first question is: what does a log write fail have to do with > >>> me > >>> opening a file? The second question is: what should I do? Try again? Give > >>> up? > >>> > >> > >> I agree it's a better solution for the client there, but at the same time: > >> > >> guest_privileged_ping(): > >> if fsfreeze.status == FROZEN: > >> syslog("privileged ping, thought you should know") > >> else: > >> return "error, filesystem frozen" > >> > >> Seems silly to the client as well. "Why does a ping command care about > >> the filesystem state?" > >> > >> Inability to log is the true error. That's also the case for the > >> guest-file-open command. > > > > There's a difference. In the guest ping the inability to log is an > > internal agent error, it's not interesting to the client. We could just > > ignore the error and reply back (unless we define that guest-privileged-ping > > requires writing to disk). > > > > To clarify, these's 2 different types of errors here and their relation > to fsfreeze is causing the separation to get munged a bit: > > 1) Logging/accounting errors: even though we try to make it clear > wherever possible this solution is based on a "trusted" hypervisor that > already has substantial privileges over guests (direct access to images, > etc), one of our internal use cases is the ability for customers to > properly account for actions initiated by the guest agent. Shutdowns, > whats files were read/written, etc. This is fine. > For some commands, this accounting is not important. guest-ping, or > guest-info, for instance, is uninteresting from a security accounting > perspective. So logging is in fact optional and logging failures are > silently ignored. > > guest-shutdown, guest-file-open, guest-fsfreeze, however, are important. > So the QgaLoggingError we currently report really means "this command > requires logging, but logging has been disabled". You're right that this > is because of fsfreeze, but it's not because of the filesystem state. > guest-file-open on a network mount that wasn't frozen would *still*, by > design, report an error due to inability to log. Completely ignoring the file-system state and considering only what you say about security, this doesn't make sense to me. Actually, I think it's a flaw, because a client could open the daemon's log file and write garbage on it. But, how does the ability to log protects you from anything? I would understand it's importance if the user had to provide credentials to use the agent, but s/he doesn't. It's like you were unable to use a linux box because syslogd is not running. This looks like a misfeature to me. If you really want to provide it, then you could provide a --enable-log-error which, when enabled, would make the daemon not to execute *any* command if the it's able to log its actions. Otherwise log errors are just ignored. > For this case I do see your point about the error not being very helpful > to users, which is why I think something like: > > "Guest agent failed to log RPC due to frozen filesystems" > > Is the best way to report it. If we need to fix up the error id to > something like QgaLoggingToFrozenFilesystem I'd be fine with that. > > 2) EWOULDBLOCK or I/O errors due to writing to frozen filesystems: this > is currently treated as a PEBKAC and not enforced/reported at all. I > wouldn't be against disabling guest-file-write as a side-effect of > freezing, but that's not totally correct. Writes to non-frozen > filesystems like networked or sysfs mounts is technically still okay, > for instance. We store path information in guest-file-open and use it to > scan against fsfreeze's state info about frozen mounts to handle this a > little better. Yes, I agree with you here. > > Even then you still have RPCs like guest-shutdown that may do a syslog() > that would cause the command to freeze, or in the future we may add the > ability for the guest agent to execute user/distro-installed scripts > that may/may not need to write to the filesystem. Some these might even > be intended to run when the filesystem is frozen...cleanup scripts for > databases and whatnot for snapshotting is something I've seen come up. > > We can > > a) be very restrictive, which I'm not totally against...my initial > inclination was to disable everything except > guest-fsfreeze-thaw/guest-fsfreeze-status while frozen, but this has > some limitations that aren't as obscure/unlikely as one would hope. > > b) we can be completely unrestrictive about it and give users the same > experiences they'd get at a command-line if they, or another user on the > system, was messing with fsfreeze. > > c) try to capture some common cases, but make it clear that you can > still shoot yourself in the foot using the guest agent on a frozen > filesystem, evaluating when to enforce this in code on a case-by-case basis. > > Personally I like b), mostly because it's the least work, but also > because it's actually the easiest way to set expectations about fsfreeze. > > > The guest-file-write command, on the other hand, clearly requires to write > > to disk, so a client would expect a EWOULDBLOCK error. > > > > EWOULDBLOCK looks good to me. We could define as general rule that > > commands don't block, so clients should always expect a EWOULDBLOCK. > > > > This falls under 2) > > >> There's nothing wrong with opening a read-only > >> file on a frozen filesystem, it's the fact that we can't log the open > >> that causes us to bail out.. > > > > But why do we bail out? Why can't we just ignore the fact we can't log? > > > > This falls under 1) > > >> So, what if we just munge the 2 to give the user the proper clues to fix > >> things, and instead return an error like: > >> > >> "Guest agent failed to log RPC (is the filesystem frozen?)"? > > > > The 'desc' part of an error is a human error descriptor, clients shouldn't > > parse it. The real error is the error class. > > > > As mentioned above, we could change the error id itself to capture both > the immediate error and the underlying error. >