On Wed, Feb 01, 2017 at 08:25:10PM +0000, Marc-André Lureau wrote:
> Hi
>
> On Wed, Feb 1, 2017 at 8:26 PM Stefan Hajnoczi <stefa...@gmail.com> wrote:
>
> > On Mon, Jan 30, 2017 at 01:18:16PM -0500, Marc-André Lureau wrote:
> > > Hi
> > >
> > > ----- Original Message -----
> > > > On Tue, Jan 24, 2017 at 01:43:17PM -0500, Marc-André Lureau wrote:
> > > > > Hi
> > > > >
> > > > > ----- Original Message -----
> > > > > > On Mon, Jan 23, 2017 at 06:27:29AM -0500, Marc-André Lureau wrote:
> > > > > > > ----- Original Message -----
> > > > > > > > On Wed, Jan 18, 2017 at 08:03:07PM +0400, Marc-André Lureau wrote:
> > > > > > > > > Hi,
> > > > > > > >
> > > > > > > > CCing Jeff Cody and John Snow, who have been working on generalizing
> > > > > > > > Block Job APIs to generic background jobs. There is some overlap
> > > > > > > > between async commands and background jobs.
> > > > > > >
> > > > > > > If you say so :) Did I miss a proposal or a discussion for async qmp
> > > > > > > commands?
> > > > > >
> > > > > > There is no recent mailing list thread, so it's probably best to discuss
> > > > > > here:
> > > > > >
> > > > > > The goal of jobs is to support long-running operations that can be
> > > > > > managed via QMP. Jobs can have a more elaborate lifecycle than just
> > > > > > start -> finish/cancel (e.g. they can be paused/resumed and may have
> > > > > > multiple phases of execution that the client controls). There are QMP
> > > > > > APIs to query their state (Are they running? How much "progress" has
> > > > > > been made?).
> > > > >
> > > > > Indeed, I mention that in my cover. Such use cases require something more
> > > > > complete than simple async qmp commands. I don't see why it would be
> > > > > incompatible with the usage of async qmp commands.
> > > > >
> > > > > > A client reconnecting to QEMU can query running jobs. This way a client
> > > > > > can resume with a running QEMU process.
> > > > > > For commands like saving a
> > > > > > screenshot it mostly does not matter, but for commands that modify state
> > > > > > it's critical that clients are aware of running commands after reconnect
> > > > > > to prevent corruption/interference. This behavior is what I asked about
> > > > > > in my previous mail.
> > > > >
> > > > > That's what I mention in the cover, some commands are global (and
> > > > > broadcasted events are appropriate) and some are local to the client
> > > > > context. Some could be discarded when the client disconnects etc. It's a
> > > > > case by case.
> > > > >
> > > > > > Jobs are currently only used by the block layer and called "block jobs",
> > > > > > but the idea is to generalize this. They use synchronous QMP + events.
> > > > >
> > > > > That pattern will have the flaws I mentioned (empty return, broadcast
> > > > > events, id conflict, qapi semantic & documentation etc). Something new can
> > > > > be invented, but it will likely make the protocol more complicated
> > > > > compared to the solution I proposed (which is optional btw, and gracefully
> > > > > falls back to sync processing for clients that do not support the async qmp
> > > > > capability). However, I believe the job interface could be built on top of
> > > > > what I propose.
> > > > >
> > > > > > Jobs are more heavy-weight than async QMP commands, but pause/resume,
> > > > > > rate-limiting, progress reporting, robust reconnect, etc are important
> > > > > > features. Users want to be aware of long-running operations and have
> > > > > > the ability to control them.
> > > > >
> > > > > You can't generalize such job interface to all async commands. Some may not
> > > > > implement the ability to report progress, to cancel, to pause etc, etc. In
> > > > > the end, it will be complicated and unneeded in many cases (what's the use
> > > > > case to pause or to get the progress of a screendump?).
> > > > > What I propose is
> > > > > simpler and compatible with job/task interfaces appropriate for various
> > > > > domains.
> > > > >
> > > > > > I suspect that if we transition synchronous QMP commands to async we'll
> > > > > > soon have requirements for progress reporting, pause/resume, etc. So is
> > > > > > there a set of commands that should be async and others that should be
> > > > > > jobs or should everything just be a job?
> > > > >
> > > > > Hard to say without a concrete proposal of what "job" is. Likely,
> > > > > everything is not going to be a "job".
> > > > >
> > > > > But hopefully qmp-async and jobs can co-exist and benefit from each other.
> > > >
> > > > My concern with this series is that background operations must be
> > > > observable and there must be a way to cancel them. Otherwise management
> > > > tools cannot do their job and it's hard to troubleshoot a misbehaving
> > > > system because you can't answer the question "what's going on?". Once
> > > > you add that then a large chunk of block jobs is duplicated.
> > >
> > > Tracking ongoing operations can also be done at the management layer. If
> > > needed, we could add qmp-commands to list on-going commands (their ids
> > > etc), and add commands to cancel them. But then again, not all operations
> > > will be cancellable, and I am not sure having requirements to list or
> > > cancel or modify all on-going operations is needed (I would say no, just
> > > like today you can't do anything while a command is running)
> >
> > It cannot be done robustly by the client. If the client crashes then
> > there's no way of knowing what pending commands are running. Requiring
> > the client to keep a journal would force every client that wants to be
> > robust and easy to troubleshoot to duplicate this and IMO isn't a
> > solution.
> >
>
> My proposal allows for commands to be cancelled when the client is gone.
> And we can quite easily provide a qmp command to list on-going commands, I
> can add a patch for that.
>
> There is no per-client context as of today, so recovering from an on-going job
> would conflict with other clients (there is no per client id or job-id
> namespace either). I don't know if there is a way to enforce only a single
> qmp client today, I would have to check.
> > QEMU knows which commands are in-flight, it should be able to report
> > this info. It's important for troubleshooting.
> >
> > I agree that it's not important today since only one command runs at a
> > time (except block jobs and migration, which do have commands to query
> > their status). But the nature of async commands means that they can run
> > in the background for a long time, so it will be necessary.
> >
>
> If needed, it can be added with this proposal. I will add a
> proof-of-concept patch in the next iteration.

Great.

Stefan