On Tue, Sep 04, 2018 at 10:04:00AM +0200, Markus Armbruster wrote:
> Peter Xu <pet...@redhat.com> writes:
> 
> > On Mon, Sep 03, 2018 at 03:41:16PM +0100, Daniel P. Berrangé wrote:
> >> On Mon, Sep 03, 2018 at 09:30:52AM -0500, Eric Blake wrote:
> >> > On 09/03/2018 08:31 AM, Markus Armbruster wrote:
> >> >
> >> > > Example:
> >> > >
> >> > > client sends in-band command #1
> >> > > QEMU reads and queues
> >> > > QEMU dequeues in-band command #1
> >> > > in-band command #1 starts executing, but it's slooow
> >> > > client sends in-band command #2
> >> > > QEMU reads and queues
> >> > > ...
> >> > > client sends in-band command #8
> >> > > QEMU reads, queues and suspends the monitor
> >> > > client sends out-of-band command
> >> > > --> time passes...
> >> > > in-band command #1 completes, QEMU sends reply
> >> > > QEMU dequeues in-band command #2, resumes the monitor
> >> > > in-band command #2 starts executing
> >> > > QEMU reads and executes out-of-band command
> >> > > out-of-band command completes, QEMU sends reply
> >> > > in-band command #2 completes, QEMU sends reply
> >> > > ... same for remaining in-band commands ...
> >> > >
> >> > > The out-of-band command gets stuck behind the in-band commands.
> >
> > (It's a shame of me to have just noticed that the out-of-band command
> > will be stuck after we dropped the COMMAND_DROP event... so now I
> > agree it's not that ideal any more to drop the event but maybe still
> > preferable)
> 
> We can queue without limit, we can drop commands, or we can suspend
> reading.  Each of these has drawbacks:
> 
> * Queuing without limit is simple for the client, but unsafe for QEMU.
> 
> * Dropping commands requires the client to cope with dropped commands.
>   As currently designed, it's just as unsafe for QEMU: instead of
>   queuing commands without limit, we get to queue their COMMAND_DROPPED
>   events without limit.  A better design could avoid this flaw.
> 
> * Suspending reading requires the client to manage the flow of commands
>   if it wants to keep the monitor available for out-of-band commands.
> 
> We decided that clients having to manage the flow of commands is no
> worse than clients having to cope with dropped commands.  There's still
> time to challenge this decision.
> 
> This series acts upon the decision: it switches from dropping commands
> to suspending reading.  Makes the input direction safe.  The output
> direction remains as unsafe as it's always been.  Fixing that is left
> for later.

Yes.  Options (1) and (2) don't seem really acceptable to me, but my
conclusion rests on the assumption that QEMU should still protect
itself from the client.  Take QEMU & Libvirt as an example: I think the
death of, or a bug in, either program should not affect the other.  But
maybe I misunderstood somewhere, since I saw that you emphasized it at
[1] below...  As for (3), I really think a proper client should never
trigger the queue-full state.  With that, the client would never lose
the out-of-band feature due to a stuck input channel.
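For illustration, a client could avoid the queue-full state with
bookkeeping roughly like the sketch below.  This is only a sketch in
Python, not libvirt code: the in-flight limit of 7 comes from the
example above (the 8th unanswered in-band command is what suspends the
monitor), the exec-oob request form is the one from the current QMP
spec, and "id"-based reply matching is omitted for brevity.

import json
import socket

MAX_IN_FLIGHT = 7   # per the example above: the 8th unanswered command suspends the monitor

class FlowControlledQMP:
    """Toy QMP client that never lets QEMU's input queue fill up."""

    def __init__(self, path):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(path)
        self.rfile = self.sock.makefile("r")
        json.loads(self.rfile.readline())            # greeting
        self._send({"execute": "qmp_capabilities",
                    "arguments": {"enable": ["oob"]}})
        self._wait_reply()
        self.in_flight = 0                           # in-band commands awaiting replies

    def _send(self, obj):
        self.sock.sendall(json.dumps(obj).encode() + b"\n")

    def _wait_reply(self):
        # Skip asynchronous events; return the next command reply.
        while True:
            msg = json.loads(self.rfile.readline())
            if "return" in msg or "error" in msg:
                return msg

    def execute(self, cmd, args=None):
        # Drain replies until we are safely below the limit, so the
        # monitor is never suspended and out-of-band stays usable.
        while self.in_flight >= MAX_IN_FLIGHT:
            self._wait_reply()
            self.in_flight -= 1
        self._send({"execute": cmd, "arguments": args or {}})
        self.in_flight += 1

    def execute_oob(self, cmd, args=None):
        # Out-of-band commands are not held back by the in-band bookkeeping.
        self._send({"exec-oob": cmd, "arguments": args or {}})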
> 
> >> > >
> >> > > The client can avoid this by managing the flow of in-band commands:
> >> > > have no more than 7 in flight, so the monitor never gets suspended.
> >> > >
> >> > > This is a potentially useful thing to do for clients, isn't it?
> >> > >
> >> > > Eric, Daniel, is it something libvirt would do?
> >> >
> >> > Right now, libvirt serializes commands - it never sends a QMP command
> >> > until the previous command's response has been processed.  But that
> >> > may not help much, since libvirt does not send OOB commands.
> >>
> >> Note that is not merely due to the QMP monitor restriction either.
> >>
> >> Libvirt serializes all its public APIs that can change state of a
> >> running domain.  It usually aims to allow read-only APIs to be run in
> >> parallel with APIs that change state.
> >>
> >> The exception to the rule right now are some of the migration APIs
> >> which we allow to be invoked to manage the migration process.
> >>
> >> > I guess when we are designing what libvirt should do, and deciding
> >> > WHEN it should send OOB commands, we have the luxury of designing
> >> > libvirt to enforce how many in-flight in-band commands it will ever
> >> > have pending at once (whether the current 'at most 1', or even if we
> >> > make it more parallel to 'at most 7'), so that we can still be
> >> > ensured that the OOB command will be processed without being stuck
> >> > in the queue of suspended in-band commands.  If we never send more
> >> > than one in-band at a time, then it's not a concern how deep the
> >> > qemu queue is; but if we do want libvirt to start parallel in-band
> >> > commands, then you are right that having a way to learn the qemu
> >> > queue depth is programmatically more precise than just guessing the
> >> > maximum depth.  But it's also hard to argue we need that complexity
> >> > if we don't have an immediate use envisioned for it.
> >>
> >> In terms of what libvirt would want to parallelize, I think it is
> >> reasonable to consider any of the query-XXXX commands desirable.
> >> Other stuff is likely to remain serialized from libvirt's side.
> >
> > IMHO concurrency won't help much now even for query commands, since
> > our current concurrency is still "partly" - the executions of query
> > commands (which is the most time consuming part) will still be done
> > sequentially, so even if we send multiple query commands in parallel
> > (without waiting for a response of any sent commands), the total time
> > used for the list of commands would be mostly the same.
> 
> Yes.  We execute all in-band commands (regardless of their monitor) in
> the main thread.  Out-of-band commands can execute in @mon_iothread,
> which provides a modest degree of concurrency.
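To make that modest concurrency concrete, here is a hypothetical usage
of the toy client sketched above (query-status merely stands in for
whichever in-band command happens to be slow; migrate-pause is used as
the out-of-band example because it is one of the commands marked
allow-oob).  An out-of-band command pipelined behind a pending in-band
command can get its reply first, because it is handled in the monitor
I/O thread instead of waiting in the in-band queue:

c = FlowControlledQMP("/tmp/qmp.sock")   # assumed socket path

# In-band command; it executes in QEMU's main thread and may be slow
# if the main loop is busy.
c.execute("query-status")

# Out-of-band command, sent before the reply above has arrived.  It is
# handled in the monitor I/O thread, so its reply can come back while
# the in-band command is still pending.
c.execute_oob("migrate-pause")

print(c._wait_reply())   # whichever reply arrives first - may be the OOB one
print(c._wait_reply())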
> > My understanding for why we have such a queue length now is that it
> > came from a security concern: after we have a queue, we need that
> > queue length to limit the memory usages for the QMP server.  Though
> > that might not help much for real users like Libvirt, it's majorly
> > serving as a way to protect QEMU QMP from being attacked or from
> > being turned down by a buggy QMP client.
> 
> Yes.
> 
> QEMU has to trust its QMP clients, so malice is not a concern, but
> accidents are.  Robust software does not buffer without bounds.  [1]
> 
> > But I agree now that the queue length information might still be
> > helpful some day.  Maybe, we can hide that until we support executing
> > commands in parallel for some of them.
> 
> Queue length can become interesting long before we get general
> concurrency.
> 
> If you use QMP only synchronously (send command #1; receive reply #1;
> send command #2; ...), then out-of-band does exactly nothing for you.
> To make use of it, you have to send an out-of-band command *before* you
> receive the previous command's reply.  That's a form of pipelining.

Yes, out-of-band is special here, but as Dave has already mentioned
(possibly someone else too), we may just need a length=1 queue for
in-band commands and a length=1 queue for out-of-band commands, and
that should be enough at least for now (say, an oob command will never
block, and oob commands are executed one at a time).  That extra
length=1 out-of-band queue gives us the ability to talk to QMP whenever
necessary (though with a limited list of commands).

> Note there's still no general concurrency.  There's a bit of
> pipelining, and there's a bit of concurrency between one in-band
> command (executing in main thread) and out-of-band command (executing
> in @mon_iothread).
> 
> Since we need to support a bit of pipelining anyway, why not support
> it more generally?  All it takes is raising the queue length limit
> above the minimum required for the use of OOB I just sketched.
> 
> Note that "since we need to support a bit of concurrency anyway, why
> not support it more generally?" would be ludicrously naive :)

Regards,

--
Peter Xu