On 5/21/19 10:17 AM, Markus Armbruster wrote:
> Marc-André, before you invest your time to answer my questions below: my
> bandwidth for non-trivial QAPI features like this one is painfully
> limited. To get your QAPI conditionals in, I had to postpone other QAPI
> projects. I don't regret doing that, I'm rather pleased with how QAPI
> conditionals turned out. But I can't keep postponing all other QAPI
> projects. Because of that, this one will be slow going, at best. Sorry
> about that.
>
> Marc-André Lureau <marcandre.lur...@redhat.com> writes:
>
>> Hi,
>>
>> HMP and QMP commands are handled synchronously in qemu today. But
>> there are benefits allowing the command handler to re-enter the main
>> loop if the command cannot be handled synchronously, or if it is
>> long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
>> without it.
>>
>> The common solution is to use a pair of command+event in this case.
>
> In particular, background jobs (qapi/jobs.json). They grew out of block
> jobs, and are still used only for "blocky" things. Using them more
> widely would probably make sense.
> >> But this approach has a number of issues:
>> - you can't "fix" an existing command: you need a new API, and ad-hoc
>> documentation for that command+signal association, and old/broken
>> command deprecation
>
> Making a synchronous command asynchronous is an incompatible change. We
> need to let the client needs opt in. How is that done in this series?
>
>> - since the reply event is broadcasted and 'id' is used for matching the
>> request, it may conflict with other clients request 'id' space
>
> Any event that does that now is broken and needs to be fixed. The
> obvious fix is to include a monitor ID with the command ID. For events
> that can only ever be useful in the context of one particular monitor,
> we could unicast to that monitor instead; see below.
>
> Corollary: this is just a fixable bug, not a fundamental advantage of
> the async feature.
>
>> - it is arguably less efficient and elegant (weird API, useless return
>> in most cases, broadcast reply, no cancelling on disconnect etc)
>
> The return value is useful for synchronously reporting failure to start
> the background task. I grant you that background tasks may exist that
> won't ever fail to start. I challenge the idea that it's most of them.
>
> Broadcast reply could be avoided by unicasting events. If I remember
> correctly, Peter Xu even posted patches some time ago. We ended up not
> using them, because we found a better solution for the problem at hand.
> My point is: this isn't a fundamental problem, it's just the way we
> coded things up.
>
> What do you mean by "no cancelling on disconnect"?
>
> I'm ignoring "etc" unless you expand it into something specific.
>
> I'm also not taking the "weird" bait :)
>
>> The following series implements an async command solution instead. By
>> introducing a session context and a command return handler, it can:
>> - defer the return, allowing the mainloop to reenter
>> - return only to the caller (instead of broadcast events for reply)
>> - optionnally allow cancellation when the client is gone
>> - track on-going qapi command(s) per client/session
>>
>> and without introduction of new QMP APIs or client visible change.
>
> What do async commands provide that jobs lack?
>
> Why do we want both?
>
> I started to write a feature-by-feature comparison, but realized I don't
> have the time to figure out either jobs or async from their (rather
> sparse) documentation, let alone from code.
>
Sorry about that. I still have a todo item from you in my inbox to
improve the documentation there, but I've been focusing on bitmaps
documentation lately instead, but I hope to branch it out as part of my
caring a bit more about docs lately.
I'll keep an eye out here. I don't really want to prohibit things on the
sole basis that they aren't jobs, but I do want to make sure that adding
a new lifecycle paradigm for commands doesn't complicate the jobs code
in accidental ways.
I'll try to look this over too, but I have a bit of a backlog right now.
>> Existing qemu commands can be gradually replaced by async:true
>> variants when needed, while carefully reviewing the concurrency
>> aspects. The async:true commands marshaller helpers are splitted in
>> half, the calling and return functions. The command is called with a
>> QmpReturn context, that can return immediately or later, using the
>> generated return helper, which allows for a step-by-step conversion.
>>
>> The screendump command is converted to an async:true version to solve
>> rhbz#1230527. The command shows basic cancellation (this could be
>> extended if needed). It could be further improved to do asynchronous
>> IO writes as well.
>
> What is "basic cancellation"?
>
> What extension(s) do you have in mind?
>
> What's the impact of screendump writing synchronously?
>