On 11/23/20 1:57 AM, Markus Armbruster wrote:
>> (Was it not possible to have the client send an ACK response that
>> doesn't indicate success, but just signals receipt? Semantically, it
>> would be the difference between do-x and start-x as a command.)
> Feels quite possible to me.
> If I read git-log correctly, the commands' design ignored the race with
> shutdown / suspend, and 'success-response': false was a quick fix.
> I think changing the commands from 'do-x' to 'start-x' would have been
> better. A bit more work, though, and back then there weren't any
> 'start-x' examples to follow, I guess.
> I wish success-response didn't exist, but is getting rid of it worth
> changing the interface? I honestly don't know.
OK, I'll jot it down in the ideas book and we can come back to it later,
after we've fried the other 2,385 fish in the queue. I think simplifying the
interface is going to be helpful long-term, but I don't have a good grip
on the actual work estimate.
In terms of a simple library design, this edge case makes an async
library quite a bit worse to support, because now it needs to understand
that sometimes the target will never reply, and that this is command-dependent.
Now the core "QMP" library, ostensibly the command-agnostic layer, needs
to query to discover the semantics of each command it sends.
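To make the problem concrete, here is a minimal sketch of the per-command
knowledge the supposedly command-agnostic layer ends up carrying. The table
and the expects_reply() helper are hypothetical illustrations, not part of
any existing library; the command names are the guest agent commands that
declare 'success-response': false.

```python
# Hypothetical lookup table: commands that never send a success reply.
NO_SUCCESS_RESPONSE = frozenset({
    "guest-shutdown",
    "guest-suspend-disk",
    "guest-suspend-ram",
    "guest-suspend-hybrid",
})


def expects_reply(command: str) -> bool:
    """Return True if a success reply should be awaited for `command`."""
    return command not in NO_SUCCESS_RESPONSE
```

Every send path in the core library would have to consult something like
this before deciding whether to wait, which is exactly the coupling a
protocol layer should not need.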
(1) Identify 'success-response: false' commands:
- guest-shutdown
- guest-suspend-disk
- guest-suspend-ram
- guest-suspend-hybrid
(2) Replace them with jobs, where a success response acknowledges
receipt and start of the job. Errors encountered setting up the job can
be returned synchronously; errors encountered past the point of no
return can be queried via the job system.
(I am assuming that it is untenable to have a system where we
acknowledge receipt, go past the point of no return and then encounter
an error and have no way to report it, since we've already responded to
the shutdown/suspend request. Therefore, the job system seems appropriate.)
((The onus is now on the job layer to understand that on job success,
the server goes away and cannot report success, but I think this is
easier to implement in a client at the logical layer than at the
protocol layer.))
(3) Deprecate the old interfaces.
(4) Delete the old interfaces. Remove 'success-response: false' from the
meta-schema. Remove 'success-response' from GuestAgentCommandInfo.
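As a rough illustration of step (2), here is a toy in-memory model of the
proposed semantics. Everything in it is hypothetical (ToyAgent,
start_shutdown, finish, query_job, the job states); it only shows where
each kind of error would surface, not how the agent or the job system is
actually implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class SetupError(Exception):
    """Failure before the point of no return; reported synchronously."""


@dataclass
class Job:
    status: str = "running"          # toy states: "running" or "aborted"
    error: Optional[str] = None


@dataclass
class ToyAgent:
    """In-memory stand-in for the server side; all names are hypothetical."""
    jobs: Dict[str, Job] = field(default_factory=dict)
    alive: bool = True

    def start_shutdown(self, job_id: str, mode: str = "powerdown") -> dict:
        # Setup errors are returned synchronously, before any job exists.
        if mode not in ("halt", "powerdown", "reboot"):
            raise SetupError(f"invalid mode: {mode!r}")
        self.jobs[job_id] = Job()
        # Success only acknowledges receipt and start of the job.
        return {"return": {}}

    def finish(self, job_id: str, error: Optional[str] = None) -> None:
        # Simulates the outcome past the point of no return.
        if error is None:
            self.alive = False       # shutdown worked: the server goes away
        else:
            self.jobs[job_id] = Job(status="aborted", error=error)

    def query_job(self, job_id: str) -> Job:
        # On job success nobody is left to answer the query.
        if not self.alive:
            raise ConnectionError("server is gone")
        return self.jobs[job_id]
```

In this model a client treats a dropped connection while polling the job as
success, and a job that reports an error as failure, so the protocol layer
itself never has to special-case a command.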
...but, not terribly important. Might be nice to help plumb some
non-block jobs for the sake of example for other areas where they might be
useful. Other stuff to think about in the meantime, but thank you for
the heads up. Maybe a good GSoC project, actually? It seems
straightforward, just with a lot of plumbing.
I suppose for now, what can happen is the client (using the AQMP
library) can simply await the results of execute() with a timeout. If
the server closes the connection, we know it worked -- or, if there's a
timeout and we used --no-shutdown, we can interrogate the VM status.
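A minimal sketch of that stopgap, using plain asyncio. FakeClient is a
hypothetical stand-in so the example runs on its own; it is not the real
library's API, and both the exception raised on disconnect and the
query_status() helper are assumptions made for illustration.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for a client; not the real library API."""

    def __init__(self, behaviour: str) -> None:
        self.behaviour = behaviour   # "close" or "hang"

    async def execute(self, command: str) -> dict:
        if self.behaviour == "close":
            # Assumption: a dropped connection surfaces as ConnectionError.
            raise ConnectionResetError("server closed the connection")
        await asyncio.sleep(3600)    # simulate a server that never replies
        return {}

    async def query_status(self) -> str:
        # Assumption: some out-of-band way to ask for the VM run state.
        return "shutdown"


async def shutdown_and_confirm(client: FakeClient, timeout: float) -> str:
    """Send the command, then infer the outcome from what happens next."""
    try:
        await asyncio.wait_for(client.execute("guest-shutdown"), timeout)
    except ConnectionError:
        # The server went away, so the shutdown took effect.
        return "connection-closed"
    except asyncio.TimeoutError:
        # No reply and no disconnect: fall back to asking for the VM status.
        return "vm-status:" + await client.query_status()
    return "replied"
```

The caller, not the library, decides what a timeout means here, which keeps
the workaround out of the protocol layer for the time being.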