On 10/2/20 11:27 AM, Daniel P. Berrangé wrote:
> savevm, loadvm and delvm are some of the few HMP commands that have never
> been converted to use QMP. The reasons for the lack of conversion are
> that they blocked execution of the event thread, and the semantics
> around choice of disks were ill-defined.
>
> Despite this downside, however, libvirt and applications using libvirt
> have used these commands for as long as QMP has existed, via the
> "human-monitor-command" passthrough command. IOW, while it is clearly
> desirable to be able to fix the problems, they are not a blocker to
> all real world usage.
>
> Meanwhile there is a need for other features which involve adding new
> parameters to the commands. This is possible with HMP passthrough, but
> it provides no reliable way for apps to introspect features, so using
> QAPI modelling is highly desirable.
>
> This patch thus introduces new snapshot-{load,save,delete} commands to
> QMP that are intended to replace the old HMP counterparts. The new
> commands are given different names, because they will be using the new
> QEMU job framework and thus will have diverging behaviour from the HMP
> originals. It would thus be misleading to keep the same name.
>
> While this design uses the generic job framework, the current impl is
> still blocking. The intention is that the blocking problem is fixed later.
> None the less, applications using these new commands should assume that
> they are asynchronous and thus wait for the job status change event to
> indicate completion.
>
> In addition to using the job framework, the new commands require the
> caller to be explicit about all the block device nodes used in the
> snapshot operations, with no built-in default heuristics in use.
>
> Signed-off-by: Daniel P. Berrangé <berra...@redhat.com>
> ---
> +++ b/qapi/job.json
> @@ -22,10 +22,17 @@
>  #
>  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
>  #
> +# @snapshot-load: snapshot load job type, see "snapshot-load" (since 5.2)
> +#
> +# @snapshot-save: snapshot save job type, see "snapshot-save" (since 5.2)
> +#
> +# @snapshot-delete: snapshot delete job type, see "snapshot-delete" (since 5.2)
> +#
>  # Since: 1.7
>  ##
>  { 'enum': 'JobType',
> -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
> +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
> +           'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
>
>  ##
>  # @JobStatus:
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 7f5e6fd681..d2bd551ad9 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1787,3 +1787,123 @@
>  # Since: 5.2
>  ##
>  { 'command': 'query-dirty-rate', 'returns': 'DirtyRateInfo' }
> +
> +##
> +# @snapshot-save:
> +#
> +# Save a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to create
> +# @devices: list of block device node names to save a snapshot to
> +# @vmstate: block device node name to save vmstate to

Here, you document vmstate last,...

> +#
> +# Applications should not assume that the snapshot save is complete
> +# when this command returns. The job commands / events must be used
> +# to determine completion and to fetch details of any errors that arise.
> +#
> +# Note that the VM CPUs will be paused during the time it takes to
> +# save the snapshot

"will be", or "may be"? As you stated above, we may be able to lift the
synchronous limitations down the road, while still maintaining the
present interface of using this command to start the job and waiting on
the job id until it is finished, at which point the CPUs might not need
to be paused as much.

> +#
> +# It is strongly recommended that @devices contain all writable
> +# block device nodes if a consistent snapshot is required.
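To make the "job commands / events" contract above concrete for client
authors, here is a rough Python sketch of the intended usage pattern
(the helper names are mine, not part of the patch): build the
snapshot-save request, then poll query-jobs until the job reaches a
terminal state, rather than trusting the command's immediate return.
One nit while I'm here: on the wire, QMP commands take "arguments",
while the examples in this patch show "data".

```python
def make_snapshot_save(job_id, tag, vmstate, devices):
    """Build the QMP request for the proposed snapshot-save command.

    Uses "arguments" as the QMP wire protocol expects (the patch's
    doc examples show "data" instead).
    """
    return {
        "execute": "snapshot-save",
        "arguments": {
            "job-id": job_id,
            "tag": tag,
            "vmstate": vmstate,
            "devices": devices,
        },
    }


def wait_for_job(query_jobs, job_id):
    """Poll until the named job concludes; hypothetical helper.

    query_jobs is any callable returning the 'return' list of a
    query-jobs command. A real client would instead (or also) block
    on JOB_STATUS_CHANGE events rather than spinning.
    """
    while True:
        for job in query_jobs():
            if job["id"] != job_id:
                continue
            if job["status"] == "concluded":
                if job.get("error"):
                    raise RuntimeError(job["error"])
                return job
        # real code would sleep or wait on the event stream here
```

The point being: even though today's impl blocks, a client written this
way keeps working unmodified once the blocking limitation is lifted.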
> +#
> +# If @tag already exists, an error will be reported
> +#
> +# Returns: nothing
> +#
> +# Example:
> +#
> +# -> { "execute": "snapshot-save",
> +#      "data": {
> +#         "job-id": "snapsave0",
> +#         "tag": "my-snap",
> +#         "vmstate": "disk0",
> +#         "devices": ["disk0", "disk1"]

...here vmstate occurs before devices. I don't know if our doc generator
cares about inconsistent ordering.

> +#      }
> +#    }
> +# <- { "return": { } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'snapshot-save',
> +  'data': { 'job-id': 'str',
> +            'tag': 'str',
> +            'vmstate': 'str',
> +            'devices': ['str'] } }
> +
> +##
> +# @snapshot-load:
> +#
> +# Load a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to load.
> +# @devices: list of block device node names to load a snapshot from
> +# @vmstate: block device node name to load vmstate from
> +#
> +# Applications should not assume that the snapshot save is complete
> +# when this command returns. The job commands / events must be used
> +# to determine completion and to fetch details of any errors that arise.

s/save/load/

> +#
> +# Note that the VM CPUs will be paused during the time it takes to
> +# save the snapshot

s/save/load/

But while pausing CPUs during save is annoying, pausing CPUs during
restore makes sense (after all, executing on stale data that will still
be updated during the restore is just wasted execution).

> +#
> +# It is strongly recommended that @devices contain all writable
> +# block device nodes that can have changed since the original
> +# @snapshot-save command execution.
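On that "all writable block device nodes" recommendation: a client can
derive the @devices list mechanically from query-named-block-nodes,
which reports a "ro" flag per node. A minimal sketch (the helper name
is mine, and a real caller may also want to exclude protocol-layer
nodes rather than snapshotting every writable node):

```python
def writable_node_names(named_block_nodes):
    """Pick candidate @devices entries from a query-named-block-nodes
    reply ('return' list): keep node names that are not read-only.

    Hypothetical helper, not part of the patch under review.
    """
    return [n["node-name"]
            for n in named_block_nodes
            if not n.get("ro", False)]
```

That keeps the "caller is explicit, no heuristics" design while still
letting management apps automate the common case.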
> +#
> +# Returns: nothing
> +#
> +# Example:
> +#
> +# -> { "execute": "snapshot-load",
> +#      "data": {
> +#         "job-id": "snapload0",
> +#         "tag": "my-snap",
> +#         "vmstate": "disk0",
> +#         "devices": ["disk0", "disk1"]
> +#      }
> +#    }
> +# <- { "return": { } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'snapshot-load',
> +  'data': { 'job-id': 'str',
> +            'tag': 'str',
> +            'vmstate': 'str',
> +            'devices': ['str'] } }
> +
> +##
> +# @snapshot-delete:
> +#
> +# Delete a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to delete.
> +# @devices: list of block device node names to delete a snapshot from
> +#
> +# Applications should not assume that the snapshot save is complete
> +# when this command returns. The job commands / events must be used
> +# to determine completion and to fetch details of any errors that arise.

Do we have a query- command handy to easily learn which snapshot names
are even available to attempt deletion on? If not, that's worth a
separate patch.

> +#
> +# Returns: nothing
> +#
> +# Example:
> +#
> +# -> { "execute": "snapshot-delete",
> +#      "data": {
> +#         "job-id": "snapdelete0",
> +#         "tag": "my-snap",
> +#         "devices": ["disk0", "disk1"]
> +#      }
> +#    }
> +# <- { "return": { } }
> +#
> +# Since: 5.2
> +##

> +++ b/tests/qemu-iotests/group
> @@ -291,6 +291,7 @@
>  277 rw quick
>  279 rw backing quick
>  280 rw migration quick
> +310 rw quick
>  281 rw quick
>  282 rw img quick
>  283 auto quick

What's wrong with sorted order? I get the renumbering to appease a merge
conflict, but it also requires rearrangement ;)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization: qemu.org | libvirt.org