On 09.12.25 19:51, Peter Xu wrote:
On Thu, Dec 04, 2025 at 10:55:33PM +0300, Vladimir Sementsov-Ogievskiy wrote:
On 19.11.25 01:05, Peter Xu wrote:
On Tue, Nov 18, 2025 at 11:24:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
Add Daniel

On 10.11.25 13:39, Alexandr Moshkov wrote:
v3:
- use pre_load_errp instead of pre_load in vhost.c
- change vhost-user-blk property to
     "skip-get-vring-base-inflight-migration"
- refactor vhost-user-blk.c, by moving vhost_user_blk_inflight_needed() higher

v2:
- rewrite migration using VMSD instead of qemufile API
- add vhost-user-blk parameter instead of migration capability

I'm not sure VMSD is used cleanly in the migration implementation, so
feel free to comment.

Based on Vladimir's work:
[PATCH v2 00/25] vhost-user-blk: live-backend local migration
     which was based on:
       - [PATCH v4 0/7] chardev: postpone connect
         (which in turn is based on [PATCH 0/2] remove deprecated 'reconnect' options)
       - [PATCH v3 00/23] vhost refactoring and fixes
       - [PATCH v8 14/19] migration: introduce .pre_incoming() vmsd handler


Hi!

On my series about backend-transfer migration, the final consensus (or at least,
I hope that it's a consensus :) is that using device properties to control the
migration channel content is wrong, and that we should use migration
parameters instead.

(discussion here: https://lore.kernel.org/qemu-devel/[email protected]/)

So the API for backend-transfer features is a migration parameter

      backend-transfer = [ list of QOM paths of devices for which we want to enable backend-transfer ]

and users don't have to change device properties at runtime to set up the
following migration.

So I assume, similar practice should be applied here: don't use device
properties to control migration.

So, should it be a parameter like

      migrate-inflight-region = [ list of QOM paths of vhost-user devices ]

?

I'm concerned that if we start doing this more, migration qapi/ will get
completely messed up.

Imagine a world where there'll be tons of lists like:

    migrate-dev1-some-feature-1 = [list of devices (almost only dev1 typed)]
    migrate-dev2-some-feature-2 = [list of devices (almost only dev2 typed)]
    migrate-dev3-some-feature-3 = [list of devices (almost only dev3 typed)]
    ...


Yes, hard to argue against it.

I still hope Daniel will share his opinion.

From our side, we are OK with any interface that gets accepted :)


Let me briefly summarize the variants I see:

===

1. lists

Add migration parameters for such features:

migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
backend-transfer = [ list of QOM paths of devices whose backend should be migrated ]

This way, we just need to set the same lists on the source and target QEMU
before migration, and it has no relation to machine types.

PROS: Like any other migration capability, it sets up what (and how) should
migrate, with no relation to device properties and MT.

CONS: Logically, that's the same as adding a device property, but instead we
implement lists of devices and create extra QOM_PATH links.

===

2. device parameters

Before migration we should loop through devices and call corresponding
qom-set commands, like

qom-set {path: QOM_PATH, property: "backend-transfer", "value": true}
qom-set {path: QOM_PATH, property: "migrate-inflight-region", "value": true}

And of course, we should take care to set the same values for the
corresponding devices on the source and target.

In this case, we also _can_ rely on machine types for defaults.

Note that "migrate-inflight-region" may become the default in the 11.0 MT.
But backend-transfer can't be a default, as that would break remote
migration.

PROS: No lists, native properties

CONS: These properties do not define any portion of device state, internal or
visible to the guest. It's not a property of the device, but an option for the
migration of that device.
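
For illustration, the per-device loop in [2] could look roughly like the
sketch below. It only builds the QMP qom-set commands (arguments path,
property, value) and prints them as JSON; sending them over a QMP socket is
left out. The device paths are hypothetical, and "migrate-inflight-region" is
the property name proposed in this thread, not an existing one.

```python
import json


def build_qom_set_commands(qom_paths, properties):
    """Return one QMP qom-set command per (device, property) pair."""
    commands = []
    for path in qom_paths:
        for prop, value in properties.items():
            commands.append({
                "execute": "qom-set",
                "arguments": {"path": path, "property": prop, "value": value},
            })
    return commands


# Hypothetical QOM paths of two vhost-user-blk devices.
paths = ["/machine/peripheral/vublk0", "/machine/peripheral/vublk1"]
# Property name as proposed in this thread (an assumption, not an existing API).
props = {"migrate-inflight-region": True}

# The same commands would have to be sent to both source and target QEMU.
for cmd in build_qom_set_commands(paths, props):
    print(json.dumps(cmd))
```

The point of the sketch is that mgmt has to repeat this for every device and
for both sides before each migration.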

===

2.1 = [2] assisted by one boolean migration-parameter

Still, if we want to make backend-transfer "a kind of" default, we'll need one
boolean migration parameter "it-is-local-migration", and modify the logic to

really_do_backend_transfer = it-is-local-migration and device.backend-transfer
really_do_migrate_inflight_region = not it-is-local-migration and device.migrate-inflight-region

PROS: starting from some MT, we'll have good defaults, so that the user doesn't
have to enable/disable the option per device for every migration.

CONS: a precedent of behavior driven by a combination of a device property and
a corresponding migration parameter (or do we already have something similar?)
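
A minimal sketch of the combination rule in [2.1], just to make the resulting
behavior explicit. The function and argument names are made up for this
example; only the boolean logic comes from the two lines above.

```python
def effective_options(is_local_migration, backend_transfer, migrate_inflight_region):
    """Combine the global migration parameter with the per-device properties.

    Returns (do_backend_transfer, do_migrate_inflight_region).
    """
    do_backend_transfer = is_local_migration and backend_transfer
    do_migrate_inflight_region = (not is_local_migration) and migrate_inflight_region
    return do_backend_transfer, do_migrate_inflight_region


# A device with both properties enabled (e.g. by machine type defaults):
for is_local in (True, False):
    print(is_local, effective_options(is_local, True, True))
```

With both device properties enabled, a local migration uses backend-transfer
only, and a remote migration uses the inflight region only, so the two options
are never active together.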

===

4. mixed

Keep [2] for this series, and [1] for backend-transfer.

PROS: the list for backend-transfer remains "the only exception" instead of
"the practice", so we will not have tons of such lists :)

CONS: inconsistent solutions for similar things

===

5. implement "per device" migration parameters

They may be set by an additional QMP command qmp-migrate-set-device-parameters,
which will take an additional qom-path parameter.

Or, we may add one list of structures like

[{
    qom_path: ...
    parameters: ..
}, ...]

into common migration parameters.

PROS: keeps new features as a property of migration, but avoids several lists
of QOM paths
CONS: ?

Hmm, we could also select devices not only by qom_path but also by type, for
example to enable a feature for all virtio-net devices. Hmm, and this type
could also be used as a discriminator for the parameters, which may be a QAPI
union type..
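
To make [5] a bit more concrete, here is a rough sketch of how such a list of
structures could be resolved for one device. Everything in it is an
assumption for illustration: the class and field names, the example paths and
type names, and especially the policy that a per-path entry overrides a
per-type entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceMigrationParameters:
    """One entry of the hypothetical per-device parameter list."""
    parameters: dict
    qom_path: Optional[str] = None   # select one device by QOM path
    qom_type: Optional[str] = None   # or select all devices of a type

    def matches(self, path, type_name):
        if self.qom_path is not None:
            return self.qom_path == path
        return self.qom_type == type_name


def resolve(entries, path, type_name):
    """Merge the parameters of all entries matching the given device.

    Type-wide entries are applied first, so per-path entries override them
    (an assumed policy, not something decided in this thread).
    """
    merged = {}
    for entry in sorted(entries, key=lambda e: e.qom_path is not None):
        if entry.matches(path, type_name):
            merged.update(entry.parameters)
    return merged


entries = [
    DeviceMigrationParameters({"backend-transfer": False},
                              qom_path="/machine/peripheral/net0"),
    DeviceMigrationParameters({"backend-transfer": True},
                              qom_type="virtio-net-pci"),
]

print(resolve(entries, "/machine/peripheral/net0", "virtio-net-pci"))
print(resolve(entries, "/machine/peripheral/net1", "virtio-net-pci"))
```

Here net1 picks up the type-wide setting, while net0 is excluded by its own
per-path entry.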

===


Thoughts?

Sorry to respond late, I kept getting other things interrupting me when I
wanted to look at this..

I just sent a series here, allowing TYPE_OBJECT of any kind to be able to
work with machine compat properties:

https://lore.kernel.org/r/[email protected]

I still want to see if we can stick with compat properties in general
whenever it's about defining guest ABI.

What you proposed should work, but it'll involve a 2nd way of probing
"what is the guest ABI": providing a new QMP query command, having mgmt
query both QEMUs, and then setting the common subset on both.  It would be
finer-grained, but as I discussed previously, I think it's reinventing the
wheel, and it may leave mgmt bloated with too many trivial per-device
details.
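
Roughly, the mgmt side of that workflow would be something like the sketch
below.  No such query command exists today, so the input here is simply
assumed to be a dict from QOM path to the set of feature names each QEMU
reports; the paths and feature names are made up.

```python
def common_features(source_features, target_features):
    """Per device, keep only the features reported by both QEMUs."""
    common = {}
    for path, features in source_features.items():
        if path in target_features:
            common[path] = set(features) & set(target_features[path])
    return common


# Hypothetical results of querying the source and the target QEMU.
source = {"/machine/peripheral/vublk0": {"migrate-inflight-region", "feature-x"}}
target = {"/machine/peripheral/vublk0": {"migrate-inflight-region"}}

# mgmt would then have to set exactly this subset on both sides.
print(common_features(source, target))
```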

Please have a look to see the feasibility.  As mentioned in the cover
letter, that will need further work to e.g. QOMify TAP first at least for
your series.  But I don't yet see it as a blocker?  Once QOMified, it can
directly inherit OBJECT_COMPAT, and then TAP can add compat properties.

I wonder if vhost-user-blk can already use compat properties.


Yes, it can. And regardless of the way we choose, qdev properties or qapi,
I don't think we need a property for the backend itself. We need a property
(or migration capability) for vhost-user-blk itself, saying that its
backend should be migrated.

It's a lot simpler to migrate the backend inside the frontend state. If we
migrate the backend separately, we can't control the order of the
backend/frontend states, and will have to implement some late point in the
state load process where both are already loaded and we can do our
post-load logic.




That doesn't look reasonable at all.  If some feature is likely only
supported in one device, that should not appear in migration.json but only
in the specific device.

I'm not fully convinced that we can't enable some form of machine type
properties (with QDEV or not) on backends, and I think we should stick with
something like that.  I can have a closer look this week, but.. even if not, I
still think migration shouldn't care about some specific behavior of a
specific device.

If we really want to have some way to probe device features, maybe we
should also think about a generic interface (rather than "one new list
every time").  We also have some recent discussions on a proper interface
to query TAP backend features like USO*.  Maybe they share some of the
goals here.

What do you mean by probing device features? Isn't that the qom-get command?

--
Best regards,
Vladimir



