Migration of a VFIO passthrough device can be supported by using a
device-specific kernel driver to save/restore the device state through
device-specific interfaces. But this approach doesn't work for devices
that lack a state migration interface, e.g. NVMe.

On the other hand, Infrastructure Processing Unit (IPU) or Data
Processing Unit (DPU) vendors may choose to implement an out-of-band
interface from the SoC to help manage the state of such non-migratable
devices, e.g. via gRPC or JSON-RPC protocols.

This RFC attempts to support such an out-of-band migration interface by
introducing the concept of migration backends in vfio. The existing
logic around the vfio migration uAPI is now called the 'local' backend,
while a new 'out-of-band' backend is introduced that allows vfio to
redirect VMState ops to an external plugin.

Currently, the backend migration ops are defined close to
SaveVMHandlers. We also considered whether there is value in abstracting
them at a lower level, e.g. close to the vfio migration uAPI, but
reached no clear conclusion. This is one part on which we'd like to
hear suggestions.

This proposal adopts a plugin mechanism (an example can be found in [1])
given that IPU/DPU vendors usually implement proprietary migration
interfaces without a standard. But we are also open to an alternative if
it makes better sense, e.g. loadable modules (with QEMU supporting gRPC
or JSON-RPC) or an IPC mechanism similar to vhost-user.

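To make the backend idea a bit more concrete, here is a minimal,
self-contained sketch of a generic layer dispatching through a per-device
ops table. It is not the code from this series: DemoDevice,
DemoMigrationOps, demo_record() and demo_save() are hypothetical names
invented for illustration, and the two callbacks are only loosely modeled
on SaveVMHandlers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoDevice DemoDevice;

/* Hypothetical backend ops, loosely modeled on SaveVMHandlers. */
typedef struct DemoMigrationOps {
    int (*save_setup)(DemoDevice *dev);
    int (*save_complete)(DemoDevice *dev);
} DemoMigrationOps;

struct DemoDevice {
    const DemoMigrationOps *ops; /* backend chosen at init time */
    char log[64];                /* records which callbacks ran */
};

/* Append a tag to the device log; returns -1 if it would overflow. */
static int demo_record(DemoDevice *dev, const char *tag)
{
    size_t used = strlen(dev->log);
    size_t len = strlen(tag);

    if (used + len + 1 > sizeof(dev->log)) {
        return -1;
    }
    memcpy(dev->log + used, tag, len + 1);
    return 0;
}

/* 'local' backend: a real one would drive the vfio migration uAPI. */
static int local_save_setup(DemoDevice *dev)
{
    return demo_record(dev, "local:setup;");
}

static int local_save_complete(DemoDevice *dev)
{
    return demo_record(dev, "local:complete;");
}

/* 'out-of-band' backend: a real one would call into a vendor plugin. */
static int oob_save_setup(DemoDevice *dev)
{
    return demo_record(dev, "oob:setup;");
}

static int oob_save_complete(DemoDevice *dev)
{
    return demo_record(dev, "oob:complete;");
}

static const DemoMigrationOps demo_local_ops = {
    .save_setup = local_save_setup,
    .save_complete = local_save_complete,
};

static const DemoMigrationOps demo_oob_ops = {
    .save_setup = oob_save_setup,
    .save_complete = oob_save_complete,
};

/* Generic layer: never needs to know which backend is behind dev->ops. */
static int demo_save(DemoDevice *dev)
{
    int ret;

    assert(dev != NULL && dev->ops != NULL);
    ret = dev->ops->save_setup(dev);
    if (ret < 0) {
        return ret;
    }
    return dev->ops->save_complete(dev);
}
```

With this shape, the generic code path stays identical for both
backends; only the ops pointer stored in the device differs.
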
The following graph describes the overall component relationship:

 +----------------------------------------------------+
 | QEMU                                               |
 | +------------------------------------------------+ |
 | |        VFIO Live Migration Framework           | |
 | |    +--------------------------------------+    | |
 | |    |         VFIOMigrationOps             |    | |
 | |    +-------^---------------------^--------+    | |
 | |            |                     |             | |
 | |    +-------v-------+     +-------v--------+    | |
 | |    | LM Backend Via|     | LM Backend Via |    | |
 | |    |   Device Fd   |     |    Plugins     |    | |
 | |    +-------^-------+     |     +----------+    | |
 | |            |             |     |Plugin Ops+----+-+------------+
 | |            |             +-----+----------+    | |            |
 | |            |                                   | |  +---------v----------+
 | +------------+-----------------------------------+ |  |  Vendor Specific   |
 |              |                                     |  |    Plugins(.so)    |
 +--------------+-------------------------------------+  +----------+---------+
 UserSpace      |                                                   |
----------------+---------------------------------------------      |
 Kernel         |                                                   |
                |                                                   |
     +----------v----------------------+                            |
     |        Kernel VFIO Driver       |                            |
     |    +-------------------------+  |                            |
     |    |                         |  |                            | Network
     |    | Vendor-Specific Driver  |  |                            |
     |    |                         |  |                            |
     |    +----------^--------------+  |                            |
     |               |                 |                            |
     +---------------+-----------------+                            |
                     |                                              |
---------------------+-----------------------------------------     |
 Hardware            |                                              |
                     |            +-----+-----+-----+----+-----+    |
          +----------v------+     | VF0 | VF1 | VF2 | ...| VFn |    |
          |   Traditional   |     +-----+-----+-----+----+-----+    |
          |  PCIe Devices   |     |                            |    |
          +-----------------+     |   +--------+------------+  |    |
                                  |   |        |   Agent    |<-+----+
                                  |   |        +------------+  |
                                  |   |                     |  |
                                  |   | SOC                 |  |
                                  |   +---------------------+  |
                                  | IPU                        |
                                  +----------------------------+

Two command-line parameters (x-plugin-path and x-plugin-arg) are
introduced to enable the out-of-band backend. If specified, vfio will
attempt to use the out-of-band backend.

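The selection rule described above can be sketched as follows. This is
only an illustration under stated assumptions: DemoPluginParams,
DemoBackend and demo_pick_backend() are hypothetical names, and treating
an empty x-plugin-path the same as an unset one is an assumption of this
sketch, not something the series states.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    DEMO_BACKEND_LOCAL, /* existing vfio migration uAPI path */
    DEMO_BACKEND_OOB,   /* out-of-band path via a vendor plugin */
} DemoBackend;

/* Hypothetical holder for the two new device properties. */
typedef struct {
    const char *plugin_path; /* x-plugin-path, NULL if not given */
    const char *plugin_arg;  /* x-plugin-arg, NULL if not given */
} DemoPluginParams;

/*
 * Pick the out-of-band backend only when a plugin path was supplied.
 * Assumption: an empty string counts as "not specified".
 */
static DemoBackend demo_pick_backend(const DemoPluginParams *p)
{
    assert(p != NULL);
    if (p->plugin_path != NULL && p->plugin_path[0] != '\0') {
        return DEMO_BACKEND_OOB;
    }
    return DEMO_BACKEND_LOCAL;
}
```

Whether a failure to load the plugin should fall back to the local
backend or fail device realization is left open in this sketch.
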
The following is an example of VFIO command-line parameters for
OOB-Approach:

  -device vfio-pci,id=$ID,host=$bdf,x-enable-migration,x-plugin-path=$plugin_path,x-plugin-arg=$plugin_arg

[1] https://github.com/raolei-intel/vfio-lm-plugin-example.git

Lei Rao (13):
  vfio/migration: put together checks of migration initialization
    conditions
  vfio/migration: move migration struct allocation out of
    vfio_migration_init
  vfio/migration: move vfio_get_dev_region_info out of
    vfio_migration_probe
  vfio/migration: Separated functions that relate to the In-Band
    approach
  vfio/migration: rename functions that relate to the In-Band approach
  vfio/migration: introduce VFIOMigrationOps layer in VFIO live
    migration framework
  vfio/migration: move the statistics of bytes_transferred to generic
    VFIO migration layer
  vfio/migration: split migration handler registering from
    vfio_migration_init
  vfio/migration: move the functions of In-Band approach to a new file
  vfio/pci: introduce command-line parameters to specify migration
    method
  vfio/migration: add a plugin layer to support out-of-band live
    migration
  vfio/migration: add some trace-events for vfio migration plugin
  vfio/migration: make the region and plugin member of struct
    VFIOMigration to be a union

 docs/devel/vfio-migration-plugin.rst    | 165 +++++++
 hw/vfio/meson.build                     |   2 +
 hw/vfio/migration-local.c               | 456 +++++++++++++++++++
 hw/vfio/migration-plugin.c              | 266 +++++++++++
 hw/vfio/migration.c                     | 577 ++++++------------------
 hw/vfio/pci.c                           |   2 +
 hw/vfio/trace-events                    |   9 +-
 include/hw/vfio/vfio-common.h           |  37 +-
 include/hw/vfio/vfio-migration-plugin.h |  21 +
 9 files changed, 1096 insertions(+), 439 deletions(-)
 create mode 100644 docs/devel/vfio-migration-plugin.rst
 create mode 100644 hw/vfio/migration-local.c
 create mode 100644 hw/vfio/migration-plugin.c
 create mode 100644 include/hw/vfio/vfio-migration-plugin.h

-- 
2.32.0