On 20/11/2018 21:39, Kirti Wankhede wrote:
- Defined MIGRATION region type and sub-type.
- Defined VFIO device states during migration process.
- Defined vfio_device_migration_info structure which will be placed at 0th
offset of migration region to get/set VFIO device related information.
Defined actions and members of structure usage for each action:
* To convey VFIO device state to be transitioned to.
* To get pending bytes yet to be migrated for VFIO device
* To ask driver to write data to migration region and return number of bytes
written in the region
* In migration resume path, user space app writes to migration region and
communicates it to vendor driver.
* Get bitmap of dirty pages from vendor driver from given start address
Signed-off-by: Kirti Wankhede <kwankh...@nvidia.com>
Reviewed-by: Neo Jia <c...@nvidia.com>
---
linux-headers/linux/vfio.h | 130 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 130 insertions(+)
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 3615a269d378..a6e45cb2cae2 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -301,6 +301,10 @@ struct vfio_region_info_cap_type {
#define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG (2)
#define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG (3)
+/* Migration region type and sub-type */
+#define VFIO_REGION_TYPE_MIGRATION (1 << 30)
+#define VFIO_REGION_SUBTYPE_MIGRATION (1)
+
/*
* The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
* which allows direct access to non-MSIX registers which happened to be within
@@ -602,6 +606,132 @@ struct vfio_device_ioeventfd {
#define VFIO_DEVICE_IOEVENTFD _IO(VFIO_TYPE, VFIO_BASE + 16)
+/**
+ * VFIO device states :
+ * VFIO User space application should set the device state to indicate vendor
+ * driver in which state the VFIO device should transitioned.
+ * - VFIO_DEVICE_STATE_NONE:
+ * State when VFIO device is initialized but not yet running.
+ * - VFIO_DEVICE_STATE_RUNNING:
+ * Transition VFIO device in running state, that is, user space application or
+ * VM is active.
+ * - VFIO_DEVICE_STATE_MIGRATION_SETUP:
+ * Transition VFIO device in migration setup state. This is used to prepare
+ * VFIO device for migration while application or VM and vCPUs are still in
+ * running state.
+ * - VFIO_DEVICE_STATE_MIGRATION_PRECOPY:
+ * When VFIO user space application or VM is active and vCPUs are running,
+ * transition VFIO device in pre-copy state.
+ * - VFIO_DEVICE_STATE_MIGRATION_STOPNCOPY:
+ * When VFIO user space application or VM is stopped and vCPUs are halted,
+ * transition VFIO device in stop-and-copy state.
+ * - VFIO_DEVICE_STATE_MIGRATION_SAVE_COMPLETED:
+ * When VFIO user space application has copied data provided by vendor driver.
+ * This state is used by vendor driver to clean up all software state that was
+ * setup during MIGRATION_SETUP state.
+ * - VFIO_DEVICE_STATE_MIGRATION_RESUME:
+ * Transition VFIO device to resume state, that is, start resuming VFIO device
+ * when user space application or VM is not running and vCPUs are halted.
+ * - VFIO_DEVICE_STATE_MIGRATION_RESUME_COMPLETED:
+ * When user space application completes iterations of providing device state
+ * data, transition device in resume completed state.
+ * - VFIO_DEVICE_STATE_MIGRATION_FAILED:
+ * Migration process failed due to some reason, transition device to failed
+ * state. If migration process fails while saving at source, resume device at
+ * source. If migration process fails while resuming application or VM at
+ * destination, stop restoration at destination and resume at source.
+ * - VFIO_DEVICE_STATE_MIGRATION_CANCELLED:
+ * User space application has cancelled migration process either for some
+ * known reason or due to user's intervention. Transition device to Cancelled
+ * state, that is, resume device state as it was during running state at
+ * source.
+ */
+
+enum {
+ VFIO_DEVICE_STATE_NONE,
+ VFIO_DEVICE_STATE_RUNNING,
+ VFIO_DEVICE_STATE_MIGRATION_SETUP,
+ VFIO_DEVICE_STATE_MIGRATION_PRECOPY,
+ VFIO_DEVICE_STATE_MIGRATION_STOPNCOPY,
+ VFIO_DEVICE_STATE_MIGRATION_SAVE_COMPLETED,
+ VFIO_DEVICE_STATE_MIGRATION_RESUME,
+ VFIO_DEVICE_STATE_MIGRATION_RESUME_COMPLETED,
+ VFIO_DEVICE_STATE_MIGRATION_FAILED,
+ VFIO_DEVICE_STATE_MIGRATION_CANCELLED,
+};
+
+/**
+ * Structure vfio_device_migration_info is placed at 0th offset of
+ * VFIO_REGION_SUBTYPE_MIGRATION region to get/set VFIO device related migration
+ * information.
+ *
+ * Action Set state:
+ * To tell vendor driver the state VFIO device should be transitioned to.
+ * device_state [input] : User space app sends device state to vendor
+ * driver on state change, the state to which VFIO device should be
+ * transitioned to.
+ *
+ * Action Get pending bytes:
+ * To get pending bytes yet to be migrated from vendor driver
+ * pending.threshold_size [Input] : threshold of buffer in User space app.
+ * pending.precopy_only [output] : pending data which must be migrated in
+ * precopy phase or in stopped state, in other words - before target
+ * user space application or VM start. In case of migration, this
+ * indicates pending bytes to be transfered while application or VM or
+ * vCPUs are active and running.
+ * pending.compatible [output] : pending data which may be migrated any
+ * time , either when application or VM is active and vCPUs are active
+ * or when application or VM is halted and vCPUs are halted.
+ * pending.postcopy_only [output] : pending data which must be migrated in
+ * postcopy phase or in stopped state, in other words - after source
+ * application or VM stopped and vCPUs are halted.
+ * Sum of pending.precopy_only, pending.compatible and
+ * pending.postcopy_only is the whole amount of pending data.
+ *
+ * Action Get buffer:
+ * On this action, vendor driver should write data to migration region and
+ * return number of bytes written in the region.
+ * data.offset [output] : offset in the region from where data is written.
+ * data.size [output] : number of bytes written in migration buffer by
+ * vendor driver.
+ *
+ * Action Set buffer:
+ * In migration resume path, user space app writes to migration region and
+ * communicates it to vendor driver with this action.
+ * data.offset [Input] : offset in the region from where data is written.
+ * data.size [Input] : number of bytes written in migration buffer by
+ * user space app.
+ *
+ * Action Get dirty pages bitmap:
+ * Get bitmap of dirty pages from vendor driver from given start address.
+ * dirty_pfns.start_addr [Input] : start address
+ * dirty_pfns.total [Input] : Total pfn count from start_addr for which
+ * dirty bitmap is requested
+ * dirty_pfns.copied [Output] : pfn count for which dirty bitmap is copied
+ * to migration region.
+ * Vendor driver should copy the bitmap with bits set only for pages to be
+ * marked dirty in migration region.
+ */
+
Hi Kirti,
I am very interested in your work; thank you for it.
I have only just begun to look at it.
+struct vfio_device_migration_info {
+ __u32 device_state; /* VFIO device state */
Maybe it is a little early to worry about this, but wouldn't the __u32
here cause a problem on some architectures, even with packed (or because
of packed)?
Wouldn't it be better to use a __u64 for the state and keep everything
naturally aligned?
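To make the concern concrete, here is a minimal sketch of the layouts I have in mind. It is not the patch itself: the struct names are hypothetical, uint32_t/uint64_t stand in for __u32/__u64, and only the first two members are reproduced. It assumes GCC or Clang for the packed attribute.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the 'pending' member of the patch (all 64-bit fields). */
struct pending_info {
    uint64_t precopy_only;
    uint64_t compatible;
    uint64_t postcopy_only;
    uint64_t threshold_size;
};

/* As in the patch: 32-bit state followed by 64-bit fields, packed. */
struct info_packed_u32 {
    uint32_t device_state;
    struct pending_info pending;
} __attribute__((packed));

/* Same members without packed: padding after the state depends on the ABI. */
struct info_natural_u32 {
    uint32_t device_state;
    struct pending_info pending;
};

/* Hypothetical alternative: 64-bit state, no packed attribute needed. */
struct info_u64 {
    uint64_t device_state;
    struct pending_info pending;
};

/* With packed, the 64-bit fields start at offset 4, which is misaligned. */
_Static_assert(offsetof(struct info_packed_u32, pending) == 4,
               "packed layout puts 'pending' right after the 32-bit state");

/* With a 64-bit state, 'pending' starts at offset 8 without any padding. */
_Static_assert(offsetof(struct info_u64, pending) == 8,
               "64-bit state keeps 'pending' at offset 8");

/* Print the offsets so the three layouts can be compared on a given target. */
void print_layouts(void)
{
    printf("packed  __u32 state: pending at %zu, size %zu\n",
           offsetof(struct info_packed_u32, pending),
           sizeof(struct info_packed_u32));
    printf("natural __u32 state: pending at %zu, size %zu\n",
           offsetof(struct info_natural_u32, pending),
           sizeof(struct info_natural_u32));
    printf("natural __u64 state: pending at %zu, size %zu\n",
           offsetof(struct info_u64, pending),
           sizeof(struct info_u64));
}
```

Calling print_layouts() from a small main() shows the offsets for the target at hand. The unpacked __u32 variant is the one whose result changes with the ABI (64-bit members are only 4-byte aligned inside structs on 32-bit x86, for example), while the __u64 variant gives the same offsets everywhere without packed.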
Regards,
Pierre
+ struct {
+ __u64 precopy_only;
+ __u64 compatible;
+ __u64 postcopy_only;
+ __u64 threshold_size;
+ } pending;
+ struct {
+ __u64 offset; /* offset */
+ __u64 size; /* size */
+ } data;
+ struct {
+ __u64 start_addr;
+ __u64 total;
+ __u64 copied;
+ } dirty_pfns;
+} __attribute__((packed));
+
/* -------- API for Type1 VFIO IOMMU -------- */
/**
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany