On 1/18/23 18:44, Rongwei Liu wrote:
When a DPDK application must be upgraded,
the traffic downtime should be shortened as much as possible.
During the migration time, the old application may stay alive
while the new application is starting and being configured.
In order to optimize the switch to the new application,
the old application may need to be aware of the presence
of the new application being prepared.
This is achieved with a new API allowing the user to change the
new application state to standby and active later.
The added function is trying to apply the new state to all probed
ethdev ports. To make this API simple and easy to use,
the same flags have to be accepted by all devices.
This is the scenario of operations in the old and new applications:
. device: already configured by the old application
. new: start as active
. new: probe the same device
. new: set as standby
. new: configure the device
. device: has configurations from old and new applications
. old: clear its device configuration
. device: has only 1 configuration from new application
. new: set as active
. device: downtime for connecting all to the new application
. old: shutdown
The active role means network handling configurations are programmed
to the HW immediately, and no behavior changed. This is the default state.
The standby role means configurations are queued in the HW.
If there is no application with active role,
any configuration is effective immediately.
Signed-off-by: Rongwei Liu <rongw...@nvidia.com>
---
doc/guides/rel_notes/release_23_03.rst | 7 ++++
lib/ethdev/ethdev_driver.h | 20 +++++++++
lib/ethdev/rte_ethdev.c | 42 +++++++++++++++++++
lib/ethdev/rte_ethdev.h | 56 ++++++++++++++++++++++++++
lib/ethdev/version.map | 3 ++
5 files changed, 128 insertions(+)
diff --git a/doc/guides/rel_notes/release_23_03.rst
b/doc/guides/rel_notes/release_23_03.rst
index b8c5b68d6c..5367123f24 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -55,6 +55,13 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added process state in ethdev to improve live migration.**
+
+ Hot upgrade of an application may be accelerated by configuring
+ the new application in standby state while the old one is still active.
+ Such double ethdev configuration of the same device is possible
+ with the added process state API.
+
Will we have the same API for other device classes? Any ideas
how to avoid duplication? (I'm afraid we don't have generic
place to put such functionality).
Removed Items
-------------
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 6a550cfc83..4a098410d5 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -219,6 +219,23 @@ typedef int (*eth_dev_reset_t)(struct rte_eth_dev *dev);
/** @internal Function used to detect an Ethernet device removal. */
typedef int (*eth_is_removed_t)(struct rte_eth_dev *dev);
+/**
+ * @internal
+ * Set the role of the process to active or standby during live migration.
+ *
+ * @param dev
+ * Port (ethdev) handle.
+ * @param standby
+ * Role active if false, standby if true.
+ * @param flags
+ * Role specific flags.
+ *
+ * @return
+ * Negative value on error, 0 on success.
+ */
+typedef int (*eth_dev_process_set_role_t)(struct rte_eth_dev *dev,
+ bool standby, uint32_t flags);
+
/**
* @internal
* Function used to enable the Rx promiscuous mode of an Ethernet device.
@@ -1186,6 +1203,9 @@ struct eth_dev_ops {
/** Check if the device was physically removed */
eth_is_removed_t is_removed;
+ /** Set role during live migration */
+ eth_dev_process_set_role_t process_set_role;
+
eth_promiscuous_enable_t promiscuous_enable; /**< Promiscuous ON */
eth_promiscuous_disable_t promiscuous_disable;/**< Promiscuous OFF */
eth_allmulticast_enable_t allmulticast_enable;/**< Rx multicast ON */
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 5d5e18db1e..3a1fb64053 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -558,6 +558,48 @@ rte_eth_dev_owner_get(const uint16_t port_id, struct
rte_eth_dev_owner *owner)
return 0;
}
+int
+rte_eth_process_set_role(bool standby, uint32_t flags)
+{
+ struct rte_eth_dev_info dev_info = {0};
+ struct rte_eth_dev *dev;
+ uint16_t port_id;
+ int ret = 0;
+
+ /* Check if all devices support process role. */
+ RTE_ETH_FOREACH_DEV(port_id) {
+ dev = &rte_eth_devices[port_id];
+ if (*dev->dev_ops->process_set_role != NULL &&
+ *dev->dev_ops->dev_infos_get != NULL &&
+ (*dev->dev_ops->dev_infos_get)(dev, &dev_info) == 0 &&
+ (dev_info.dev_capa & RTE_ETH_DEV_CAPA_PROCESS_ROLE) !=
0)
Why do we need both non-NULL callback and dev_capa flag?
Isn't just non-NULL callback sufficient?
+ continue;
+ rte_errno = ENOTSUP;
+ return -rte_errno;
+ }
+ /* Call the driver callbacks. */
+ RTE_ETH_FOREACH_DEV(port_id) {
+ dev = &rte_eth_devices[port_id];
+ if ((*dev->dev_ops->process_set_role)(dev, standby, flags) < 0)
Please, add error logging on failure.
+ goto failure;
IMHO it would be useful to have info logs on success.
+ ret++;
ret variable name is very confusing, please, be
more specific in variable naming.
+ }
+ return ret;
+
+failure:
+ /* Rollback all changed devices in case one failed. */
+ if (ret) {
Compare vs 0
+ RTE_ETH_FOREACH_DEV(port_id) {
+ dev = &rte_eth_devices[port_id];
+ (*dev->dev_ops->process_set_role)(dev, !standby, flags);
IMHO, it would be useful to have error logs on rollback
failure and info logs on success.
+ if (--ret == 0)
+ break;
+ }
+ }
+ rte_errno = EPERM;
Why is it always EPERM? Isn't it better to forward
failure code from driver?
+ return -rte_errno;
+}
+
int
rte_eth_dev_socket_id(uint16_t port_id)
{
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index c129ca1eaf..1505396ced 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1606,6 +1606,8 @@ struct rte_eth_conf {
#define RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP RTE_BIT64(3)
/** Device supports keeping shared flow objects across restart. */
#define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4)
+/** Device supports process role changing. @see rte_eth_process_set_active */
There is no rte_eth_process_set_active()
+#define RTE_ETH_DEV_CAPA_PROCESS_ROLE RTE_BIT64(5)
/**@}*/
/*
@@ -2204,6 +2206,60 @@ int rte_eth_dev_owner_delete(const uint64_t owner_id);
int rte_eth_dev_owner_get(const uint16_t port_id,
struct rte_eth_dev_owner *owner);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Set the role of the process to active or standby,
+ * affecting network traffic handling.
+ *
+ * If one device does not support this operation or fails,
+ * the whole operation is failed and rolled back.
+ *
+ * It is forbidden to have multiple processes with the same role
+ * unless only one of them is configured to handle the traffic.
Why is it not allowed to have many processes in standby?
+ *
+ * The application is active by default.
+ * The configuration from the active process is effective immediately
+ * while the configuration from the standby process is queued by hardware.
I'm afraid error handling could be tricky in this case.
I think it makes sense to highlight that configuration
propagation failures will be returned on attempt to set
the process active.
+ * When configuring the device from a standby process,
+ * it has no effect except for below situations:
+ * - traffic not handled by the active process configuration
+ * - no active process
+ *
+ * When a process is changed from a standby to an active role,
+ * all preceding configurations that are queued by hardware
+ * should become effective immediately.
+ * Before role transition, all the traffic handling configurations
+ * set by the active process should be flushed first.
+ *
+ * In summary, the operations are expected to happen in this order
+ * in "old" and "new" applications:
+ * device: already configured by the old application
+ * new: start as active
+ * new: probe the same device
+ * new: set as standby
+ * new: configure the device
+ * device: has configurations from old and new applications
+ * old: clear its device configuration
+ * device: has only 1 configuration from new application
+ * new: set as active
+ * device: downtime for connecting all to the new application
+ * old: shutdown
+ *
+ * @param standby
+ * Role active if false, standby if true.
Typically API with boolean parameters is bad. May be in this
particular case it is better to have two functions:
rte_eth_process_set_active() and rte_eth_process_set_standby().
+ * @param flags
+ * Role specific flags.
Please, define namespace for flags.
+ * @return
+ * Positive value on success, -rte_errno value on error:
+ * - (> 0) Number of switched devices.
Isn't is success if there is no devices to switch?
+ * - (-ENOTSUP) if not supported by a device.
+ * - (-EPERM) if operation failed with a device.
+ */
+__rte_experimental
+int rte_eth_process_set_role(bool standby, uint32_t flags);
+
/**
* Get the number of ports which are usable for the application.
*
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 17201fbe0f..d5d3ea5421 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -298,6 +298,9 @@ EXPERIMENTAL {
rte_flow_get_q_aged_flows;
rte_mtr_meter_policy_get;
rte_mtr_meter_profile_get;
+
+ # added in 23.03
+ rte_eth_process_set_role;
};
INTERNAL {