On 2/20/22 06:44, Alexander Kozyrev wrote:
A new, faster, queue-based flow rules management mechanism is needed for
applications offloading rules inside the datapath. This asynchronous
and lockless mechanism frees the CPU for further packet processing and
reduces the performance impact of flow rule creation/destruction
on the datapath. Note that queues are not thread-safe and a queue
should be accessed from the same thread for all queue operations.
It is the responsibility of the application to synchronize queue access
in case of multi-threaded use of the same queue.
The rte_flow_async_create() function enqueues a flow creation to the
requested queue. It benefits from already configured resources and sets
unique values on top of item and action templates. A flow rule is enqueued
on the specified flow queue and offloaded asynchronously to the hardware.
The function returns immediately to spare the CPU for further packet
processing. The application must invoke the rte_flow_pull() function
to complete the flow rule operation offloading, to clear the queue, and to
receive the operation status. The rte_flow_async_destroy() function
enqueues a flow destruction to the requested queue.
Signed-off-by: Alexander Kozyrev <akozy...@nvidia.com>
Acked-by: Ori Kam <or...@nvidia.com>
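
For readers following the thread, here is a rough sketch of the datapath
usage described above. It relies only on the rte_flow_async_create()
prototype introduced by this patch; the queue/table setup is assumed to
have been done at configuration time, and the pull step is kept as a
comment because its exact prototype is not quoted in this excerpt.

  #include <rte_flow.h>

  /*
   * Enqueue one rule creation on the given flow queue. The call returns
   * a flow handle immediately, before the HW has processed the operation.
   */
  static struct rte_flow *
  app_enqueue_rule(uint16_t port_id, uint32_t queue_id,
                   struct rte_flow_template_table *table,
                   const struct rte_flow_item pattern[],
                   const struct rte_flow_action actions[],
                   void *user_data)
  {
          struct rte_flow_error error;

          /* Template index 0 is assumed to match the given pattern/actions;
           * NULL is assumed to request default queue operation attributes. */
          return rte_flow_async_create(port_id, queue_id, NULL, table,
                                       pattern, 0, actions, 0,
                                       user_data, &error);
  }

  /*
   * The same thread must later call rte_flow_pull() on this queue to drain
   * completions: only then does the application learn whether the rule was
   * actually accepted by the HW.
   */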
[snip]
@@ -3777,6 +3782,125 @@ and pattern and actions templates are created.
&actions_templates, nb_actions_templ,
&error);
+Asynchronous operations
+-----------------------
+
+Flow rules management can be done via special lockless flow management queues.
+
+- Queue operations are asynchronous and not thread-safe.
+
+- Operations can thus be invoked by the application's datapath;
+  packet processing can continue while queue operations are processed by the NIC.
+
+- The number of flow queues is configured at the initialization stage.
+
+- Available operation types: rule creation, rule destruction,
+ indirect rule creation, indirect rule destruction, indirect rule update.
+
+- Operations may be reordered within a queue.
+
+- Operations can be postponed and pushed to the NIC in batches.
+
+- Results must be pulled in a timely manner to avoid queue overflow.
I guess the documentation is for applications, but IMHO it is a
driver responsibility. The application should not care about it.
Yes, the application should do the pulling, but it should not have to
think about overflow. A request should be rejected if there is no space
in the queue.
+
+- User data is returned as part of the result to identify an operation.
Also "User data should uniquelly identify request (may be except corner
case when only one request is enqueued at most)."
+
+- A flow handle is valid once the creation operation is enqueued and must be
+  destroyed even if the operation is not successful and the rule is not
+  inserted.
+
+- The application must wait for the creation operation result before enqueueing
+  the deletion operation to make sure the creation is processed by the NIC.
+
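
To illustrate the user data and handle lifetime points above, a rough
completion-handling sketch follows. The result layout, the rte_flow_pull()
prototype and the rte_flow_async_destroy() argument order are assumptions
made for illustration only; the real definitions live elsewhere in this
series.

  /* Application context attached to each operation via user_data. */
  struct app_rule {
          struct rte_flow *flow; /* handle returned by rte_flow_async_create() */
          uint32_t id;           /* application-level rule identifier */
  };

  /* Assumed result layout (placeholder, not the patch's actual type). */
  struct app_op_result {
          int status;            /* 0 when the HW accepted the operation */
          void *user_data;       /* value given to the enqueue call */
  };

  static void
  app_handle_completions(uint16_t port_id, uint32_t queue_id)
  {
          struct app_op_result res[32];
          struct rte_flow_error error;
          int n, i;

          /* Assumed prototype: pull up to 32 completed operations. */
          n = rte_flow_pull(port_id, queue_id, res, 32, &error);
          for (i = 0; i < n; i++) {
                  struct app_rule *rule = res[i].user_data;

                  if (res[i].status != 0) {
                          /*
                           * Creation failed: the handle is still valid and
                           * must be destroyed; the destroy itself is another
                           * enqueued operation (argument order assumed).
                           */
                          rte_flow_async_destroy(port_id, queue_id, NULL,
                                                 rule->flow, rule, &error);
                  }
          }
  }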
[snip]
+The asynchronous flow rule insertion logic can be broken into two phases.
+
+1. Initialization stage as shown here:
+
+.. _figure_rte_flow_async_init:
+
+.. figure:: img/rte_flow_async_init.*
+
+2. Main loop as presented in a datapath application example:
+
+.. _figure_rte_flow_async_usage:
+
+.. figure:: img/rte_flow_async_usage.*
+
+Enqueue creation operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Enqueueing a flow rule creation operation is similar to simple creation.
+
+.. code-block:: c
+
+   struct rte_flow *
+   rte_flow_async_create(uint16_t port_id,
+                         uint32_t queue_id,
+                         const struct rte_flow_q_ops_attr *q_ops_attr,
Maybe rte_flow_async_ops_attr *attr?
+                         struct rte_flow_template_table *template_table,
+                         const struct rte_flow_item pattern[],
+                         uint8_t pattern_template_index,
+                         const struct rte_flow_action actions[],
+                         uint8_t actions_template_index,
+                         void *user_data,
+                         struct rte_flow_error *error);
+
+A valid handle is returned in case of success. It must be destroyed later
+by calling ``rte_flow_async_destroy()`` even if the rule is rejected by HW.
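
For illustration, a call based on the prototype above could look like the
snippet below. The table, queue id, operation attributes and user data are
assumed to exist already, and only the per-rule spec values are filled in
on top of the pattern/actions templates (index 0 of each in this example).

   struct rte_flow_item_ipv4 ip_spec = {
           .hdr.dst_addr = rte_cpu_to_be_32(RTE_IPV4(192, 168, 0, 1)),
   };
   struct rte_flow_item pattern[] = {
           { .type = RTE_FLOW_ITEM_TYPE_ETH },
           { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ip_spec },
           { .type = RTE_FLOW_ITEM_TYPE_END },
   };
   struct rte_flow_action_queue queue = { .index = 3 };
   struct rte_flow_action actions[] = {
           { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
           { .type = RTE_FLOW_ACTION_TYPE_END },
   };
   struct rte_flow_error error;
   struct rte_flow *flow;

   flow = rte_flow_async_create(port_id, queue_id, &q_ops_attr, table,
                                pattern, 0, actions, 0, user_data, &error);
   if (flow == NULL) {
           /* Enqueue failure, e.g. the queue is full: handle or retry. */
   }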
[snip]
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index e9f684eedb..4e7b202522 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -1396,6 +1396,7 @@ rte_flow_flex_item_release(uint16_t port_id,
 int
 rte_flow_info_get(uint16_t port_id,
                   struct rte_flow_port_info *port_info,
+                  struct rte_flow_queue_info *queue_info,
It should be either optional (update description) or sanity
checked vs NULL below (similar to port_info).
                   struct rte_flow_error *error)
 {
         struct rte_eth_dev *dev = &rte_eth_devices[port_id];
@@ -1415,7 +1416,7 @@ rte_flow_info_get(uint16_t port_id,
                 return -rte_errno;
         if (likely(!!ops->info_get)) {
                 return flow_err(port_id,
-                                ops->info_get(dev, port_info, error),
+                                ops->info_get(dev, port_info, queue_info,
+                                              error),
                                 error);
         }
         return rte_flow_error_set(error, ENOTSUP,
@@ -1426,6 +1427,8 @@ rte_flow_info_get(uint16_t port_id,
 int
 rte_flow_configure(uint16_t port_id,
                    const struct rte_flow_port_attr *port_attr,
+                   uint16_t nb_queue,
+                   const struct rte_flow_queue_attr *queue_attr[],
Is it really an array of pointers? If yes, why?
                    struct rte_flow_error *error)
 {
         struct rte_eth_dev *dev = &rte_eth_devices[port_id];
@@ -1433,7 +1436,7 @@ rte_flow_configure(uint16_t port_id,
         int ret;
         dev->data->flow_configured = 0;
-        if (port_attr == NULL) {
+        if (port_attr == NULL || queue_attr == NULL) {
                 RTE_FLOW_LOG(ERR, "Port %"PRIu16" info is NULL.\n", port_id);
Log message becomes misleading
[snip]
return -EINVAL;
}
@@ -1452,7 +1455,7 @@ rte_flow_configure(uint16_t port_id,
         if (unlikely(!ops))
                 return -rte_errno;
         if (likely(!!ops->configure)) {
-                ret = ops->configure(dev, port_attr, error);
+                ret = ops->configure(dev, port_attr, nb_queue, queue_attr,
+                                     error);
                 if (ret == 0)
                         dev->data->flow_configured = 1;
                 return flow_err(port_id, ret, error);
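
For completeness, the application-side call sequence with the new
parameters would then look roughly like this; the contents of
rte_flow_port_attr and rte_flow_queue_attr are not quoted here, so the
queue size field and the zeroed port attributes are assumptions.

  struct rte_flow_port_info port_info;
  struct rte_flow_queue_info queue_info;
  struct rte_flow_port_attr port_attr = { 0 };             /* defaults assumed */
  struct rte_flow_queue_attr queue_attr = { .size = 64 };  /* field assumed */
  const struct rte_flow_queue_attr *queue_attr_list[] = { &queue_attr };
  struct rte_flow_error error;

  if (rte_flow_info_get(port_id, &port_info, &queue_info, &error) != 0)
          return -1;
  /* One flow queue; queue_attr_list follows the array-of-pointers prototype. */
  if (rte_flow_configure(port_id, &port_attr, 1, queue_attr_list, &error) != 0)
          return -1;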
@@ -1713,3 +1716,104 @@ rte_flow_template_table_destroy(uint16_t port_id,
                                   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
                                   NULL, rte_strerror(ENOTSUP));
 }
+
+struct rte_flow *
+rte_flow_async_create(uint16_t port_id,
+                      uint32_t queue_id,
+                      const struct rte_flow_q_ops_attr *q_ops_attr,
+                      struct rte_flow_template_table *template_table,
+                      const struct rte_flow_item pattern[],
+                      uint8_t pattern_template_index,
+                      const struct rte_flow_action actions[],
+                      uint8_t actions_template_index,
+                      void *user_data,
+                      struct rte_flow_error *error)
+{
+        struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+        const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+        struct rte_flow *flow;
+
+        if (unlikely(!ops))
+                return NULL;
+        if (likely(!!ops->async_create)) {
Hm, we should make a consistent decision. If it is a super-critical
fast path, we should have no sanity checks at all. If not, we should
have all simple sanity checks. Otherwise, I don't understand why we
do some checks and ignore others.
[snip]
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 776e8ccc11..9e71a576f6 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -4884,6 +4884,10 @@ rte_flow_flex_item_release(uint16_t port_id,
  *
  */
 struct rte_flow_port_info {
+        /**
+         * Maximum umber of queues for asynchronous operations.
umber -> number
[snip]