On 2/12/22 05:19, Alexander Kozyrev wrote:
On Fri, Feb 11, 2022 7:42 Andrew Rybchenko <andrew.rybche...@oktetlabs.ru>:
On 2/11/22 05:26, Alexander Kozyrev wrote:
A new, faster, queue-based flow rules management mechanism is needed
for
applications offloading rules inside the datapath. This asynchronous
and lockless mechanism frees the CPU for further packet processing and
reduces the performance impact of the flow rules creation/destruction
on the datapath. Note that queues are not thread-safe and the queue
should be accessed from the same thread for all queue operations.
It is the responsibility of the app to sync the queue functions in case
of multi-threaded access to the same queue.

The rte_flow_q_flow_create() function enqueues a flow creation to the
requested queue. It benefits from already configured resources and sets
unique values on top of item and action templates. A flow rule is enqueued
on the specified flow queue and offloaded asynchronously to the
hardware.
The function returns immediately to spare CPU for further packet
processing. The application must invoke the rte_flow_q_pull() function
to complete the flow rule operation offloading, to clear the queue, and to
receive the operation status. The rte_flow_q_flow_destroy() function
enqueues a flow destruction to the requested queue.

Signed-off-by: Alexander Kozyrev <akozy...@nvidia.com>
Acked-by: Ori Kam <or...@nvidia.com>

[snip]

+
+- Available operation types: rule creation, rule destruction,
+  indirect rule creation, indirect rule destruction, indirect rule update.
+
+- Operations may be reordered within a queue.

Do we want to have barriers?
E.g. create rule, destroy the same rule -> reoder -> destroy fails, rule
lives forever.

API design is crafter with the throughput as the main goal in mind.
We allow user to enforce any ordering outside these functions.
Another point that not all PMDs/NIC will have this out-of-order execution.

Throughput is nice, but there more important requirements
which must be satistied before talking about performance.
Could you explain me what I should do based on which
information from NIC in order to solve above problem?

+
+- Operations can be postponed and pushed to NIC in batches.
+
+- Results pulling must be done on time to avoid queue overflows.

polling? (as libc poll() which checks status of file descriptors)
it is not pulling the door to open it :)

poll waits for some event on a file descriptor as it title says.
And then user has to invoke read() to actually get any info from the fd.
The point of our function is to return the result immediately, thus pulling.
We had many names appearing in the thread for these functions.
As we know, naming variables is the second hardest thing in programming.
I wanted this pull for results pulling be a counterpart for the push for
pushing the operations to a NIC. Another idea is pop/push pair, but they are
more like for operations only, not for results.
Having said that I'm at the point of accepting any name here.

I agree that it is hard to choose good naming.
Just want to say that polling is not alway waiting.

poll - check the status of (a device), especially as part of a repeated cycle.

Here we're checking status of flow engine requests and yes,
finally in a repeated cycle.

[snip]

+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Queue operation attributes.
+ */
+struct rte_flow_q_ops_attr {
+       /**
+        * The user data that will be returned on the completion events.
+        */
+       void *user_data;

IMHO it must not be hiddne in attrs. It is a key information
which is used to understand the opration result. It should
be passed separately.

Maybe, on the other hand it is optional and may not be needed by an application.

I don't understand how it is possible. Without it application
don't know fate of its requests.

+        /**
+         * When set, the requested action will not be sent to the HW
immediately.
+         * The application must call the rte_flow_queue_push to actually
send it.

Will the next operation without the attribute set implicitly push it?
Is it mandatory for the driver to respect it? Or is it just a possible
optimization hint?

Yes, it will be pushed with all the operations in a queue once the postpone is 
cleared.
It is not mandatory to respect this bit, PMD can use other optimization 
technics.

Could you clarify it in the description.


+         */
+       uint32_t postpone:1;
+};
[snip]

Reply via email to