> -----Original Message-----
> From: Stephen Hemminger <step...@networkplumber.org>
> Sent: Thursday, December 28, 2023 18:17
> > However, at the moment I see one problem with this approach.
> > It would require DPDK to expose the rte_eth_dev struct definition,
> > because of implied locking implemented in the flow API.
> 
> This is a blocker, showstopper for me.
+1

> Have you considered having something like
>    rte_flow_create_bulk()
> 
> or better yet a Linux iouring style API?
> 
> A ring style API would allow for better mixed operations across the board and
> get rid of the I-cache overhead which is the root cause of the need for inlining.
The existing async flow API is somewhat close to the io_uring interface,
the difference being that the queue is not directly exposed to the application.
The application interacts with the queue through the rte_flow_async_* APIs
(e.g., it places operations in the queue and pushes them to the HW);
a usage sketch follows the list below.
Such a design has some benefits over a flow API which exposes the queue to the
user:
- Easier to use - applications do not manage the queue directly; they do it
through the exposed APIs.
- Consistent with other DPDK APIs - in other libraries, queues are manipulated
through an API, not directly by the application.
- Lower memory usage - only HW primitives are needed (e.g., the HW queue on the
PMD side); there is no need to allocate separate application queues.
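
For reference, a minimal sketch of this usage pattern (assuming a port and a
template table already configured elsewhere; queue id 0, the helper name
create_one_flow and the pattern/actions arrays are illustrative placeholders,
not part of any proposal):

    #include <rte_flow.h>

    /* Enqueue one flow rule creation, push it to the HW and poll for its
     * completion on flow queue 0. Setup and error handling are omitted. */
    static void
    create_one_flow(uint16_t port_id,
                    struct rte_flow_template_table *table,
                    const struct rte_flow_item pattern[],
                    const struct rte_flow_action actions[])
    {
        const struct rte_flow_op_attr op_attr = { .postpone = 1 };
        struct rte_flow_op_result result;
        struct rte_flow_error error;
        struct rte_flow *flow;

        /* Place the operation in the flow queue. */
        flow = rte_flow_async_create(port_id, 0, &op_attr, table,
                                     pattern, 0, actions, 0,
                                     NULL, &error);
        /* Push all postponed operations on this queue to the HW. */
        rte_flow_push(port_id, 0, &error);
        /* Poll the queue until the operation completes. */
        while (rte_flow_pull(port_id, 0, &result, 1, &error) == 0)
            ;
        (void)flow;
    }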

Bulking of flow operations is a tricky subject.
Compared to packet processing, where it is desirable to keep the manipulation
of raw packet data to a minimum (e.g., only packet headers are accessed),
during flow rule creation all items and actions must be processed by the PMD
to create a flow rule.
The amount of memory consumed by the items and actions themselves during this
process might be non-negligible.
If flow rule operations were bulked, the size of the working set of memory
would increase, which could have negative consequences on cache behavior.
So, it might be the case that bulking removes the I-cache overhead but adds
D-cache overhead.
On the other hand, creating (or enqueuing) flow rule operations one by one
enables applications to reuse the same memory for different flow rules.
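
For example, a simplified sketch of such reuse (op_attr, table, actions,
port_id, error, nb_rules and base_ip are assumed to exist; only the IPv4
destination address differs between rules):

    /* The same pattern buffer is reused for every rule; only the field
     * that differs between rules is overwritten before each enqueue. */
    struct rte_flow_item_ipv4 ipv4_spec = { 0 };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ipv4_spec },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    uint32_t i;

    for (i = 0; i < nb_rules; i++) {
        ipv4_spec.hdr.dst_addr = rte_cpu_to_be_32(base_ip + i);
        rte_flow_async_create(port_id, 0, &op_attr, table,
                              pattern, 0, actions, 0, NULL, &error);
    }
    rte_flow_push(port_id, 0, &error);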

In summary, in my opinion, extending the async flow API with bulking
capabilities or exposing the queue directly to the application is not
desirable.
This proposal aims to reduce the I-cache overhead in the async flow API by
reusing an existing design pattern in DPDK - fast path functions are inlined
into the application code and call cached PMD callbacks.
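
To illustrate that pattern only (the structure and function names below are
hypothetical, not the actual proposal): a flat per-port array caches the PMD
fast path callbacks, and the public entry point is a static inline function,
so the fast path neither dereferences struct rte_eth_dev nor goes through a
generic-layer call:

    #include <stdint.h>

    struct flow_fp_ops {
        void *ctx;                     /* PMD private queue/port context. */
        int (*async_create)(void *ctx, uint32_t queue_id, const void *op);
        /* ... other fast path callbacks ... */
    };

    /* Filled in by the PMD when the port is configured. */
    extern struct flow_fp_ops flow_fp_ops[/* RTE_MAX_ETHPORTS */ 32];

    /* Inlined into the application code; calls the cached PMD callback. */
    static inline int
    flow_async_create(uint16_t port_id, uint32_t queue_id, const void *op)
    {
        struct flow_fp_ops *ops = &flow_fp_ops[port_id];

        return ops->async_create(ops->ctx, queue_id, op);
    }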

Best regards,
Dariusz Sosnowski
