On 2020-04-09 15:32, Jerin Jacob wrote:
> On Thu, Apr 9, 2020 at 5:51 PM Mattias Rönnblom
> <mattias.ronnb...@ericsson.com> wrote:
>> On 2020-04-08 21:36, Jerin Jacob wrote:
>>> On Wed, Apr 8, 2020 at 11:27 PM Mattias Rönnblom
>>> <mattias.ronnb...@ericsson.com> wrote:
>>>> Extend Eventdev API to allow for event devices which require various
>>>> forms of internal processing to happen, even when events are not
>>>> enqueued to or dequeued from a port.
>>>>
>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
>>>> ---
>>>>    lib/librte_eventdev/rte_eventdev.h     | 65 ++++++++++++++++++++++++++
>>>>    lib/librte_eventdev/rte_eventdev_pmd.h | 14 ++++++
>>>>    2 files changed, 79 insertions(+)
>>>>
>>>> diff --git a/lib/librte_eventdev/rte_eventdev.h 
>>>> b/lib/librte_eventdev/rte_eventdev.h
>>>> index 226f352ad..d69150792 100644
>>>> --- a/lib/librte_eventdev/rte_eventdev.h
>>>> +++ b/lib/librte_eventdev/rte_eventdev.h
>>>> @@ -289,6 +289,15 @@ struct rte_event;
>>>>     * single queue to each port or map a single queue to many port.
>>>>     */
>>>>
>>>> +#define RTE_EVENT_DEV_CAP_REQUIRES_MAINT (1ULL << 9)
>>>> +/**< Event device requires calls to rte_event_maintain() during
>>> This scheme would call for DSW specific API handling in fastpath.
>>
>> Initially this would be so, but buffering events might yield performance
>> benefits for more event devices than DSW.
>>
>>
>> In an application, it's often convenient, but sub-optimal from a
>> performance point of view, to do single-event enqueue operations. The
>> alternative is to use an application-level buffer, and then flush this
>> buffer with rte_event_enqueue_burst(). If you allow the event device to
>> buffer, you get the simplicity of single-event enqueue operations, but
>> without taking any noticeable performance hit.
> IMO, it is better to aggregate the burst in the application, as sending
> event by event to the driver to aggregate has a performance cost due to
> function pointer overhead.


That's a very slight overhead - but for optimal performance, sure. It'll 
come at a cost in terms of code complexity. Just look at the adapters. 
They do this already. I think some applications are ready to take the 
extra 5-10 clock cycles or so it'll cost them to do the function call 
(provided the event device had buffering support).
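
To make the trade-off concrete, here is a minimal sketch of the
application-level buffering pattern discussed above. It is not taken from
the adapters. struct app_event, struct app_buffer, APP_BUF_SIZE and
mock_enqueue_burst() are hypothetical stand-ins so the snippet compiles
without DPDK; in a real application the flush would call
rte_event_enqueue_burst(dev_id, port_id, ev, nb_events) instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define APP_BUF_SIZE 32 /* hypothetical burst size, chosen for illustration */

/* Hypothetical stand-in for struct rte_event. */
struct app_event {
	uint64_t payload;
};

/* Hypothetical stand-in for rte_event_enqueue_burst(); it only counts. */
static unsigned int burst_calls;
static unsigned int events_sent;

static uint16_t
mock_enqueue_burst(const struct app_event *ev, uint16_t n)
{
	(void)ev;
	burst_calls++;
	events_sent += n;
	return n; /* pretend the device accepted everything */
}

/* Per-port buffer owned by the application (hypothetical). */
struct app_buffer {
	struct app_event events[APP_BUF_SIZE];
	uint16_t count;
};

/* Push all buffered events to the device, retrying on partial enqueue. */
static void
app_buffer_flush(struct app_buffer *buf)
{
	uint16_t sent = 0;

	while (sent < buf->count)
		sent += mock_enqueue_burst(&buf->events[sent],
					   (uint16_t)(buf->count - sent));
	buf->count = 0;
}

/* Single-event enqueue from the caller's view; bursts toward the device. */
static void
app_buffer_enqueue(struct app_buffer *buf, struct app_event ev)
{
	assert(buf->count < APP_BUF_SIZE);
	buf->events[buf->count++] = ev;
	if (buf->count == APP_BUF_SIZE)
		app_buffer_flush(buf);
}
```

Note that the application still has to remember to call
app_buffer_flush() when it goes idle, which is exactly the extra
bookkeeping that device-side buffering would hide.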


> Another concern is the frequency with which the application calls
> rte_event_maintain(), as the timing requirements will vary from driver
> to driver and from application to application.
> IMO, it is not portable and I believe the application should not be
> aware of those details. If the driver needs a specific maintenance
> function for any other reason, then it is better to use the DPDK
> SERVICE core infra.


The only thing the application needs to be aware of is that it needs to 
call rte_event_maintain() as often as it would have called dequeue() in 
your "typical worker" example. Making sure this call is cheap enough is 
up to the driver, and this needs to hold true for all event devices that 
need maintenance.


If you plan to use a non-buffering hardware device driver or a soft, 
centralized scheduler that doesn't need this, it will also not set the 
flag, and thus the application need not care about the 
rte_event_maintain() function. DPDK code such as the eventdev adapters 
does need to care, but the increase in complexity is slight, and the cost 
of calling rte_event_maintain() on a maintenance-free device is very 
low (since the then-NULL function pointer is in the eventdev struct, 
likely on a cache line already dragged in).
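
A rough sketch of what I mean, under stated assumptions: the flag value
is the one from the quoted patch, but struct mock_eventdev,
mock_event_maintain(), mock_dsw_maintain() and app_needs_maintain() are
hypothetical stand-ins, since the real rte_event_maintain() prototype and
eventdev struct layout are not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Capability bit as proposed in the quoted patch. */
#define RTE_EVENT_DEV_CAP_REQUIRES_MAINT (1ULL << 9)

/* Hypothetical per-driver maintenance callback type. */
typedef void (*mock_maintain_t)(void *port);

/* Hypothetical, heavily reduced stand-in for the eventdev struct. */
struct mock_eventdev {
	uint64_t capabilities;
	mock_maintain_t maintain; /* NULL on maintenance-free devices */
};

/* Stand-in for a buffering driver's maintenance work; it only counts. */
static unsigned int maint_calls;

static void
mock_dsw_maintain(void *port)
{
	(void)port;
	maint_calls++;
}

/* Wrapper: one pointer load and a branch when no maintenance is needed. */
static inline void
mock_event_maintain(const struct mock_eventdev *dev, void *port)
{
	assert(dev != NULL);
	if (dev->maintain != NULL)
		dev->maintain(port);
}

/* An application can check the flag once, at setup time. */
static int
app_needs_maintain(const struct mock_eventdev *dev)
{
	return (dev->capabilities & RTE_EVENT_DEV_CAP_REQUIRES_MAINT) != 0;
}
```

Generic code (like the adapters) can call the wrapper unconditionally,
while an application that knows its device can skip it entirely based on
the capability check.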


Unfortunately, DPDK doesn't have a per-core delayed-work mechanism. 
Flushing event buffers (and other DSW "background work") can't be done 
on a service core, since they would work on non-MT-safe data structures 
on the worker thread's event ports.


>>
>>>> + * periods when neither rte_event_dequeue_burst() nor
>>> The typical worker thread will be
>>> while (1) {
>>>                   rte_event_dequeue_burst();
>>>                    ..process..
>>>                   rte_event_enqueue_burst();
>>> }
>>> If so, why can't the DSW driver do the maintenance in driver context
>>> in the dequeue() call?
>>>
>> DSW already does maintenance on dequeue, and works well in the above
>> scenario. The typical worker does not need to care about the
>> rte_event_maintain() function, since it dequeues events on a regular basis.
>>
>>
>> What this RFC addresses is the more atypical (but still fairly common)
>> case of a port being neither dequeued from nor enqueued to on a regular
>> basis. The timer and ethernet rx adapters are examples of such.
> If it is an adapter-specific use case problem, then maybe we have the
> option to fix the problem in adapter-specific API usage or in that area.
>

It's not adapter-specific, I think. There might be producer-only ports, 
for example, which don't provide a constant stream of events, but 
rather intermittent bursts. A traffic generator is one example of such 
an application, and there might be other, less synthetic ones as well.
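
To illustrate why such a port needs the maintenance call, here is a toy
model, assuming a device that buffers enqueued events and only flushes
when its buffer fills or when it is given a chance to do maintenance.
Everything here (struct mock_port, MOCK_PORT_BUF, mock_enqueue(),
mock_maintain(), producer_run()) is hypothetical and does not reflect the
actual DSW implementation.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PORT_BUF 16 /* hypothetical device-side buffer size */

/* Hypothetical model of a buffering event port. */
struct mock_port {
	uint16_t buffered;      /* events held in the port's buffer */
	unsigned int delivered; /* events actually handed to the scheduler */
};

static void
mock_port_flush(struct mock_port *p)
{
	p->delivered += p->buffered;
	p->buffered = 0;
}

/* Single-event enqueue; the device flushes only when the buffer fills. */
static void
mock_enqueue(struct mock_port *p)
{
	assert(p->buffered < MOCK_PORT_BUF);
	p->buffered++;
	if (p->buffered == MOCK_PORT_BUF)
		mock_port_flush(p);
}

/* Stand-in for the maintenance call: flush whatever is pending. */
static void
mock_maintain(struct mock_port *p)
{
	mock_port_flush(p);
}

/*
 * Producer-only loop: one burst of burst_len events, followed by
 * idle_iters iterations with nothing to send and no dequeue calls.
 */
static void
producer_run(struct mock_port *p, unsigned int burst_len,
	     unsigned int idle_iters, int call_maintain)
{
	unsigned int i;

	for (i = 0; i < burst_len; i++)
		mock_enqueue(p);

	for (i = 0; i < idle_iters; i++)
		if (call_maintain)
			mock_maintain(p);
}
```

In this model, a 20-event burst without maintenance leaves the last 4
events stuck in the port's buffer for the whole idle period, while the
same burst with maintenance in the idle iterations delivers all 20.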


>>
>>>> + * rte_event_enqueue_burst() are called on a port. This will allow the
>>>> + * event device to perform internal processing, such as flushing
>>>> + * buffered events, returning credits to a global pool, or processing
>>>> + * signaling related to load balancing.
>>>> + */
>>
