On 2020-04-14 18:15, Jerin Jacob wrote:
> On Tue, Apr 14, 2020 at 9:27 PM Mattias Rönnblom
> <mattias.ronnb...@ericsson.com> wrote:
>> On 2020-04-10 15:00, Jerin Jacob wrote:
>>> On Thu, Apr 9, 2020 at 7:32 PM Mattias Rönnblom
>>> <mattias.ronnb...@ericsson.com> wrote:
>>>> On 2020-04-09 15:32, Jerin Jacob wrote:
>>>>> On Thu, Apr 9, 2020 at 5:51 PM Mattias Rönnblom
>>>>> <mattias.ronnb...@ericsson.com> wrote:
>>>>>> On 2020-04-08 21:36, Jerin Jacob wrote:
>>>>>>> On Wed, Apr 8, 2020 at 11:27 PM Mattias Rönnblom
>>>>>>> <mattias.ronnb...@ericsson.com> wrote:
>>>>>>>> Extend the Eventdev API to allow for event devices which require
>>>>>>>> various forms of internal processing to happen, even when events
>>>>>>>> are not enqueued to or dequeued from a port.
>>>>>>>>
>>>>>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
>>>>>>>> ---
>>>>>>>>  lib/librte_eventdev/rte_eventdev.h     | 65 ++++++++++++++++++++++++++
>>>>>>>>  lib/librte_eventdev/rte_eventdev_pmd.h | 14 ++++++
>>>>>>>>  2 files changed, 79 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
>>>>>>>> index 226f352ad..d69150792 100644
>>>>>>>> --- a/lib/librte_eventdev/rte_eventdev.h
>>>>>>>> +++ b/lib/librte_eventdev/rte_eventdev.h
>>>>>>>> @@ -289,6 +289,15 @@ struct rte_event;
>>>>>>>>  * single queue to each port or map a single queue to many port.
>>>>>>>>  */
>>>>>>>>
>>>>>>>> +#define RTE_EVENT_DEV_CAP_REQUIRES_MAINT (1ULL << 9)
>>>>>>>> +/**< Event device requires calls to rte_event_maintain() during
>>>>>>> This scheme would call for DSW-specific API handling in the fast path.
>>>>>> Initially this would be so, but buffering events might yield
>>>>>> performance benefits for more event devices than DSW.
>>>>>>
>>>>>> In an application, it's often convenient, but sub-optimal from a
>>>>>> performance point of view, to do single-event enqueue operations.
>>>>>> The alternative is to use an application-level buffer, and then
>>>>>> flush this buffer with rte_event_enqueue_burst(). If you allow the
>>>>>> event device to buffer, you get the simplicity of single-event
>>>>>> enqueue operations, but without taking any noticeable performance
>>>>>> hit.
>>>>> IMO, it is better to aggregate the burst in the application, as
>>>>> sending events one by one to the driver for aggregation has a
>>>>> performance cost, due to the function pointer call overhead.
>>>> That's a very slight overhead - but for optimal performance, sure.
>>>> It will come at a cost in terms of code complexity. Just look at the
>>>> adapters. They do this already. I think some applications are ready
>>>> to take the extra 5-10 clock cycles or so it will cost them to do
>>>> the function call (provided the event device has buffering support).
>>> So is there any advantage in moving the aggregation logic to the PMD?
>>> It is costly.
>>
>> What do you mean by aggregation logic?
> aggregation == buffering.
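OK. For clarity, below is a minimal sketch of the application-level
buffering you are advocating, which is also roughly what the adapters
implement today. The struct and helper names are made up for the
example, and the retry-on-backpressure loop is simplified (a real
application would handle persistent backpressure):

#include <stdint.h>

#include <rte_eventdev.h>

#define APP_EVENT_BUF_SIZE 32 /* application-chosen burst size */

struct app_event_buffer {
	struct rte_event events[APP_EVENT_BUF_SIZE];
	uint16_t count;
};

/* Flush all buffered events to the event device. Retrying on
 * backpressure is simplified to a busy loop here.
 */
static void
app_event_buffer_flush(uint8_t dev_id, uint8_t port_id,
		       struct app_event_buffer *buf)
{
	uint16_t sent = 0;

	while (sent < buf->count)
		sent += rte_event_enqueue_burst(dev_id, port_id,
						&buf->events[sent],
						buf->count - sent);
	buf->count = 0;
}

/* Add a single event, flushing when the buffer is full. */
static void
app_event_buffer_add(uint8_t dev_id, uint8_t port_id,
		     struct app_event_buffer *buf,
		     const struct rte_event *event)
{
	buf->events[buf->count++] = *event;

	if (buf->count == APP_EVENT_BUF_SIZE)
		app_event_buffer_flush(dev_id, port_id, buf);
}

The point of a buffering event device is to get this behavior without
every application (and adapter) reimplementing it.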
The reason I put it into DSW to begin with was that it yielded a
significant performance benefit in situations where the application
enqueues few, or even single, events per call. For DSW it translates
into lower DPDK event ring overhead. I would imagine it could improve
performance for hardware-based event devices as well.

>>
>>>>> Another concern is the frequency of calling the rte_event_maintain()
>>>>> function by the application, as the timing requirements will vary
>>>>> from driver to driver and from application to application.
>>>>> IMO, it is not portable, and I believe the application should not
>>>>> be aware of those details. If the driver needs a specific
>>>>> maintenance function for any other reason, then it is better to use
>>>>> the DPDK SERVICE core infrastructure.
>>>> The only thing the application needs to be aware of is that it needs
>>>> to call rte_event_maintain() as often as it would have called
>>>> dequeue() in your "typical worker" example. Making sure this call is
>>>> cheap enough is up to the driver, and this needs to hold true for
>>>> all event devices that need maintenance.
>>> Why can't rte_event_maintain() be done in either dequeue() or
>>> enqueue() in the driver context? Either one of them has to be called
>>> periodically by the application in any case?
>>
>> No, producer-only ports can go idle for long times. Applications that
>> don't "go idle" need not worry about the maintain function.
> If I understand it correctly, if the worker does not call enqueue() or
> dequeue() for a long time, then maintain() needs to be called by the
> application.
>
> That's where my concern with this API lies. What is the definition of
> a long time frame? (ns or ticks?)
> Will it change from driver to driver and arch to arch? And it is
> leaking the driver requirements to the application.

It's difficult to quantify exactly, but the rate should be on the same
order of magnitude as the rate at which you would call dequeue() on a
consumer-type worker port.

All the RTE_EVENT_DEV_CAP_* flags essentially represent such leakage,
where the event device driver and/or the underlying hardware express
various capabilities and limitations.

I'm not sure it needs to be much more complicated for the application
to handle than the change to the event adapters I included in the
patch. There, it boils down to the service function call rate, which
will be high during low load (causing buffers to be flushed quickly,
etc.), and a little lower during high lcore load.

>>
>>>> If you plan to use a non-buffering hardware device driver or a soft,
>>>> centralized scheduler that doesn't need this, it will also not set
>>>> the flag, and thus the application need not care about the
>>>> rte_event_maintain() function. DPDK code such as the eventdev
>>>> adapters do need to care, but the increase in complexity is slight,
>>>> and the cost of calling rte_event_maintain() on a maintenance-free
>>>> device is very low (since the then-NULL function pointer is in the
>>>> eventdev struct, likely on a cache line already dragged in).
>>>>
>>>> Unfortunately, DPDK doesn't have a per-core delayed-work mechanism.
>>>> Flushing event buffers (and other DSW "background work") can't be
>>>> done on a service core, since it would operate on non-MT-safe data
>>>> structures in the worker threads' event ports.
>>> Yes. Otherwise, DSW needs to be updated to be MT safe.
>>
>> I haven't been looking at this in detail, but I suspect it would be
>> both complex and not very performant. One of the problems that needs
>> to be solved in such a solution is the "pausing" of flows during
>> migration. All participating lcores need to ACK that a flow is paused.
> Could you share a patch, or details on how much it costs?

I don't have a ready-made solution for making lcore ports thread-safe.
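To make the proposed application-side contract more concrete, here is
a sketch of a producer-only lcore under this RFC. app_next_event() is
a hypothetical application-provided event source, and
rte_event_maintain() is assumed to have the two-argument signature
from this patch:

#include <stdbool.h>
#include <stdint.h>

#include <rte_eventdev.h>

/* Hypothetical application-provided event source; returns the number
 * of events produced (0 or 1).
 */
extern uint16_t app_next_event(struct rte_event *ev);

static void
producer_loop(uint8_t dev_id, uint8_t port_id)
{
	struct rte_event_dev_info info;
	bool needs_maint;

	(void)rte_event_dev_info_get(dev_id, &info);
	needs_maint = info.event_dev_cap & RTE_EVENT_DEV_CAP_REQUIRES_MAINT;

	for (;;) {
		struct rte_event ev;

		if (app_next_event(&ev) > 0) {
			/* Busy-retry on backpressure; simplified. */
			while (rte_event_enqueue_burst(dev_id, port_id,
						       &ev, 1) == 0)
				;
		} else if (needs_maint) {
			/* Idle: let the event device flush buffers,
			 * return credits, process any load balancing
			 * signaling, etc.
			 */
			rte_event_maintain(dev_id, port_id);
		}
	}
}

A worker that dequeues on a regular basis would look exactly like
before; the capability flag only matters for ports that may go idle.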
>
>>
>>>>>>>> + * periods when neither rte_event_dequeue_burst() nor
>>>>>>> The typical worker thread will be
>>>>>>> while (1) {
>>>>>>>         rte_event_dequeue_burst();
>>>>>>>         ..process..
>>>>>>>         rte_event_enqueue_burst();
>>>>>>> }
>>>>>>> If so, why can't the DSW driver do the maintenance in the driver
>>>>>>> context, in the dequeue() call?
>>>>>>>
>>>>>> DSW already does maintenance on dequeue, and works well in the
>>>>>> above scenario. The typical worker does not need to care about the
>>>>>> rte_event_maintain() function, since it dequeues events on a
>>>>>> regular basis.
>>>>>>
>>>>>> What this RFC addresses is the more atypical (but still fairly
>>>>>> common) case of a port being neither dequeued from nor enqueued to
>>>>>> on a regular basis. The timer and ethernet rx adapters are
>>>>>> examples of such.
>>>>> If it is an adapter-specific use case problem, then maybe we have
>>>>> the option to fix the problem in the adapter-specific API usage, or
>>>>> in that area.
>>>>>
>>>> It's not adapter-specific, I think. There might be producer-only
>>>> ports, for example, which don't provide a constant stream of events,
>>>> but rather intermittent bursts. A traffic generator is one example
>>>> of such an application, and there might be other, less synthetic
>>>> ones as well.
>>> In that case, the application knows the purpose of the eventdev port.
>>> Would changing the eventdev spec to configure a "port" or "queue" for
>>> that use case help? Meaning, DSW or any driver can get the hint and
>>> change the function pointers accordingly for the fast path.
>>> For instance, do maintenance on enqueue() for such ports, or so.
>>
>> This is what DSW does already today. A dequeue() call with a
>> zero-length event array serves the purpose of rte_event_maintain().
>> It's a bit of a hack, in my opinion.
> I agree that it is a hack.
>
> One more concern we have is that we are adding an API for the new
> driver requirements and not updating the example applications. Sharing
> an example application patch would enable us to understand the cost
> and usage. (i.e., the cost w.r.t. the other SW drivers)

Good point.

>
>>
>>>>>>>> + * rte_event_enqueue_burst() are called on a port. This will allow the
>>>>>>>> + * event device to perform internal processing, such as flushing
>>>>>>>> + * buffered events, return credits to a global pool, or process
>>>>>>>> + * signaling related to load balancing.
>>>>>>>> + */
>>
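For reference, today's DSW workaround and the proposed call, side by
side. A sketch; rte_event_maintain() is again assumed to have the
two-argument signature from this patch:

#include <stdint.h>

#include <rte_eventdev.h>

/* Maintenance of an otherwise idle port: the current workaround
 * versus the API proposed in this patch.
 */
static void
port_maintain(uint8_t dev_id, uint8_t port_id)
{
	/* Today's DSW workaround: a zero-length dequeue triggers the
	 * driver's internal processing, without dequeuing anything.
	 */
	struct rte_event ev;
	(void)rte_event_dequeue_burst(dev_id, port_id, &ev, 0, 0);

	/* With this RFC, the intent is explicit instead: */
	rte_event_maintain(dev_id, port_id);
}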