On 2020-04-14 18:15, Jerin Jacob wrote:
> On Tue, Apr 14, 2020 at 9:27 PM Mattias Rönnblom
> <mattias.ronnb...@ericsson.com> wrote:
>> On 2020-04-10 15:00, Jerin Jacob wrote:
>>> On Thu, Apr 9, 2020 at 7:32 PM Mattias Rönnblom
>>> <mattias.ronnb...@ericsson.com> wrote:
>>>> On 2020-04-09 15:32, Jerin Jacob wrote:
>>>>> On Thu, Apr 9, 2020 at 5:51 PM Mattias Rönnblom
>>>>> <mattias.ronnb...@ericsson.com> wrote:
>>>>>> On 2020-04-08 21:36, Jerin Jacob wrote:
>>>>>>> On Wed, Apr 8, 2020 at 11:27 PM Mattias Rönnblom
>>>>>>> <mattias.ronnb...@ericsson.com> wrote:
>>>>>>>> Extend the Eventdev API to allow for event devices which require
>>>>>>>> various forms of internal processing to happen, even when events
>>>>>>>> are not enqueued to or dequeued from a port.
>>>>>>>>
>>>>>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
>>>>>>>> ---
>>>>>>>>  lib/librte_eventdev/rte_eventdev.h     | 65 ++++++++++++++++++++++++++
>>>>>>>>  lib/librte_eventdev/rte_eventdev_pmd.h | 14 ++++++
>>>>>>>>  2 files changed, 79 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
>>>>>>>> index 226f352ad..d69150792 100644
>>>>>>>> --- a/lib/librte_eventdev/rte_eventdev.h
>>>>>>>> +++ b/lib/librte_eventdev/rte_eventdev.h
>>>>>>>> @@ -289,6 +289,15 @@ struct rte_event;
>>>>>>>>  * single queue to each port or map a single queue to many port.
>>>>>>>>  */
>>>>>>>>
>>>>>>>> +#define RTE_EVENT_DEV_CAP_REQUIRES_MAINT (1ULL << 9)
>>>>>>>> +/**< Event device requires calls to rte_event_maintain() during
>>>>>>> This scheme would call for DSW-specific API handling in the fast path.
>>>>>> Initially this would be so, but buffering events might yield
>>>>>> performance benefits for more event devices than DSW.
>>>>>>
>>>>>> In an application, it's often convenient, but sub-optimal from a
>>>>>> performance point of view, to do single-event enqueue operations.
>>>>>> The alternative is to use an application-level buffer, and then
>>>>>> flush this buffer with rte_event_enqueue_burst(). If you allow the
>>>>>> event device to buffer, you get the simplicity of single-event
>>>>>> enqueue operations, but without taking any noticeable performance
>>>>>> hit.
>>>>> IMO, it is better to aggregate the burst in the application, as
>>>>> sending events one by one to the driver for aggregation has a
>>>>> performance cost, due to the function pointer call overhead.
>>>> That's a very slight overhead - but for optimal performance, sure.
>>>> It will come at a cost in terms of code complexity. Just look at the
>>>> adapters. They do this already. I think some applications are ready
>>>> to take the extra 5-10 clock cycles or so it will cost them to do
>>>> the function call (provided the event device has buffering support).
>>> So is there any advantage in moving the aggregation logic to the PMD?
>>> It is costly.
>>
>> What do you mean by aggregation logic?
> aggregation == buffering.
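OK. For clarity, below is a minimal sketch of the application-level
buffering you are advocating, which is also roughly what the adapters
implement today. The struct and helper names are made up for the
example, and the retry-on-backpressure loop is simplified (a real
application would handle persistent backpressure):

#include <stdint.h>

#include <rte_eventdev.h>

#define APP_EVENT_BUF_SIZE 32 /* application-chosen burst size */

struct app_event_buffer {
	struct rte_event events[APP_EVENT_BUF_SIZE];
	uint16_t count;
};

/* Flush all buffered events to the event device. Retrying on
 * backpressure is simplified to a busy loop here.
 */
static void
app_event_buffer_flush(uint8_t dev_id, uint8_t port_id,
		       struct app_event_buffer *buf)
{
	uint16_t sent = 0;

	while (sent < buf->count)
		sent += rte_event_enqueue_burst(dev_id, port_id,
						&buf->events[sent],
						buf->count - sent);
	buf->count = 0;
}

/* Add a single event, flushing when the buffer is full. */
static void
app_event_buffer_add(uint8_t dev_id, uint8_t port_id,
		     struct app_event_buffer *buf,
		     const struct rte_event *event)
{
	buf->events[buf->count++] = *event;

	if (buf->count == APP_EVENT_BUF_SIZE)
		app_event_buffer_flush(dev_id, port_id, buf);
}

The point of a buffering event device is to get this behavior without
every application (and adapter) reimplementing it.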
The reason I put it into DSW to begin with was that it yielded a
significant performance benefit in situations where the application
enqueues few, or even single, events per call. For DSW it translates
into lower DPDK event ring overhead. I would imagine it could improve
performance for hardware-based event devices as well.

>>
>>>>> Another concern is the frequency of calling the rte_event_maintain()
>>>>> function by the application, as the timing requirements will vary
>>>>> from driver to driver and from application to application.
>>>>> IMO, it is not portable, and I believe the application should not
>>>>> be aware of those details. If the driver needs a specific
>>>>> maintenance function for any other reason, then it is better to use
>>>>> the DPDK SERVICE core infrastructure.
>>>> The only thing the application needs to be aware of is that it needs
>>>> to call rte_event_maintain() as often as it would have called
>>>> dequeue() in your "typical worker" example. Making sure this call is
>>>> cheap enough is up to the driver, and this needs to hold true for
>>>> all event devices that need maintenance.
>>> Why can't rte_event_maintain() be done in either dequeue() or
>>> enqueue() in the driver context? Either one of them has to be called
>>> periodically by the application in any case?
>>
>> No, producer-only ports can go idle for long times. Applications that
>> don't "go idle" need not worry about the maintain function.
> If I understand it correctly, if the worker does not call enqueue() or
> dequeue() for a long time, then maintain() needs to be called by the
> application.
>
> That's where my concern with this API lies. What is the definition of
> a long time frame? (ns or ticks?)
> Will it change from driver to driver and arch to arch? And it is
> leaking the driver requirements to the application.

It's difficult to quantify exactly, but the rate should be on the same
order of magnitude as the rate at which you would call dequeue() on a
consumer-type worker port.

All the RTE_EVENT_DEV_CAP_* flags essentially represent such leakage,
where the event device driver and/or the underlying hardware express
various capabilities and limitations.

I'm not sure it needs to be much more complicated for the application
to handle than the change to the event adapters I included in the
patch. There, it boils down to the service function call rate, which
will be high during low load (causing buffers to be flushed quickly,
etc.), and a little lower during high lcore load.

>>
>>>> If you plan to use a non-buffering hardware device driver or a soft,
>>>> centralized scheduler that doesn't need this, it will also not set
>>>> the flag, and thus the application need not care about the
>>>> rte_event_maintain() function. DPDK code such as the eventdev
>>>> adapters do need to care, but the increase in complexity is slight,
>>>> and the cost of calling rte_event_maintain() on a maintenance-free
>>>> device is very low (since the then-NULL function pointer is in the
>>>> eventdev struct, likely on a cache line already dragged in).
>>>>
>>>> Unfortunately, DPDK doesn't have a per-core delayed-work mechanism.
>>>> Flushing event buffers (and other DSW "background work") can't be
>>>> done on a service core, since it would operate on non-MT-safe data
>>>> structures in the worker threads' event ports.
>>> Yes. Otherwise, DSW needs to be updated to be MT safe.
>>
>> I haven't been looking at this in detail, but I suspect it would be
>> both complex and not very performant. One of the problems that needs
>> to be solved in such a solution is the "pausing" of flows during
>> migration. All participating lcores need to ACK that a flow is paused.
> Could you share a patch, or details on how much it costs?

I don't have a ready-made solution for making lcore ports thread-safe.
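To make the proposed application-side contract more concrete, here is
a sketch of a producer-only lcore under this RFC. app_next_event() is
a hypothetical application-provided event source, and
rte_event_maintain() is assumed to have the two-argument signature
from this patch:

#include <stdbool.h>
#include <stdint.h>

#include <rte_eventdev.h>

/* Hypothetical application-provided event source; returns the number
 * of events produced (0 or 1).
 */
extern uint16_t app_next_event(struct rte_event *ev);

static void
producer_loop(uint8_t dev_id, uint8_t port_id)
{
	struct rte_event_dev_info info;
	bool needs_maint;

	(void)rte_event_dev_info_get(dev_id, &info);
	needs_maint = info.event_dev_cap & RTE_EVENT_DEV_CAP_REQUIRES_MAINT;

	for (;;) {
		struct rte_event ev;

		if (app_next_event(&ev) > 0) {
			/* Busy-retry on backpressure; simplified. */
			while (rte_event_enqueue_burst(dev_id, port_id,
						       &ev, 1) == 0)
				;
		} else if (needs_maint) {
			/* Idle: let the event device flush buffers,
			 * return credits, process any load balancing
			 * signaling, etc.
			 */
			rte_event_maintain(dev_id, port_id);
		}
	}
}

A worker that dequeues on a regular basis would look exactly like
before; the capability flag only matters for ports that may go idle.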
>
>>
>>>>>>>> + * periods when neither rte_event_dequeue_burst() nor
>>>>>>> The typical worker thread will be
>>>>>>> while (1) {
>>>>>>>         rte_event_dequeue_burst();
>>>>>>>         ..process..
>>>>>>>         rte_event_enqueue_burst();
>>>>>>> }
>>>>>>> If so, why can't the DSW driver do the maintenance in the driver
>>>>>>> context, in the dequeue() call?
>>>>>>>
>>>>>> DSW already does maintenance on dequeue, and works well in the
>>>>>> above scenario. The typical worker does not need to care about the
>>>>>> rte_event_maintain() function, since it dequeues events on a
>>>>>> regular basis.
>>>>>>
>>>>>> What this RFC addresses is the more atypical (but still fairly
>>>>>> common) case of a port being neither dequeued from nor enqueued to
>>>>>> on a regular basis. The timer and ethernet rx adapters are
>>>>>> examples of such.
>>>>> If it is an adapter-specific use case problem, then maybe we have
>>>>> the option to fix the problem in the adapter-specific API usage, or
>>>>> in that area.
>>>>>
>>>> It's not adapter-specific, I think. There might be producer-only
>>>> ports, for example, which don't provide a constant stream of events,
>>>> but rather intermittent bursts. A traffic generator is one example
>>>> of such an application, and there might be other, less synthetic
>>>> ones as well.
>>> In that case, the application knows the purpose of the eventdev port.
>>> Would changing the eventdev spec to configure a "port" or "queue" for
>>> that use case help? Meaning, DSW or any driver can get the hint and
>>> change the function pointers accordingly for the fast path.
>>> For instance, do maintenance on enqueue() for such ports, or so.
>>
>> This is what DSW does already today. A dequeue() call with a
>> zero-length event array serves the purpose of rte_event_maintain().
>> It's a bit of a hack, in my opinion.
> I agree that it is a hack.
>
> One more concern we have is that we are adding an API for the new
> driver requirements and not updating the example applications. Sharing
> an example application patch would enable us to understand the cost
> and usage. (i.e., the cost w.r.t. the other SW drivers)

Good point.

>
>>
>>>>>>>> + * rte_event_enqueue_burst() are called on a port. This will allow the
>>>>>>>> + * event device to perform internal processing, such as flushing
>>>>>>>> + * buffered events, return credits to a global pool, or process
>>>>>>>> + * signaling related to load balancing.
>>>>>>>> + */
>>
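For reference, today's DSW workaround and the proposed call, side by
side. A sketch; rte_event_maintain() is again assumed to have the
two-argument signature from this patch:

#include <stdint.h>

#include <rte_eventdev.h>

/* Maintenance of an otherwise idle port: the current workaround
 * versus the API proposed in this patch.
 */
static void
port_maintain(uint8_t dev_id, uint8_t port_id)
{
	/* Today's DSW workaround: a zero-length dequeue triggers the
	 * driver's internal processing, without dequeuing anything.
	 */
	struct rte_event ev;
	(void)rte_event_dequeue_burst(dev_id, port_id, &ev, 0, 0);

	/* With this RFC, the intent is explicit instead: */
	rte_event_maintain(dev_id, port_id);
}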