On Tue, Apr 14, 2020 at 11:25 PM Mattias Rönnblom <mattias.ronnb...@ericsson.com> wrote:
>
> On 2020-04-14 18:15, Jerin Jacob wrote:
> > On Tue, Apr 14, 2020 at 9:27 PM Mattias Rönnblom <mattias.ronnb...@ericsson.com> wrote:
> >> On 2020-04-10 15:00, Jerin Jacob wrote:
> >>> On Thu, Apr 9, 2020 at 7:32 PM Mattias Rönnblom <mattias.ronnb...@ericsson.com> wrote:
> >>>> On 2020-04-09 15:32, Jerin Jacob wrote:
> >>>>> On Thu, Apr 9, 2020 at 5:51 PM Mattias Rönnblom <mattias.ronnb...@ericsson.com> wrote:
> >>>>>> On 2020-04-08 21:36, Jerin Jacob wrote:
> >>>>>>> On Wed, Apr 8, 2020 at 11:27 PM Mattias Rönnblom <mattias.ronnb...@ericsson.com> wrote:
> >>>>>>>> Extend Eventdev API to allow for event devices which require various forms of internal processing to happen, even when events are not enqueued to or dequeued from a port.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> >>>>>>>> ---
> >>>>>>>>  lib/librte_eventdev/rte_eventdev.h     | 65 ++++++++++++++++++++++++++
> >>>>>>>>  lib/librte_eventdev/rte_eventdev_pmd.h | 14 ++++++
> >>>>>>>>  2 files changed, 79 insertions(+)
> >>>>>>>>
> >>>>>>>> diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
> >>>>>>>> index 226f352ad..d69150792 100644
> >>>>>>>> --- a/lib/librte_eventdev/rte_eventdev.h
> >>>>>>>> +++ b/lib/librte_eventdev/rte_eventdev.h
> >>>>>>>> @@ -289,6 +289,15 @@ struct rte_event;
> >>>>>>>>   * single queue to each port or map a single queue to many port.
> >>>>>>>>   */
> >>>>>>>>
> >>>>>>>> +#define RTE_EVENT_DEV_CAP_REQUIRES_MAINT (1ULL << 9)
> >>>>>>>> +/**< Event device requires calls to rte_event_maintain() during
> >>>>>>> This scheme would call for DSW-specific API handling in the fastpath.
> >>>>>> Initially this would be so, but buffering events might yield performance benefits for more event devices than DSW.
> >>>>>>
> >>>>>> In an application, it's often convenient, but sub-optimal from a performance point of view, to do single-event enqueue operations. The alternative is to use an application-level buffer, and then flush this buffer with rte_event_enqueue_burst(). If you allow the event device to buffer, you get the simplicity of single-event enqueue operations, but without taking any noticeable performance hit.
> >>>>> IMO, it is better to aggregate the burst in the application, as sending events one by one to the driver to aggregate has a performance cost due to the function pointer overhead.
> >>>> That's a very slight overhead - but for optimal performance, sure. It'll come at a cost in terms of code complexity. Just look at the adapters. They do this already. I think some applications are ready to take the extra 5-10 clock cycles or so it'll cost them to do the function call (provided the event device had buffering support).
> >>> So is there any advantage in moving the aggregation logic to the PMD? It is costly.
> >>
> >> What do you mean by aggregation logic?
> > aggregation == buffering.
>
> The reason I put it into DSW to begin with was that it yielded a significant performance benefit for situations where the application would enqueue() few or even single events per call. For DSW it will translate to lower DPDK event ring overhead. I would imagine it could improve performance for hardware-based event devices as well.
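For reference, the application-level buffering being discussed is roughly the pattern sketched below. The burst size, the retry loop and the helper names are made up for illustration; rte_event_enqueue_burst() is the only real Eventdev API used.

#include <rte_eventdev.h>

#define APP_BURST_SIZE 32 /* arbitrary example value */

struct app_enq_buffer {
	struct rte_event evs[APP_BURST_SIZE];
	uint16_t count;
};

/* Buffer a single event in the application; flush a full burst to the
 * event device in one enqueue call. */
static inline void
app_buffered_enqueue(uint8_t dev_id, uint8_t port_id,
		     struct app_enq_buffer *buf, const struct rte_event *ev)
{
	buf->evs[buf->count++] = *ev;

	if (buf->count == APP_BURST_SIZE) {
		uint16_t sent = 0;

		/* Retry until the event device has accepted the whole burst. */
		while (sent < buf->count)
			sent += rte_event_enqueue_burst(dev_id, port_id,
							buf->evs + sent,
							buf->count - sent);
		buf->count = 0;
	}
}

Device-side buffering removes the need for such a helper: the application does single-event enqueues and the PMD does the aggregation.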
Yes. We are already doing this in the application. It makes sense for buffering.

>
> >>
> >>>>> Another concern is the frequency of calling the rte_event_maintain() function by the application, as the timing requirements will vary from driver to driver and from application to application.
> >>>>> IMO, it is not portable, and I believe the application should not be aware of those details. If the driver needs a specific maintenance function for any other reason, then it is better to use the DPDK SERVICE core infra.
> >>>> The only thing the application needs to be aware of is that it needs to call rte_event_maintain() as often as it would have called dequeue() in your "typical worker" example. Making sure this call is cheap enough is up to the driver, and this needs to hold true for all event devices that need maintenance.
> >>> Why can't rte_event_maintain()'s work be done in either dequeue() or enqueue() in the driver context? Either one of them has to be called periodically by the application in any case?
> >>
> >> No, producer-only ports can go idle for long times. Applications that don't "go idle" need not worry about the maintain function.
> > If I understand it correctly, if the worker does not call enqueue() or dequeue() for a long time, then maintain() needs to be called by the application.
> >
> > That's where my concern with this API is. What is the definition of a long time frame? (ns or ticks?)
> > Will it change from driver to driver and from arch to arch? And it is leaking the driver requirements into the application.
> >
>
> It's difficult to quantify exactly, but the rate should be on the same order of magnitude as you would call dequeue() on a consumer-type worker

The challenge is: if another driver has a different requirement for maintain() in terms of frequency, and this is a public API, then how will we abstract that?

> port. All the RTE_EVENT_DEV_CAP_* flags essentially represent such leakage, where the event device driver and/or the underlying hardware express various capabilities and limitations.

I agree. But in the fastpath, we do not bring in any such _functional_ difference. If it is on the slow path, then there is no issue at all.

>
> I'm not sure it needs to be much more complicated for the application to handle than the change to the event adapters I included in the patch. There, it boils down to the service function call rate, which would be high during low load (causing buffers to be flushed quickly, etc.), and a little lower during high lcore load.
>
> >>
> >>>> If you plan to use a non-buffering hardware device driver or a soft, centralized scheduler that doesn't need this, it will also not set the flag, and thus the application need not care about the rte_event_maintain() function. DPDK code such as the eventdev adapters does need to care, but the increase in complexity is slight, and the cost of calling rte_event_maintain() on a maintenance-free device is very low (since the then-NULL function pointer is in the eventdev struct, likely on a cache line already dragged in).
> >>>>
> >>>> Unfortunately, DPDK doesn't have a per-core delayed-work mechanism. Flushing event buffers (and other DSW "background work") can't be done on a service core, since they would work on non-MT-safe data structures on the worker thread's event ports.
> >>> Yes. Otherwise, DSW needs to be updated to be MT safe.
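Coming back to the adapters mentioned above, to make that side concrete: with the proposed capability, a periodically run service callback (as the adapters use) would roughly do the following. The rte_event_maintain(dev_id, port_id) signature is my assumption; only the capability flag appears in the quoted diff.

#include <stdbool.h>
#include <stdint.h>
#include <rte_eventdev.h>

struct adapter_ctx {
	uint8_t dev_id;
	uint8_t port_id;
	bool needs_maint; /* cached: event_dev_cap & RTE_EVENT_DEV_CAP_REQUIRES_MAINT */
};

/* Service function run at some rate by a service core. */
static int32_t
adapter_service_func(void *args)
{
	struct adapter_ctx *ctx = args;

	/* ... normal adapter work: poll the source, enqueue events ... */

	/* Even if nothing was enqueued this round, give the PMD a chance
	 * to flush buffers, return credits, etc. */
	if (ctx->needs_maint)
		rte_event_maintain(ctx->dev_id, ctx->port_id); /* assumed signature */

	return 0;
}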
> >>
> >> I haven't been looking at this in detail, but I suspect it will be both complex and not very performant. One of the problems that needs to be solved in such a solution is the "pausing" of flows during migration. All participating lcores need to ACK that a flow is paused.
> > Could you share a patch with details on how much it costs?
>
> I don't have a ready-made solution for making lcore ports thread-safe.

OK

>
> >>
> >>>>>>>> + * periods when neither rte_event_dequeue_burst() nor
> >>>>>>> The typical worker thread will be
> >>>>>>> while (1) {
> >>>>>>>         rte_event_dequeue_burst();
> >>>>>>>         ..process..
> >>>>>>>         rte_event_enqueue_burst();
> >>>>>>> }
> >>>>>>> If so, why can't the DSW driver do the maintenance in the driver context in the dequeue() call?
> >>>>>>>
> >>>>>> DSW already does maintenance on dequeue, and works well in the above scenario. The typical worker does not need to care about the rte_event_maintain() function, since it dequeues events on a regular basis.
> >>>>>>
> >>>>>> What this RFC addresses is the more atypical (but still fairly common) case of a port being neither dequeued from nor enqueued to on a regular basis. The timer and ethernet rx adapters are examples of such.
> >>>>> If it is an adapter-specific use case problem, then maybe we have an option to fix the problem in the adapter-specific API usage or in that area.
> >>>>>
> >>>> It's not adapter specific, I think. There might be producer-only ports, for example, which don't provide a constant stream of events, but rather intermittent bursts. A traffic generator is one example of such an application, and there might be other, less synthetic ones as well.
> >>> In that case, the application knows the purpose of the eventdev port. Would changing the eventdev spec to configure the "port" or "queue" for that use case help? Meaning, DSW or any driver could get the hint and change the function pointers accordingly for the fastpath. For instance, do maintenance on enqueue() for such ports.
> >>
> >> This is what DSW does already today. A dequeue() call with a zero-length event array serves the purpose of rte_event_maintain(). It's a bit of a hack, in my opinion.
> > I agree that it is a hack.
> >
> > One more concern we have is that we are adding an API for the new driver requirements but not updating the example application. Sharing the example application patch would enable us to understand the cost and usage (i.e., cost w.r.t. other SW drivers).
>
> Good point.

I suggest sharing a patch based on existing app/adapter usage; based on that, we can analyze more on abstraction and cost.

>
> >>
> >>>>>>>> + * rte_event_enqueue_burst() are called on a port. This will allow the
> >>>>>>>> + * event device to perform internal processing, such as flushing
> >>>>>>>> + * buffered events, return credits to a global pool, or process
> >>>>>>>> + * signaling related to load balancing.
> >>>>>>>> + */
> >>
>
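To summarize the producer-only case in code: today such a port can only be "maintained" via the zero-length dequeue() workaround mentioned above, while under the RFC it would use an explicit, capability-gated call. A rough sketch follows; the rte_event_maintain() signature and the app_poll_packet() event source are assumptions for illustration.

#include <stdbool.h>
#include <rte_eventdev.h>

/* Hypothetical application-provided packet/event source. */
uint16_t app_poll_packet(struct rte_event *ev);

static void
producer_only_loop(uint8_t dev_id, uint8_t port_id)
{
	struct rte_event_dev_info info;
	bool needs_maint;

	rte_event_dev_info_get(dev_id, &info);
	needs_maint = info.event_dev_cap & RTE_EVENT_DEV_CAP_REQUIRES_MAINT;

	for (;;) {
		struct rte_event ev;

		if (app_poll_packet(&ev) > 0) {
			while (rte_event_enqueue_burst(dev_id, port_id, &ev, 1) == 0)
				; /* retry on back-pressure */
		} else if (needs_maint) {
			/* Port is idle: let the PMD flush buffered events,
			 * return credits and handle load-balancing signaling.
			 * Today this requires the zero-length dequeue hack:
			 * rte_event_dequeue_burst(dev_id, port_id, &ev, 0, 0); */
			rte_event_maintain(dev_id, port_id); /* assumed signature */
		}
	}
}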