On 2023-04-19 13:06, Jerin Jacob wrote:
> On Mon, Apr 17, 2023 at 9:06 PM Mattias Rönnblom
> <mattias.ronnb...@ericsson.com> wrote:
>>
>> On 2023-04-17 14:52, Jerin Jacob wrote:
>>> On Thu, Apr 13, 2023 at 12:24 PM Mattias Rönnblom
>>> <mattias.ronnb...@ericsson.com> wrote:
>>>>
>>>> void
>>>> rte_event_return_new_credits(...);
>>>>
>>>> Thoughts?
>>>
>>> I see the following cons on this approach.
>>>
>>
>> Does the use case in my original e-mail seem like a reasonable one to
>> you? If yes, is there some way one could solve this problem with a
>> clever use of the current Eventdev API? That would obviously be
>> preferable.
>
> I think the use case is reasonable. For me, the easiest path to achieve
> the functionality is setting rte_event_dev_config::nb_events_limit, as
> a given application is always targeted to work at X number of packets
> per second. Giving that upfront kind of makes life easy for application
> writers and drivers, at the cost of allocating the required memory.
>
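If I understand the suggestion correctly, it amounts to something like
the below at device setup time (a sketch only; apart from
nb_events_limit, the values are made up):

#include <rte_eventdev.h>

/*
 * Sketch of the "size it up front" approach: let the device buffer as
 * many in-flight events as it supports, so that new-event enqueues
 * rarely hit back pressure. Queue/port counts are made up.
 */
static int
configure_event_dev(uint8_t dev_id)
{
	struct rte_event_dev_info info;
	struct rte_event_dev_config config = { 0 };
	int rc;

	rc = rte_event_dev_info_get(dev_id, &info);
	if (rc < 0)
		return rc;

	/* Maximum number of buffered (in-flight) events. */
	config.nb_events_limit = info.max_num_events;

	config.nb_event_queues = 2;
	config.nb_event_ports = 4;
	config.nb_event_queue_flows = info.max_event_queue_flows;
	config.nb_event_port_dequeue_depth =
		info.max_event_port_dequeue_depth;
	config.nb_event_port_enqueue_depth =
		info.max_event_port_enqueue_depth;
	config.dequeue_timeout_ns = info.min_dequeue_timeout_ns;

	return rte_event_dev_configure(dev_id, &config);
}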
Could you unpack that a little? How would you derive nb_events_limit
from the targeted pps throughput? In my world, they are pretty much
orthogonal: nb_events_limit just specifies the maximum number of
buffered events (i.e., events/packets in-flight in the pipeline).

Are you thinking about a system where you do input rate shaping (e.g.,
on the aggregate flow from NIC + timer wheel) down to some fixed rate?
A rate you know, with some reasonable certainty, can be sustained?
Most non-trivial applications will vary in capacity depending on packet
size, number of flows, types of flows, flow lifetime, non-packet-
processing cache or DDR pressure, etc.

In any system where you never risk accepting new items of work at a
higher pace than the system is able to finish them, any mechanism
designed to help you deal with work scheduler back pressure (at the
point of new event enqueue) is of course pointless.

Or maybe you are thinking of a system where the EAL threads almost
never enqueue new events (only forward-type events)? In other words, a
system where NIC and/or timer hardware is the source of almost all new
work, and both of those are tightly integrated with the event device?

>>
>>> # Adding multiple APIs in the fast path to the driver layer may not
>>> be a performance-effective solution.
>>
>> For event devices with a software-managed credit system,
>> pre-allocation would be very cheap. And, if an application prefers to
>> handle back pressure after the fact, that option is still available.
>
> I am worried about exposing PMD calls that applications start calling
> per packet, especially with burst size = 1 for latency-critical
> applications.
>
>>
>>> # At least for cnxk HW, credits are per device, not per port. So the
>>> cnxk HW implementation can not use this scheme.
>>>
>>
>> DSW's credit pool is also per device, but credits are cached on a
>> per-port basis. Does the cnxk driver rely on the hardware to signal
>> "new event" back pressure? (From the driver code it looks like that
>> is the case.)
>
> Yes. But we can not really cache it per port without introducing
> complex atomic logic.

You could defer back pressure management to software altogether. If you
trade some accuracy (in terms of exactly how many in-flight events are
allowed), the mechanism is both pretty straightforward to implement and
cycle-efficient. Roughly along the lines of the sketch below.
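(Untested, and all the names are made up; this is more or less the
scheme DSW uses. The accuracy traded away is that the global pool may
transiently dip below zero.)

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define CREDITS_PER_BATCH 64 /* amortize pool atomics over many events */

struct credit_pool {
	_Atomic int32_t available; /* device-global credit count */
};

struct port_credits {
	struct credit_pool *pool;
	int32_t cached; /* credits owned exclusively by this port */
};

/*
 * Take 'num' credits, refilling the per-port cache from the global
 * pool in batches. Only the refill touches shared state.
 */
static bool
port_credits_take(struct port_credits *pc, int32_t num)
{
	if (pc->cached >= num) {
		pc->cached -= num;
		return true;
	}

	int32_t batch = num + CREDITS_PER_BATCH;
	int32_t old = atomic_fetch_sub_explicit(&pc->pool->available,
						batch, memory_order_relaxed);
	if (old < batch) {
		/* Not enough credits in the pool; undo. */
		atomic_fetch_add_explicit(&pc->pool->available, batch,
					  memory_order_relaxed);
		return false;
	}

	pc->cached += batch - num;
	return true;
}

/*
 * Return credits (e.g., as events leave the pipeline), flushing any
 * surplus back to the pool so that other ports don't starve.
 */
static void
port_credits_return(struct port_credits *pc, int32_t num)
{
	pc->cached += num;
	if (pc->cached > 2 * CREDITS_PER_BATCH) {
		int32_t surplus = pc->cached - CREDITS_PER_BATCH;
		atomic_fetch_add_explicit(&pc->pool->available, surplus,
					  memory_order_relaxed);
		pc->cached -= surplus;
	}
}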
>>
>>> An alternative solution could be adding a new flag to
>>> rte_event_enqueue_new_burst(), where the driver waits until credits
>>> are available, to reduce the application overhead and to support
>>> different HW implementations, if this use case is critical.
>>>
>>> #define RTE_EVENT_FLAG_WAIT_TILL_CREDIT_AVAILABLE (UINT32_C(1) << 0)
>>>
>>
>> This solution only works if the event device is the only source of
>> work for the EAL thread. That is a really nice model, but I wouldn't
>> trust on that to always be the case.
>
> For non-EAL threads, I am assuming it is the HW event adapter kind of
> case.

What case is this? I think we can leave out non-EAL threads (registered
threads, that is; unregistered threads can't even call into the
Eventdev API), since to the extent they use an event device at all, it
will be in a very limited manner.

> In such a case, they don't need to wait. I think it is only in the SW
> EAL thread case that we need to wait, as the application expects to
> wait until credits are available, to avoid error handling in the
> application.

That sounds potentially very wasteful, if the time it has to wait is
long.

In the worst case, if all lcores hit this limit at the same time, the
result is a deadlock, where every thread waits for some other thread to
finish off enough work from the pipeline's backlog to make the number
of in-flight events go under the new_event_threshold.

>>
>> Also, there may be work that should only be performed if the system
>> is not under very high load. Credits being available, especially
>> combined with a flexible new event threshold, would be such an
>> indicator.
>>
>> Another way would be to just provide an API call that gave an
>> indication of whether a particular threshold has been reached (or
>> simply returned an approximation of the number of in-flight events).
>> Such a mechanism wouldn't be able to give any guarantees, but could
>> make a future enqueue operation very likely to succeed.
>
> Giving rte_event_dev_credits_available(device_id) should be OK,
> provided it is not expected to be fine-grained accurate. But my worry
> is that applications start calling that per packet. Adding the right
> documentation may help. Not sure.
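To make the intent concrete, the call pattern under discussion could
look something like the below (a sketch only: the query function is the
one proposed above and does not exist in Eventdev today, and the
threshold, queue id, and event fields are all made up):

#include <rte_eventdev.h>

/* Made-up threshold and queue id, for illustration only. */
#define LOW_PRIO_CREDIT_THRESHOLD 512
#define BACKGROUND_QUEUE_ID 1

static void
enqueue_background_work(uint8_t dev_id, uint8_t port_id)
{
	/*
	 * Approximate, hypothetical credit query; the count may be
	 * stale by the time the enqueue happens.
	 */
	if (rte_event_dev_credits_available(dev_id) <
	    LOW_PRIO_CREDIT_THRESHOLD)
		return; /* System busy; defer the optional work. */

	struct rte_event ev = {
		.op = RTE_EVENT_OP_NEW,
		.queue_id = BACKGROUND_QUEUE_ID,
		.sched_type = RTE_SCHED_TYPE_ATOMIC
		/* ... payload etc. */
	};

	if (rte_event_enqueue_new_burst(dev_id, port_id, &ev, 1) != 1) {
		/*
		 * The query gives no guarantee, only a likelihood, so
		 * the failure path must still exist.
		 */
	}
}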