+Chris Brezovec, took part in the discussion in Dublin

> From: Van Haaren, Harry [mailto:harry.van.haa...@intel.com]
> Sent: Friday, 15 September 2023 14.59
>
> > From: Thomas Monjalon <tho...@monjalon.net>
> > Sent: Tuesday, September 5, 2023 4:40 PM
> >
> > Hello,
>
> Hi All,
>
> For context, Thomas and I (and a few others) had a brief discussion
> about this topic at userspace in Dublin earlier this week. I have a
> bit of better understanding of the problem-space, and we made some
> progress in technical solutions too.
>
> > I think we can improve the developer experience for using service cores
> > from a driver, like finding or allocating a service core.
> > We may take some code and ideas from sfc and nfp drivers,
> > like in these functions:
> >     nfp_map_service()
> >     sfc_mae_counter_service_register()
> >     sfc_get_service_lcore()
> >
> > If it is not possible to use a service core, we could default to using a control thread.
> > So the driver would never fail because of a thread initialization.
>
> There was input from a few people that "hidden threads" that their DPDK
> application doesn't know about can cause issues (e.g. a driver creating
> a thread "behind the application's back").
> I think Thomas suggested a callback function the application could
> hook-into, to either accept/decline the drivers "request" to create a
> thread.
>
> The default could be "accept" if the application doesn't hook the
> callback, allowing drivers to default to achieving work, and allowing
> power-users to manually handle specific threading-requirements. I have
> not strong preference here, just writing down the discussions and
> feedback from Userspace.
>
> > What do you think about proposing such a high level API
> > in order to get more drivers using it?
>
> I believe service-cores was required to transparently enable certain
> use-cases of HW-acceleration, Initially Eventdev/SW PMD, but it is of
> course possible for other components in DPDK to use it.
>
> I do recall some folks had concerns over "scope creep" when initially
> discussing service-cores upstreaming, and perhaps they're right.
> I'm not sure how much more functionality is desired here, vs better
> usability of the service-cores APIs. Perhaps a POC patch of the
> NFP, SFC, etc use-cases would help drive towards a code-level
> discussion?

Since the discussion in Dublin, I have given this some thought. Here's what I think...

My key objection is this: CPU cycles and CPU cores might be scarce resources. Even though a driver has some "important" work to do, the application might have other work to do, which is more important and perhaps also timing sensitive.

It is a core design principle of DPDK that the application is in full control! Drivers should not have the ability to take away resources from the application; resources should be explicitly allocated and given to the driver by the application. If I have a piece of single- or dual-core hardware, I certainly don't want any drivers to be in control of when they can spend my CPU cycles.

It should be obvious that if a driver needs extra CPU resources (in the form of dedicated CPU cores or unregistered non-EAL threads), the driver should document this special requirement.

I don't know the drivers' detailed needs for additional threads, but I guess that voluntary scheduling would be a good solution for this. And service cores are a kind of voluntary scheduling. So instead of offering a fallback to unregistered non-EAL threads for such drivers, we could require that applications using such drivers make the service cores infrastructure available to them. This would leave the application in control of CPU scheduling.

<just joking>
Now, let's introduce some of the foreseen scope creep: The service cores scheduler could be updated to support real-time requirements, so the application's real-time deadlines are not exceeded because of some driver hogging the CPU.
</just joking>

Unregistered non-EAL threads should not be used for timing sensitive or deadline-bound tasks. They might be affected by "noisy neighbors" taking away their CPU time.

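To make the "resources are explicitly allocated and given to the driver by the application" principle a bit more concrete, here is a rough sketch in plain C. Every name in it (drv_core_request, app_register_core_grant_cb, drv_request_core, app_core_pool, grant_from_pool) is hypothetical and made up for illustration; none of it is existing DPDK API. It also assumes the opposite default from the one floated in Dublin: if the application registers no hook, the request is declined rather than accepted.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical illustrations, not DPDK API. */

/* What a driver states when it needs CPU time for background work. */
struct drv_core_request {
	const char *driver_name;
	const char *purpose;
};

/* Application policy hook: returns a granted core id (>= 0) or -1 to decline. */
typedef int (*app_core_grant_cb)(const struct drv_core_request *req, void *arg);

static app_core_grant_cb grant_cb;   /* NULL until the application registers one */
static void *grant_cb_arg;

static void app_register_core_grant_cb(app_core_grant_cb cb, void *arg)
{
	grant_cb = cb;
	grant_cb_arg = arg;
}

/* Called by a driver. Without an application hook the request is declined,
 * so a driver never obtains CPU resources implicitly. */
static int drv_request_core(const struct drv_core_request *req)
{
	assert(req != NULL);
	if (grant_cb == NULL)
		return -1;
	return grant_cb(req, grant_cb_arg);
}

/* Example policy: hand out cores from a small pool the application set aside. */
struct app_core_pool {
	int cores[4];
	size_t n_free;
};

static int grant_from_pool(const struct drv_core_request *req, void *arg)
{
	struct app_core_pool *pool = arg;

	(void)req; /* a real policy could decide per driver or per purpose */
	if (pool->n_free == 0)
		return -1;
	return pool->cores[--pool->n_free];
}
```

In this sketch, a driver that gets -1 back would fail its initialization with a clear error pointing at its documented requirement, instead of quietly spawning a thread. An application on a single- or dual-core box simply registers no hook, or an empty pool.
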
Although I don't have a solution, I remain critical: We can either add a "quick fix" to continue supporting the mess introduced by these drivers, or we can stop to analyze the situation and provide a proper solution.