+Chris Brezovec, who took part in the discussion in Dublin

> From: Van Haaren, Harry [mailto:harry.van.haa...@intel.com]
> Sent: Friday, 15 September 2023 14.59
> 
> > From: Thomas Monjalon <tho...@monjalon.net>
> > Sent: Tuesday, September 5, 2023 4:40 PM
> >
> > Hello,
> 
> Hi All,
> 
> For context, Thomas and I (and a few others) had a brief discussion about
> this topic at Userspace in Dublin earlier this week. I have a somewhat better
> understanding of the problem space, and we made some progress on technical
> solutions too.
> 
> > I think we can improve the developer experience for using service cores
> > from a driver, like finding or allocating a service core.
> > We may take some code and ideas from sfc and nfp drivers,
> > like in these functions:
> >     nfp_map_service()
> >     sfc_mae_counter_service_register()
> >     sfc_get_service_lcore()
> >
> > If it is not possible to use a service core, we could default to using
> > a control thread.
> > So the driver would never fail because of a thread initialization.
> 
> There was input from a few people that "hidden threads" that their DPDK
> application doesn't know about can cause issues (e.g. a driver creating a
> thread "behind the application's back").
> I think Thomas suggested a callback function the application could hook
> into, to either accept or decline the driver's "request" to create a thread.
> 
> The default could be "accept" if the application doesn't hook the callback,
> allowing drivers to get their work done by default, and allowing power users
> to manually handle specific threading requirements. I have no strong
> preference here, just writing down the discussions and feedback from
> Userspace.
> 
> > What do you think about proposing such a high level API
> > in order to get more drivers using it?
> 
> I believe service-cores was required to transparently enable certain
> use-cases of HW-acceleration, initially Eventdev/SW PMD, but it is of course
> possible for other components in DPDK to use it.
> 
> I do recall some folks had concerns over "scope creep" when initially
> discussing service-cores upstreaming, and perhaps they're right.
> I'm not sure how much more functionality is desired here, vs better
> usability of the service-cores APIs. Perhaps a POC patch of the
> NFP, SFC, etc use-cases would help drive towards a code-level
> discussion?

Since the discussion in Dublin, I have given this some thought. Here's what I 
think...

My key objection is this: CPU cycles and CPU cores might be scarce resources. 
And even though a driver has some "important" work to do, the application might 
have some other work to do, which is more important and perhaps also timing 
sensitive. It is a core design principle of DPDK that the application is in 
full control! Drivers should not have the ability to take away resources from 
the application; resources should be explicitly allocated and given to the 
driver by the application.

If I have a piece of single- or dual-core hardware, I certainly don't want any 
drivers to be in control of when they can spend my CPU cycles.

It should be obvious that if a driver needs extra CPU resources (in the form of 
dedicated CPU cores or unregistered non-EAL threads), the driver should 
document this special requirement.

I don't know the drivers' detailed needs for additional threads, but I guess 
that voluntary scheduling would be a good solution for this, and service cores 
are a kind of voluntary scheduling. So instead of offering a fallback to 
unregistered non-EAL threads for such drivers, we could require that 
applications using such drivers make the service cores infrastructure 
available to them. This would leave the application in control of CPU 
scheduling.
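
To make the idea concrete, here is a rough sketch in plain C of the policy I
have in mind. It is only a toy model, not DPDK code: every name in it
(app_sched, app_offer_core, drv_request_work, app_run_once) is made up for
illustration, and the real service cores API looks different. The point is
that the driver can only ask for CPU time, the request fails if the
application has not offered a core, and there is no hidden fallback thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, NOT the DPDK API: all names are illustrative. */
#define MAX_GRANTS 8

typedef int (*work_fn)(void *arg);  /* one bounded slice of driver work */

struct grant {
    work_fn fn;
    void *arg;
    unsigned int core;              /* core the application assigned */
};

struct app_sched {
    unsigned int cores[MAX_GRANTS]; /* cores the app set aside for drivers */
    size_t n_cores;
    struct grant grants[MAX_GRANTS];
    size_t n_grants;
};

/* Application side: explicitly hand one core to the driver-work pool. */
static bool app_offer_core(struct app_sched *s, unsigned int core)
{
    if (s->n_cores == MAX_GRANTS)
        return false;
    s->cores[s->n_cores++] = core;
    return true;
}

/*
 * Driver side: ask for CPU time. Returns -1 if the application has not
 * offered any core (or the table is full); the driver gets no fallback
 * thread and must report the failure.
 */
static int drv_request_work(struct app_sched *s, work_fn fn, void *arg)
{
    assert(fn != NULL);
    if (s->n_cores == 0 || s->n_grants == MAX_GRANTS)
        return -1;
    struct grant *g = &s->grants[s->n_grants];
    g->fn = fn;
    g->arg = arg;
    /* Spread requests round-robin over the offered cores. */
    g->core = s->cores[s->n_grants % s->n_cores];
    s->n_grants++;
    return 0;
}

/*
 * Application side: run one pass of the driver work mapped to 'core'.
 * The application decides when (and whether) to call this.
 * Returns the number of work functions that were run.
 */
static unsigned int app_run_once(struct app_sched *s, unsigned int core)
{
    unsigned int ran = 0;
    for (size_t i = 0; i < s->n_grants; i++) {
        if (s->grants[i].core == core) {
            s->grants[i].fn(s->grants[i].arg);
            ran++;
        }
    }
    return ran;
}

/* Example driver work: bump a counter each time it is polled. */
static int count_poll(void *arg)
{
    (*(int *)arg)++;
    return 0;
}
```

With this shape, a driver calling drv_request_work() before the application
has called app_offer_core() simply gets an error, and driver work only runs
when the application calls app_run_once() on a core it chose itself.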

<just joking>
Now, let's introduce some of the foreseen scope creep: The service cores 
scheduler could be updated to support real-time requirements, so the 
application's real-time deadlines are not exceeded because of some driver 
hogging the CPU.
</just joking>

Unregistered non-EAL threads should not be used for timing sensitive or 
deadline-bound tasks. They might be affected by "noisy neighbors" taking away 
their CPU time.

Although I don't have a solution, I remain critical: We can either add a "quick 
fix" to continue supporting the mess introduced by these drivers, or we can 
stop to analyze the situation and provide a proper solution.
