> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> Sent: Monday, 3 October 2022 22.02

[...]

> The functionality provided is very useful, and the implementation is
> clever in the way it doesn't require any application modifications.
> But,
> a clever, useful brittle hack is still a brittle hack.
> 
> What if there was instead a busyness module, where the application
> would
> explicitly report what it was up to. The new library would hook up to
> telemetry just like this patchset does, plus provide an explicit API to
> retrieve lcore thread load.
> 
> The service cores framework (fancy name for rte_service.c) could also
> call the lcore load tracking module, provided all services properly
> reported back on whether or not they were doing anything useful with
> the
> cycles they just spent.
> 
> The metrics of such a load tracking module could potentially be used by
> other modules in DPDK, or by the application. It could potentially be
> used for dynamic load balancing of service core services, or for power
> management (e.g, DVFS), or for a potential future deferred-work type
> mechanism more sophisticated than current rte_service, or some green
> threads/coroutines/fiber thingy. The DSW event device could also use it
> to replace its current internal load estimation scheme.

[...]

I agree 100 % with everything Mattias wrote above, and I would like to voice my 
opinion too.

This patch is full of preconditions and assumptions. Its only true advantage 
(vs. a generic load tracking library) is that it doesn't require any 
application modifications, and thus can be deployed with zero effort.

I my opinion, it would be much better with a well designed generic load 
tracking library, to be called from the application, so it gets correct 
information about what the lcores spend their cycles doing. And as Mattias 
mentions: With the appropriate API for consumption of the collected data, it 
could also provide actionable statistics for use by the application itself, not 
just telemetry. ("Actionable statistics": Statistics that is directly usable 
for decision making.)

There is also the aspect of time-to-benefit: This patch immediately provides 
benefits (to the users of the DPDK applications that meet the 
preconditions/assumptions of the patch), while a generic load tracking library 
will take years to get integrated into applications before it provides benefits 
(to the users of the DPDK applications that use the new library).

So, we should ask ourselves: Do we want an application-specific solution with a 
short time-to-benefit, or a generic solution with a long time-to-benefit? (I 
use the term "application specific" because not all applications can be tweaked 
to provide meaningful data with this patch. You might also label a generic 
library "application specific", because it requires that the application uses 
the library - however that is a common requirement of all DPDK libraries.)

Furthermore, if the proposed patch is primarily for the benefit of OVS, I 
suppose that calls to a generic load tracking library could be added to OVS 
within a relatively short time frame (although not as quick as this patch).

I guess that the developers of this patch initially thought that it was generic 
and usable for the majority of applications, and it came as somewhat a surprise 
that it wasn't as generic as expected. The DPDK community has a good review 
process with open discussions and sharing of thoughts and ideas. Sometimes, an 
idea doesn't fly, because the corner cases turn out to be more common than 
expected. I'm sorry to say it, but I think that is the case for this patch. :-(

-Morten

Reply via email to