On Wed, 3 May 2023 15:32:27 +0200 Morten Brørup <m...@smartsharesystems.com> wrote:
> > From: David Coyle [mailto:david.co...@intel.com]
> > Sent: Wednesday, 3 May 2023 13.39
> >
> > This is NOT for upstreaming. This is being submitted to allow early
> > comparison testing with the preferred solution, which will add TPAUSE
> > power management support to the ring library through the addition of
> > callbacks. Initial stages of the preferred solution are available at
> > http://dpdk.org/patch/125454.
> >
> > This patch adds functionality directly to the rte_ring_dequeue functions
> > to monitor empty reads of the ring. When a configurable number of
> > empty reads is reached, a TPAUSE instruction is triggered by using
> > rte_power_pause() on supported architectures; rte_pause() is used on
> > other architectures. The functionality can be included or excluded at
> > compilation time using the RTE_RING_PMGMT flag. If included, the new
> > API can be used to enable/disable the feature on a per-ring basis.
> > Other related settings can also be configured using the API.
>
> I don't understand why DPDK developers keep spending time trying to invent
> methods to determine application busyness based on entry/exit points in a
> variety of libraries, when the application is in a much better position to
> determine busyness. All of these "busyness measuring" library extensions
> have their own specific assumptions and weird limitations.
>
> I do understand that the goal is power saving, which certainly is relevant!
> I only criticize the measuring methods.
>
> For reference, we implemented something very simple in our application
> framework:
> 1. When each pipeline stage has completed a burst, it reports whether it
> was busy or not.
> 2. If the pipeline busyness is low, we take a nap to save some power.
>
> And here is the magic twist to this simple algorithm:
> 3. A pipeline stage is not considered busy unless it processed a full
> burst, and is ready to process more packets immediately.
> This interpretation of busyness has a significant impact on the percentage
> of time spent napping during the low-traffic hours.
>
> This algorithm was very quickly implemented. It might not be perfect, and
> we do intend to improve it (also to determine CPU utilization on a scale
> that the end user can translate to a linear interpretation of how busy the
> system is). But I seriously doubt that any of the proposed "busyness
> measuring" library extensions are any better.
>
> So: The application knows better. Please spend your precious time on
> something useful instead.

Agreed with Morten: this is not the place to add more code.

Does this have an applicable use case in common DPDK applications like OVS,
VPP, or even Contrail, or is it more of an intern experiment? The existing
l3fwd is an example of how to do power management in the application.
Switching to interrupt mode lets the kernel help out, and the kernel will
know how to handle multicore, etc.

> @David, my outburst is not directed at you specifically. Generally, I do
> appreciate experimenting as a good way of obtaining knowledge. So thank
> you for sharing your experiments with this audience!
>
> PS: If cruft can be disabled at build time, I generally don't oppose it.

Cruft increases the test coverage surface, creates technical debt, and makes
Linux distro maintainers upset.
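For readers following along, the mechanism the patch describes (count consecutive empty dequeues, back off once a threshold is crossed) boils down to something like the sketch below. This is illustrative only: `ring_pm_state`, `ring_pm_on_dequeue` and `pause_hook` are made-up names, and `pause_hook` stands in for the real `rte_power_pause()`/`rte_pause()` calls, which are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-ring power-management state; not the patch's API. */
struct ring_pm_state {
    unsigned int empty_reads;   /* consecutive empty dequeues observed */
    unsigned int threshold;     /* empty reads allowed before pausing  */
    unsigned int pauses_issued; /* counted here only for observation   */
};

/* Stand-in for rte_power_pause() (TPAUSE) or rte_pause(). */
static void pause_hook(struct ring_pm_state *s)
{
    s->pauses_issued++;
}

/* Called after every dequeue attempt; n is the number of objects
 * actually dequeued. Mirrors the "monitor empty reads" idea. */
void ring_pm_on_dequeue(struct ring_pm_state *s, unsigned int n)
{
    if (n != 0) {
        s->empty_reads = 0;     /* ring had work: reset the counter */
        return;
    }
    if (++s->empty_reads >= s->threshold) {
        pause_hook(s);          /* back off to save power */
        s->empty_reads = 0;
    }
}
```

With a threshold of 4, eight consecutive empty dequeues trigger two pauses, and any successful dequeue resets the counter.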
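Morten's three-step application-level scheme can likewise be sketched in a few lines. Everything below is an assumption-laden illustration, not his actual framework code: `BURST_SIZE`, the 25% busyness cutoff, and `nap()` (which in practice would be a short sleep or TPAUSE) are all invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define BURST_SIZE 32           /* illustrative burst size */

static unsigned int naps_taken; /* observation counter for this sketch */

/* Stand-in for the actual nap, e.g. usleep() or a TPAUSE-based wait. */
static void nap(void) { naps_taken++; }

/* Steps 1 and 3: a stage reports busy only if it processed a FULL
 * burst AND is ready to process more packets immediately. */
bool stage_busy(unsigned int pkts_processed, bool more_ready)
{
    return pkts_processed == BURST_SIZE && more_ready;
}

/* Step 2: after one pass over all stages, nap if busyness is low.
 * The <25% cutoff is an arbitrary choice for this sketch.
 * Returns true if a nap was taken. */
bool maybe_nap(unsigned int busy_stages, unsigned int total_stages)
{
    if (busy_stages * 4 < total_stages) {
        nap();
        return true;
    }
    return false;
}
```

The "magic twist" shows up in `stage_busy()`: a stage that drained its queue with a partial burst is idle by definition, which is what drives the napping percentage up during low-traffic hours.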