<snip>
> Subject: RE: [PATCH v1 1/2] eal: add lcore busyness telemetry
>
> > From: Anatoly Burakov [mailto:anatoly.bura...@intel.com]
> > Sent: Friday, 15 July 2022 15.13
> >
> > Currently, there is no way to measure lcore busyness in a passive way,
> > without any modifications to the application. This patch adds a new
> > EAL API that will be able to passively track core busyness.
> >
> > The busyness is calculated by relying on the fact that most DPDK API's
> > will poll for packets.
>
> This is an "alternative fact"! Only run-to-completion applications poll
> for RX. Pipelined applications do not poll for packets in every pipeline
> stage.

I guess you meant polling for packets from the NIC. Pipeline stages still
need to receive packets from queues, so we could do a similar thing for the
rte_ring APIs.

>
> > Empty polls can be counted as "idle", while non-empty polls can be
> > counted as busy. To measure lcore busyness, we simply call the
> > telemetry timestamping function with the number of polls a particular
> > code section has processed, and count the number of cycles we've spent
> > processing empty bursts. The more empty bursts we encounter, the less
> > cycles we spend in "busy" state, and the less core busyness will be
> > reported.
> >
> > In order for all of the above to work without modifications to the
> > application, the library code needs to be instrumented with calls to
> > the lcore telemetry busyness timestamping function. The following
> > parts of DPDK are instrumented with lcore telemetry calls:
> >
> > - All major driver API's:
> >   - ethdev
> >   - cryptodev
> >   - compressdev
> >   - regexdev
> >   - bbdev
> >   - rawdev
> >   - eventdev
> >   - dmadev
> > - Some additional libraries:
> >   - ring
> >   - distributor
> >
> > To avoid performance impact from having lcore telemetry support, a
> > global variable is exported by EAL, and a call to timestamping
> > function is wrapped into a macro, so that whenever telemetry is
> > disabled, it only takes one additional branch and no function calls
> > are performed. It is also possible to disable it at compile time by
> > commenting out RTE_LCORE_BUSYNESS from build config.
>
> Since all of this can be completely disabled at build time, and thus has
> exactly zero performance impact, I will not object to this patch.
>
> >
> > This patch also adds a telemetry endpoint to report lcore busyness, as
> > well as telemetry endpoints to enable/disable lcore telemetry.
> >
> > Signed-off-by: Kevin Laatz <kevin.la...@intel.com>
> > Signed-off-by: Conor Walsh <conor.wa...@intel.com>
> > Signed-off-by: David Hunt <david.h...@intel.com>
> > Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com>
> > ---
> >
> > Notes:
> >     We did a couple of quick smoke tests to see if this patch causes
> >     any performance degradation, and it seemed to have none that we
> >     could measure. Telemetry can be disabled at compile time via a
> >     config option, while at runtime it can be disabled, seemingly at a
> >     cost of one additional branch.
> >
> >     That said, our benchmarking efforts were admittedly not very
> >     rigorous, so comments welcome!
>
> This patch does not reflect lcore busyness, it reflects some sort of
> ingress activity level.
>
> All the considerations regarding non-intrusiveness and low overhead are
> good, but everything in this patch needs to be renamed to reflect what it
> truly does, so it is clear that pipelined applications cannot use this
> telemetry for measuring lcore busyness (except on the ingress pipeline
> stage).
>
> It's a shame that so much effort clearly has gone into this patch, and no
> one stopped to consider pipelined applications. :-(
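
To make the discussion a bit more concrete, here is a rough sketch in C of
how I read the mechanism described in the commit message: a timestamping
call that receives the number of items a poll returned, guarded by a macro
that checks a global flag. Every name below (the sketch_* functions, the
SKETCH_* macros, the struct) is made up for illustration and is not the
patch's actual API. The cycle values are passed in by hand instead of being
read from a real cycle counter, and attributing the time between two calls
to the result of the earlier poll is my own simplifying assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compile-time switch, standing in for RTE_LCORE_BUSYNESS. */
#define SKETCH_LCORE_BUSYNESS 1

/* Hypothetical per-lcore accounting state. */
struct sketch_lcore_stats {
	uint64_t last_ts;      /* cycle count at the previous timestamp call */
	uint64_t busy_cycles;  /* cycles that followed a non-empty poll */
	uint64_t total_cycles; /* all cycles seen between timestamp calls */
	int last_was_busy;     /* did the previous poll return any items? */
};

/* Hypothetical runtime switch, exported as a global in the real design. */
static int sketch_telemetry_enabled = 1;

/*
 * Account for the cycles elapsed since the previous call. They count as
 * busy only if the previous poll returned work (assumption of this sketch).
 */
static void
sketch_timestamp(struct sketch_lcore_stats *s, uint64_t now,
		 unsigned int nb_items)
{
	uint64_t delta;

	assert(now >= s->last_ts);
	delta = now - s->last_ts;
	s->total_cycles += delta;
	if (s->last_was_busy)
		s->busy_cycles += delta;
	s->last_ts = now;
	s->last_was_busy = (nb_items > 0);
}

/* One branch when disabled at runtime, nothing at all when compiled out. */
#ifdef SKETCH_LCORE_BUSYNESS
#define SKETCH_TIMESTAMP(s, now, nb) do {			\
	if (sketch_telemetry_enabled)				\
		sketch_timestamp((s), (now), (nb));		\
} while (0)
#else
#define SKETCH_TIMESTAMP(s, now, nb) do { } while (0)
#endif

/* Busy share of the observed cycles, as an integer percentage. */
static unsigned int
sketch_busyness_percent(const struct sketch_lcore_stats *s)
{
	if (s->total_cycles == 0)
		return 0;
	return (unsigned int)(s->busy_cycles * 100 / s->total_cycles);
}

/*
 * Four polls with hand-picked cycle values. In a pipeline stage the item
 * count would come from a queue dequeue rather than from an RX burst.
 */
unsigned int
sketch_demo(void)
{
	struct sketch_lcore_stats s = {0};

	SKETCH_TIMESTAMP(&s, 100, 0); /* empty poll */
	SKETCH_TIMESTAMP(&s, 200, 4); /* 4 items, processing starts */
	SKETCH_TIMESTAMP(&s, 500, 0); /* 300 cycles were spent on them */
	SKETCH_TIMESTAMP(&s, 600, 0); /* idle again */
	return sketch_busyness_percent(&s);
}
```

In this toy run 300 of the 600 observed cycles follow a non-empty poll, so
sketch_demo() returns 50. Whether such a number deserves the name
"busyness" depends entirely on which call sites feed the counter: if only
the NIC RX path is instrumented, it is an ingress activity level, and if
the queue dequeue of each pipeline stage feeds it too, it gets closer to
what the name promises.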