> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Hideyuki Yamashita
> Sent: Wednesday, November 25, 2020 6:40 AM
> 
> Hello,
> 
> Following are the work items planned for 21.02 from NTT TechnoCross:
> I will try to post patch set after 20.11 is released.
> 
> ---
> 1) Introduce API stats function
> In general, DPDK application consumes CPU usage because it polls
> incoming packets using rx_burst API in infinite loop.
> This makes difficult to estimate how much CPU usage is really
> used to send/receive packets by the DPDK application.
> 
> For example, even if no incoming packets arriving, CPU usage
> looks nearly 100% when observed by top command.
> 
> It is beneficial if developers can observe real CPU usage of the
> DPDK application.
> Such information can be exported to a monitoring application like
> Prometheus/Grafana, which can show CPU usage graphically.

This would be very beneficial.

Unfortunately, this does not seem so simple for applications like the
SmartShare StraightShaper, which is not a simple packet forwarding application
but has multiple pipeline stages. Our application also keeps some packets in
queues for shaping purposes, so the number of packets transmitted does not
match the number of packets received within a given time interval.

> 
> To achieve above, this patch set provides apistats functionality.
> apistats provides the following two counters for each lcore.
> - rx_burst_counts[RTE_MAX_LCORE]
> - tx_burst_counts[RTE_MAX_LCORE]
> Those accumulates rx_burst/tx_burst counts since the application
> starts.
> 
> By using those values, developers can roughly estimate CPU usage.
> Let us assume a DPDK application is simply forwarding packets.
> It calls tx_burst only if it receive packets.
> If rx_burst_counts=1000 and tx_burst_count=1000 during certain
> period of time, one can assume CPU usage is 100%.
> If rx_burst_counts=1000 and tx_burst_count=100 during certain
> period of time, one can assume CPU usage is 10%.
> Here we assume that tx_burst_count equals the number of rx_burst
> calls that really receive incoming packets.

I am not sure I understand what is being counted in these counters. Is it the
number of packets in the bursts, or the number of invocations of the
rx_burst/tx_burst functions?
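
If it is the number of invocations, my reading of the proposed estimate is
roughly the sketch below. The function name and the integer-percent convention
are my own assumptions for illustration, not part of the patch set:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the proposed patch set.
 * Estimates CPU usage (in percent) for one lcore over an interval,
 * assuming both arguments are counter deltas for that interval and
 * that tx_burst is called only when rx_burst returned packets.
 */
static unsigned int
estimate_usage_percent(uint64_t rx_burst_calls, uint64_t tx_burst_calls)
{
	if (rx_burst_calls == 0)
		return 0; /* no polling observed in the interval */
	return (unsigned int)(tx_burst_calls * 100 / rx_burst_calls);
}
```

With the numbers from the proposal, 1000 rx_burst calls and 100 tx_burst calls
give 10, and 1000 of each give 100.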


Here are some data from our purpose-built profiler, illustrating how nonlinear
this really is. These data are from a SmartShare appliance in live production
at an ISP. I hope you find them useful:

Rx_burst uses ca. 40 CPU cycles if there are no packets, ca. 260 cycles if 
there is one packet, and down to ca. 40 cycles per packet for a burst of many 
packets.

Tx_burst uses ca. 350 cycles for one packet, and down to ca. 20 cycles per 
packet for a burst of many packets.

One of our intermediate pipeline stages (which is not receiving or
transmitting packets, only processing them) uses ca. 150 cycles for a burst of
one packet, and down to ca. 110 cycles for a burst of many packets.
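
To make the nonlinearity concrete, here is a rough sketch using the rx/tx
figures above. The model is a simplifying assumption of mine: every non-empty
poll returns exactly one packet, which is sent with one tx_burst call, and all
other application processing is ignored. The function name is hypothetical.

```c
#include <stdint.h>

/* Approximate per-call costs in CPU cycles, taken from the figures above. */
#define RX_EMPTY_CYCLES   40u  /* rx_burst returning no packets */
#define RX_ONE_PKT_CYCLES 260u /* rx_burst returning one packet */
#define TX_ONE_PKT_CYCLES 350u /* tx_burst sending one packet */

/*
 * Hypothetical model: out of rx_calls polls, busy_calls return exactly
 * one packet each, and each such packet is sent with one tx_burst call.
 * Returns the share (in percent) of rx/tx cycles spent in busy polls.
 */
static unsigned int
busy_cycle_share_percent(uint64_t rx_calls, uint64_t busy_calls)
{
	uint64_t idle, busy;

	if (rx_calls == 0 || busy_calls > rx_calls)
		return 0; /* nothing measured, or inconsistent input */

	idle = (rx_calls - busy_calls) * RX_EMPTY_CYCLES;
	busy = busy_calls * (RX_ONE_PKT_CYCLES + TX_ONE_PKT_CYCLES);
	return (unsigned int)(busy * 100 / (idle + busy));
}
```

For 1000 polls of which 100 carry a packet, the call-count ratio is 10%, while
this model puts 61000 of 97000 cycles, roughly 63%, in the busy polls (the
function returns 62 because of integer truncation).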


Nevertheless, your suggested API might be usable by simple 
ingress->routing->egress applications. So don’t let me discourage you!

