On Thu, Jun 18, 2020 at 9:33 AM Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> wrote: > > Thanks Jerin for the feedback > > > -----Original Message----- > > From: Jerin Jacob <jerinjac...@gmail.com> > > Sent: Wednesday, June 17, 2020 10:16 AM > > To: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > Cc: dpdk-dev <dev@dpdk.org>; Ali Alnubani <alia...@mellanox.com>; > > orgerl...@mellanox.com; Wenzhuo Lu <wenzhuo...@intel.com>; Beilei Xing > > <beilei.x...@intel.com>; Bernard Iremonger <bernard.iremon...@intel.com>; > > hemant.agra...@nxp.com; jer...@marvell.com; Slava Ovsiienko > > <viachesl...@mellanox.com>; tho...@monjalon.net; Ruifeng Wang > > <ruifeng.w...@arm.com>; Phil Yang <phil.y...@arm.com>; nd > > <n...@arm.com>; Zhihong Wang <zhihong.w...@intel.com>; dpdk stable > > <sta...@dpdk.org> > > Subject: Re: [dpdk-dev] [PATCH 1/5] app/testpmd: clock gettime call in > > throughput calculation > > > > On Wed, Jun 17, 2020 at 8:13 PM Honnappa Nagarahalli > > <honnappa.nagaraha...@arm.com> wrote: > > > > > > The throughput calculation requires a counter that measures passing of > > > time. The PMU cycle counter does not do that. This > > > > > > It is not clear from git commit on why PMU cycle counter does not do that? > > On dpdk bootup, we are figuring out the Hz value based on PMU counter > > cycles. > > What is the missing piece here? > As I understand Linux kernel saves the PMU state and restores it every time a > thread is scheduled out and in. So, when the thread is scheduled out the PMU > cycles are not counted towards that thread. The thread that prints the > statistics issues good amount of system calls and I am guessing it is getting > scheduled out. So, it is reporting very low cycle count.
OK. Probably add this info in git commit. > > > > IMO, PMU counter should have less latency and more granularity than > > clock_getime. > In general, agree. In this particular calculation the granularity has not > mattered much (for ex: numbers are fine with 50Mhz generic counter and 2.5Ghz > CPU). The latency also does not matter as it is getting amortized over a > large number of packets. So, I do not see it affecting the reported PPS/BPS > numbers. Reasonable to use clock_gettime for the control thread. > > > > > > results in incorrect throughput numbers when > > RTE_ARM_EAL_RDTSC_USE_PMU > > > is enabled. Use clock_gettime system call to calculate the time passed > > > since last call. > > > > > > Bugzilla ID: 450 > > > Fixes: 0e106980301d ("app/testpmd: show throughput in port stats") > > > Cc: zhihong.w...@intel.com > > > Cc: sta...@dpdk.org > > > > > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > > Reviewed-by: Phil Yang <phil.y...@arm.com> > > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com> > > > --- > > > app/test-pmd/config.c | 44 > > > +++++++++++++++++++++++++++++-------------- > > > 1 file changed, 30 insertions(+), 14 deletions(-) > > > > > > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index > > > 016bcb09c..91fbf99f8 100644 > > > --- a/app/test-pmd/config.c > > > +++ b/app/test-pmd/config.c > > > @@ -54,6 +54,14 @@ > > > > > > #define ETHDEV_FWVERS_LEN 32 > > > > > > +#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */ #define > > > +CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW #else #define CLOCK_TYPE_ID > > > +CLOCK_MONOTONIC #endif > > > + > > > +#define NS_PER_SEC 1E9 > > > + > > > static char *flowtype_to_str(uint16_t flow_type); > > > > > > static const struct { > > > @@ -136,9 +144,10 @@ nic_stats_display(portid_t port_id) > > > static uint64_t prev_pkts_tx[RTE_MAX_ETHPORTS]; > > > static uint64_t prev_bytes_rx[RTE_MAX_ETHPORTS]; > > > static uint64_t prev_bytes_tx[RTE_MAX_ETHPORTS]; > > > - static uint64_t prev_cycles[RTE_MAX_ETHPORTS]; > > > + static uint64_t prev_ns[RTE_MAX_ETHPORTS]; > > > + struct timespec cur_time; > > > uint64_t diff_pkts_rx, diff_pkts_tx, diff_bytes_rx, diff_bytes_tx, > > > - > > > diff_cycles; > > > + > > > + diff_ns; > > > uint64_t mpps_rx, mpps_tx, mbps_rx, mbps_tx; > > > struct rte_eth_stats stats; > > > struct rte_port *port = &ports[port_id]; @@ -195,10 +204,17 @@ > > > nic_stats_display(portid_t port_id) > > > } > > > } > > > > > > - diff_cycles = prev_cycles[port_id]; > > > - prev_cycles[port_id] = rte_rdtsc(); > > > - if (diff_cycles > 0) > > > - diff_cycles = prev_cycles[port_id] - diff_cycles; > > > + diff_ns = 0; > > > + if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) { > > > + uint64_t ns; > > > + > > > + ns = cur_time.tv_sec * NS_PER_SEC; > > > + ns += cur_time.tv_nsec; > > > + > > > + if (prev_ns[port_id] != 0) > > > + diff_ns = ns - prev_ns[port_id]; > > > + prev_ns[port_id] = ns; > > > + } > > > > > > diff_pkts_rx = (stats.ipackets > prev_pkts_rx[port_id]) ? > > > (stats.ipackets - prev_pkts_rx[port_id]) : 0; @@ > > > -206,10 +222,10 @@ nic_stats_display(portid_t port_id) > > > (stats.opackets - prev_pkts_tx[port_id]) : 0; > > > prev_pkts_rx[port_id] = stats.ipackets; > > > prev_pkts_tx[port_id] = stats.opackets; > > > - mpps_rx = diff_cycles > 0 ? > > > - diff_pkts_rx * rte_get_tsc_hz() / diff_cycles : 0; > > > - mpps_tx = diff_cycles > 0 ? > > > - diff_pkts_tx * rte_get_tsc_hz() / diff_cycles : 0; > > > + mpps_rx = diff_ns > 0 ? > > > + (double)diff_pkts_rx / diff_ns * NS_PER_SEC : 0; > > > + mpps_tx = diff_ns > 0 ? > > > + (double)diff_pkts_tx / diff_ns * NS_PER_SEC : 0; > > > > > > diff_bytes_rx = (stats.ibytes > prev_bytes_rx[port_id]) ? > > > (stats.ibytes - prev_bytes_rx[port_id]) : 0; @@ > > > -217,10 +233,10 @@ nic_stats_display(portid_t port_id) > > > (stats.obytes - prev_bytes_tx[port_id]) : 0; > > > prev_bytes_rx[port_id] = stats.ibytes; > > > prev_bytes_tx[port_id] = stats.obytes; > > > - mbps_rx = diff_cycles > 0 ? > > > - diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0; > > > - mbps_tx = diff_cycles > 0 ? > > > - diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0; > > > + mbps_rx = diff_ns > 0 ? > > > + (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0; > > > + mbps_tx = diff_ns > 0 ? > > > + (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0; > > > > > > printf("\n Throughput (since last show)\n"); > > > printf(" Rx-pps: %12"PRIu64" Rx-bps: %12"PRIu64"\n Tx- > > pps: %12" > > > -- > > > 2.17.1 > > >