On 7/6/2020 5:53 PM, Honnappa Nagarahalli wrote: > Hi Ferruh, > Thanks for your comments. > > <snip> > >> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput >> calculation >> >> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote: >>> The throughput calculation requires a counter that measures passing of >>> time. However, the kernel saves and restores the PMU state when a >>> thread is unscheduled and scheduled. This ensures that the PMU cycles >>> are not counted towards a thread that is not scheduled. Hence, when >>> RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not >> represent >>> the passing of time. >> >> Does this mean 'rte_rdtsc()' is broken for Arm? > It depends on the kernel I think. IMO, for isolated CPUs it should be fine. > It is currently getting fixed through a kernel patch. Once the kernel patch > is up streamed, we will make the required changes in DPDK. > >> Wouldn't it cause more serious issue than testpmd throughput stats? > Within DPDK, it does not seem to be used for anything critical (I have fixed > one).
OK, thanks for clarification. > >> >>> This results in incorrect calculation of throughput numbers. >>> Use clock_gettime system call to calculate the time passed since last >>> call. >>> >>> Bugzilla ID: 450 >>> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats") >>> Cc: zhihong.w...@intel.com >>> Cc: sta...@dpdk.org >>> >>> Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> >>> Reviewed-by: Phil Yang <phil.y...@arm.com> >>> Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com> >>> Tested-by: Phil Yang <phil.y...@arm.com> >>> Tested-by: Ali Alnubani <alia...@mellanox.com> >> >> <...> >> >>> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id) >>> } >>> } >>> >>> - diff_cycles = prev_cycles[port_id]; >>> - prev_cycles[port_id] = rte_rdtsc(); >>> - if (diff_cycles > 0) >>> - diff_cycles = prev_cycles[port_id] - diff_cycles; >>> + diff_ns = 0; >>> + if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) { >> >> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I >> think it is >> OK. >> >> <...> >> >>> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id) >>> (stats.obytes - prev_bytes_tx[port_id]) : 0; >>> prev_bytes_rx[port_id] = stats.ibytes; >>> prev_bytes_tx[port_id] = stats.obytes; >>> - mbps_rx = diff_cycles > 0 ? >>> - diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0; >>> - mbps_tx = diff_cycles > 0 ? >>> - diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0; >>> + mbps_rx = diff_ns > 0 ? >>> + (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0; >>> + mbps_tx = diff_ns > 0 ? >>> + (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0; >> >> The calculation also fixes other issue in the stats. >> With previous method, if the sampling between two stats is a little long, >> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'. >> >> >> Reviewed-by: Ferruh Yigit <ferruh.yi...@intel.com>