On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
> The throughput calculation requires a counter that measures
> passing of time. However, the kernel saves and restores the PMU
> state when a thread is unscheduled and scheduled. This ensures
> that the PMU cycles are not counted towards a thread that is
> not scheduled. Hence, when RTE_ARM_EAL_RDTSC_USE_PMU is enabled,
> the PMU cycles do not represent the passing of time.

Does this mean 'rte_rdtsc()' is broken for Arm?
Wouldn't it cause more serious issue than testpmd throughput stats?

> This results in incorrect calculation of throughput numbers.
> Use clock_gettime system call to calculate the time passed since
> last call.
> 
> Bugzilla ID: 450
> Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> Cc: zhihong.w...@intel.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Reviewed-by: Phil Yang <phil.y...@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> Tested-by: Phil Yang <phil.y...@arm.com>
> Tested-by: Ali Alnubani <alia...@mellanox.com>

<...>

> @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
>               }
>       }
>  
> -     diff_cycles = prev_cycles[port_id];
> -     prev_cycles[port_id] = rte_rdtsc();
> -     if (diff_cycles > 0)
> -             diff_cycles = prev_cycles[port_id] - diff_cycles;
> +     diff_ns = 0;
> +     if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {

I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think
it is OK.

<...>

> @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
>               (stats.obytes - prev_bytes_tx[port_id]) : 0;
>       prev_bytes_rx[port_id] = stats.ibytes;
>       prev_bytes_tx[port_id] = stats.obytes;
> -     mbps_rx = diff_cycles > 0 ?
> -             diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> -     mbps_tx = diff_cycles > 0 ?
> -             diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> +     mbps_rx = diff_ns > 0 ?
> +             (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> +     mbps_tx = diff_ns > 0 ?
> +             (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;

The calculation also fixes other issue in the stats.
With previous method, if the sampling between two stats is a little long,
"diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.


Reviewed-by: Ferruh Yigit <ferruh.yi...@intel.com>

Reply via email to