On Thu, Jun 18, 2020 at 9:33 AM Honnappa Nagarahalli
<honnappa.nagaraha...@arm.com> wrote:
>
> Thanks Jerin for the feedback
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjac...@gmail.com>
> > Sent: Wednesday, June 17, 2020 10:16 AM
> > To: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > Cc: dpdk-dev <dev@dpdk.org>; Ali Alnubani <alia...@mellanox.com>;
> > orgerl...@mellanox.com; Wenzhuo Lu <wenzhuo...@intel.com>; Beilei Xing
> > <beilei.x...@intel.com>; Bernard Iremonger <bernard.iremon...@intel.com>;
> > hemant.agra...@nxp.com; jer...@marvell.com; Slava Ovsiienko
> > <viachesl...@mellanox.com>; tho...@monjalon.net; Ruifeng Wang
> > <ruifeng.w...@arm.com>; Phil Yang <phil.y...@arm.com>; nd
> > <n...@arm.com>; Zhihong Wang <zhihong.w...@intel.com>; dpdk stable
> > <sta...@dpdk.org>
> > Subject: Re: [dpdk-dev] [PATCH 1/5] app/testpmd: clock gettime call in
> > throughput calculation
> >
> > On Wed, Jun 17, 2020 at 8:13 PM Honnappa Nagarahalli
> > <honnappa.nagaraha...@arm.com> wrote:
> > >
> > > The throughput calculation requires a counter that measures the passing
> > > of time. The PMU cycle counter does not do that. This
> >
> >
> > It is not clear from the git commit why the PMU cycle counter does not do that.
> > On DPDK bootup, we figure out the Hz value based on PMU counter cycles.
> > What is the missing piece here?
> As I understand it, the Linux kernel saves and restores the PMU state every
> time a thread is scheduled out and back in. So, while the thread is scheduled
> out, the PMU cycles are not counted towards that thread. The thread that
> prints the statistics issues a good number of system calls and, I am guessing,
> gets scheduled out as a result. Hence it reports a very low cycle count.

OK. Probably add this info to the git commit message.
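
For reference, a rough sketch (my own, untested; assumes an EAL-initialized
process built with RTE_ARM_EAL_RDTSC_USE_PMU) that should make the effect
visible by comparing the two clocks across a blocking sleep:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

#include <rte_eal.h>
#include <rte_cycles.h>

int
main(int argc, char **argv)
{
	struct timespec t0, t1;
	uint64_t c0, c1;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	c0 = rte_rdtsc();

	sleep(1); /* thread is scheduled out here */

	c1 = rte_rdtsc();
	clock_gettime(CLOCK_MONOTONIC, &t1);

	/* Wall time vs. time derived from rte_rdtsc(); with the PMU-based
	 * counter the second figure is expected to come out much smaller,
	 * since cycles are not counted while the thread is scheduled out. */
	printf("clock_gettime: %.3f s\n",
	       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	printf("rte_rdtsc:     %.3f s\n",
	       (double)(c1 - c0) / rte_get_tsc_hz());

	return 0;
}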

> >
> > IMO, the PMU counter should have lower latency and finer granularity than
> > clock_gettime.
> In general, agree. For this particular calculation the granularity has not
> mattered much (for example, the numbers are fine with a 50 MHz generic counter
> versus a 2.5 GHz CPU). The latency does not matter either, as it is amortized
> over a large number of packets. So, I do not see it affecting the reported
> PPS/BPS numbers.

Reasonable to use clock_gettime for the control thread.
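
To put rough numbers on the granularity point (my own back-of-the-envelope
math, using the 50 MHz example above): the generic counter ticks every 20 ns,
so over a ~1 second stats interval a one-tick error is 20 ns / 1 s = 2e-8 of
the interval, i.e. roughly 0.2 pps at 10 Mpps, far below anything visible in
the printed pps/bps values.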

>
> >
> > > results in incorrect throughput numbers when RTE_ARM_EAL_RDTSC_USE_PMU
> > > is enabled. Use the clock_gettime system call to calculate the time
> > > passed since the last call.
> > >
> > > Bugzilla ID: 450
> > > Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> > > Cc: zhihong.w...@intel.com
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > Reviewed-by: Phil Yang <phil.y...@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > > ---
> > >  app/test-pmd/config.c | 44 +++++++++++++++++++++++++++++--------------
> > >  1 file changed, 30 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> > > index 016bcb09c..91fbf99f8 100644
> > > --- a/app/test-pmd/config.c
> > > +++ b/app/test-pmd/config.c
> > > @@ -54,6 +54,14 @@
> > >
> > >  #define ETHDEV_FWVERS_LEN 32
> > >
> > > +#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
> > > +#define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
> > > +#else
> > > +#define CLOCK_TYPE_ID CLOCK_MONOTONIC
> > > +#endif
> > > +
> > > +#define NS_PER_SEC 1E9
> > > +
> > >  static char *flowtype_to_str(uint16_t flow_type);
> > >
> > >  static const struct {
> > > @@ -136,9 +144,10 @@ nic_stats_display(portid_t port_id)
> > >         static uint64_t prev_pkts_tx[RTE_MAX_ETHPORTS];
> > >         static uint64_t prev_bytes_rx[RTE_MAX_ETHPORTS];
> > >         static uint64_t prev_bytes_tx[RTE_MAX_ETHPORTS];
> > > -       static uint64_t prev_cycles[RTE_MAX_ETHPORTS];
> > > +       static uint64_t prev_ns[RTE_MAX_ETHPORTS];
> > > +       struct timespec cur_time;
> > >         uint64_t diff_pkts_rx, diff_pkts_tx, diff_bytes_rx, diff_bytes_tx,
> > > -                                                               diff_cycles;
> > > +                                                               diff_ns;
> > >         uint64_t mpps_rx, mpps_tx, mbps_rx, mbps_tx;
> > >         struct rte_eth_stats stats;
> > >         struct rte_port *port = &ports[port_id];
> > > @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
> > >                 }
> > >         }
> > >
> > > -       diff_cycles = prev_cycles[port_id];
> > > -       prev_cycles[port_id] = rte_rdtsc();
> > > -       if (diff_cycles > 0)
> > > -               diff_cycles = prev_cycles[port_id] - diff_cycles;
> > > +       diff_ns = 0;
> > > +       if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
> > > +               uint64_t ns;
> > > +
> > > +               ns = cur_time.tv_sec * NS_PER_SEC;
> > > +               ns += cur_time.tv_nsec;
> > > +
> > > +               if (prev_ns[port_id] != 0)
> > > +                       diff_ns = ns - prev_ns[port_id];
> > > +               prev_ns[port_id] = ns;
> > > +       }
> > >
> > >         diff_pkts_rx = (stats.ipackets > prev_pkts_rx[port_id]) ?
> > >                 (stats.ipackets - prev_pkts_rx[port_id]) : 0;
> > > @@ -206,10 +222,10 @@ nic_stats_display(portid_t port_id)
> > >                 (stats.opackets - prev_pkts_tx[port_id]) : 0;
> > >         prev_pkts_rx[port_id] = stats.ipackets;
> > >         prev_pkts_tx[port_id] = stats.opackets;
> > > -       mpps_rx = diff_cycles > 0 ?
> > > -               diff_pkts_rx * rte_get_tsc_hz() / diff_cycles : 0;
> > > -       mpps_tx = diff_cycles > 0 ?
> > > -               diff_pkts_tx * rte_get_tsc_hz() / diff_cycles : 0;
> > > +       mpps_rx = diff_ns > 0 ?
> > > +               (double)diff_pkts_rx / diff_ns * NS_PER_SEC : 0;
> > > +       mpps_tx = diff_ns > 0 ?
> > > +               (double)diff_pkts_tx / diff_ns * NS_PER_SEC : 0;
> > >
> > >         diff_bytes_rx = (stats.ibytes > prev_bytes_rx[port_id]) ?
> > >                 (stats.ibytes - prev_bytes_rx[port_id]) : 0;
> > > @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
> > >                 (stats.obytes - prev_bytes_tx[port_id]) : 0;
> > >         prev_bytes_rx[port_id] = stats.ibytes;
> > >         prev_bytes_tx[port_id] = stats.obytes;
> > > -       mbps_rx = diff_cycles > 0 ?
> > > -               diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> > > -       mbps_tx = diff_cycles > 0 ?
> > > -               diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> > > +       mbps_rx = diff_ns > 0 ?
> > > +               (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> > > +       mbps_tx = diff_ns > 0 ?
> > > +               (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
> > >
> > >         printf("\n  Throughput (since last show)\n");
> > >         printf("  Rx-pps: %12"PRIu64"          Rx-bps: %12"PRIu64"\n  Tx-pps: %12"
> > > --
> > > 2.17.1
> > >
