[dpdk-dev] [PATCH v2 1/5] app/testpmd: clock gettime call in throughput calculation

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Mon Jul 6 18:53:12 CEST 2020


Hi Ferruh,
	Thanks for your comments.

<snip>

> Subject: Re: [PATCH v2 1/5] app/testpmd: clock gettime call in throughput
> calculation
> 
> On 6/26/2020 11:09 PM, Honnappa Nagarahalli wrote:
> > The throughput calculation requires a counter that measures passing of
> > time. However, the kernel saves and restores the PMU state when a
> > thread is unscheduled and scheduled. This ensures that the PMU cycles
> > are not counted towards a thread that is not scheduled. Hence, when
> > RTE_ARM_EAL_RDTSC_USE_PMU is enabled, the PMU cycles do not
> represent
> > the passing of time.
> 
> Does this mean 'rte_rdtsc()' is broken for Arm?
It depends on the kernel I think. IMO, for isolated CPUs it should be fine. It is currently getting fixed through a kernel patch. Once the kernel patch is up streamed, we will make the required changes in DPDK.

> Wouldn't it cause more serious issue than testpmd throughput stats?
Within DPDK, it does not seem to be used for anything critical (I have fixed one).

> 
> > This results in incorrect calculation of throughput numbers.
> > Use clock_gettime system call to calculate the time passed since last
> > call.
> >
> > Bugzilla ID: 450
> > Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> > Cc: zhihong.wang at intel.com
> > Cc: stable at dpdk.org
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > Reviewed-by: Phil Yang <phil.yang at arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> > Tested-by: Phil Yang <phil.yang at arm.com>
> > Tested-by: Ali Alnubani <alialnu at mellanox.com>
> 
> <...>
> 
> > @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
> >  		}
> >  	}
> >
> > -	diff_cycles = prev_cycles[port_id];
> > -	prev_cycles[port_id] = rte_rdtsc();
> > -	if (diff_cycles > 0)
> > -		diff_cycles = prev_cycles[port_id] - diff_cycles;
> > +	diff_ns = 0;
> > +	if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
> 
> I guess 'rte_rdtsc()' is faster, but since this is not in the data path, I think it is
> OK.
> 
> <...>
> 
> > @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
> >  		(stats.obytes - prev_bytes_tx[port_id]) : 0;
> >  	prev_bytes_rx[port_id] = stats.ibytes;
> >  	prev_bytes_tx[port_id] = stats.obytes;
> > -	mbps_rx = diff_cycles > 0 ?
> > -		diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> > -	mbps_tx = diff_cycles > 0 ?
> > -		diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> > +	mbps_rx = diff_ns > 0 ?
> > +		(double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> > +	mbps_tx = diff_ns > 0 ?
> > +		(double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
> 
> The calculation also fixes other issue in the stats.
> With previous method, if the sampling between two stats is a little long,
> "diff_pkts_rx * rte_get_tsc_hz()" can overflow and produce wrong 'bps'.
> 
> 
> Reviewed-by: Ferruh Yigit <ferruh.yigit at intel.com>


More information about the dev mailing list