[dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter

Jerin Jacob jerin.jacob at caviumnetworks.com
Fri Aug 19 13:46:12 CEST 2016


On Fri, Aug 19, 2016 at 09:43:36AM +0000, Nipun Gupta wrote:
> Hi Jerin,
> 

Hi Nipun,

> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > Sent: Thursday, August 18, 2016 17:22
> > To: dev at dpdk.org
> > Cc: thomas.monjalon at 6wind.com; jianbo.liu at linaro.org;
> > viktorin at rehivetech.com; Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > Subject: [dpdk-dev] [PATCH] eal/armv8: high-resolution cycle counter
> > 
> > Existing cntvct_el0 based rte_rdtsc() provides portable
> > means to get wall clock counter at user space. Typically
> > it runs at <= 100MHz.
> > 
> > The alternative method to enable rte_rdtsc() for high resolution
> > wall clock counter is through armv8 PMU subsystem.
> > The PMU cycle counter runs at CPU frequency. However,
> > access to the PMU cycle counter from user space is not enabled
> > by default in the arm64 linux kernel.
> > It is possible to enable user space access to the cycle counter
> > by configuring the PMU from the privileged mode (kernel space).
> > 
> > By default, the rte_rdtsc() implementation uses the portable
> > cntvct_el0 scheme. An application can choose the PMU based
> > implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.
> > 
> > Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > ---
> > 
> > The PMU based scheme is useful for high accuracy performance profiling.
> > Find below the example steps to configure the PMU based cycle counter on an
> > armv8 machine.
> > 
> > # git clone https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0
> > # cd armv8_pmu_cycle_counter_el0
> > # make
> > # sudo insmod pmu_el0_cycle_counter.ko
> > # cd $DPDK_DIR
> > # make config T=arm64-armv8a-linuxapp-gcc
> > # echo "CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU=y" >> build/.config
> > # make -j 4
> 
> Can we make this kernel module also a part of DPDK? Maybe in linuxapp, so that it is also compiled with DPDK?

I thought so at first. Later I realized it may not be a good idea to add
yet another out-of-tree module to the DPDK repo, since DPDK is trying to
get rid of its existing out-of-tree modules.

> 
> > 
> > ---
> >  .../common/include/arch/arm/rte_cycles_64.h        | 33
> > ++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> > 
> > diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > index 14f2612..867a946 100644
> > --- a/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > +++ b/lib/librte_eal/common/include/arch/arm/rte_cycles_64.h
> > @@ -45,6 +45,11 @@ extern "C" {
> >   * @return
> >   *   The time base for this lcore.
> >   */
> > +#ifndef RTE_ARM_EAL_RDTSC_USE_PMU
> > +/**
> > + * This call is portable to any ARMv8 architecture, however, typically
> > + * cntvct_el0 runs at <= 100MHz and it may be imprecise for some tasks.
> > + */
> >  static inline uint64_t
> >  rte_rdtsc(void)
> >  {
> > @@ -53,6 +58,34 @@ rte_rdtsc(void)
> >  	asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
> >  	return tsc;
> >  }
> > +#else
> > +/**
> > + * This is an alternative method to enable rte_rdtsc() with the high
> > + * resolution PMU cycle counter. The cycle counter runs at CPU frequency and
> > + * this scheme uses the ARMv8 PMU subsystem to read it at userspace. However,
> > + * access to the PMU cycle counter from user space is not enabled by default
> > + * in the arm64 linux kernel.
> > + * It is possible to enable user space access to the cycle counter by
> > + * configuring the PMU from the privileged mode (kernel space).
> > + *
> > + * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
> > + * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
> > + * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
> > + * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
> > + * val |= (BIT(0) | BIT(2));
> > + * isb();
> > + * asm volatile("msr pmcr_el0, %0" : : "r" (val));
> 
> In your git repo I see that on cleanup the cycle counter is not disabled (PMCNTENCLR_EL0). It would be better to disable the cycle counter too at module exit.

OK
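
A minimal sketch of what that cleanup could look like, next to the enable
values quoted in the patch comment above. This is not the code from the git
repo: the PMU_BIT() macro, the function names and the PMU_SKETCH_EL1 guard are
hypothetical names made up for this illustration, and a real kernel module
would use kernel types and helpers instead of <stdint.h>. The bit positions
follow the ARMv8 PMU register layout (bit 31 of PMCNTENSET_EL0/PMCNTENCLR_EL0
is the cycle counter).

```c
#include <stdint.h>

#define PMU_BIT(n)      (UINT64_C(1) << (n))

/* Bit positions, matching the values used in the patch comment. */
#define PMCNTEN_C       PMU_BIT(31) /* PMCNTENSET/PMCNTENCLR_EL0: cycle counter */
#define PMUSERENR_EN    PMU_BIT(0)  /* PMUSERENR_EL0: EL0 access enable */
#define PMUSERENR_CR    PMU_BIT(2)  /* PMUSERENR_EL0: EL0 cycle counter read */
#define PMCR_E          PMU_BIT(0)  /* PMCR_EL0: enable counters */
#define PMCR_C          PMU_BIT(2)  /* PMCR_EL0: reset cycle counter */

/* PMCR_EL0 value for module init: same as "val |= (BIT(0) | BIT(2))". */
static inline uint64_t
pmu_pmcr_init_value(uint64_t pmcr)
{
	return pmcr | PMCR_E | PMCR_C;
}

#if defined(__aarch64__) && defined(PMU_SKETCH_EL1)
/*
 * Hypothetical module-exit helper. It only makes sense in privileged (EL1)
 * code and would have to run on every CPU that was configured at init.
 */
static inline void
pmu_el0_cycle_counter_disable(void)
{
	/* Stop the cycle counter: write-one-to-clear on bit 31. */
	asm volatile("msr pmcntenclr_el0, %0" : : "r" (PMCNTEN_C));
	/* Revoke user space access to the PMU registers. */
	asm volatile("msr pmuserenr_el0, %0" : : "r" (UINT64_C(0)));
}
#endif
```

The privileged part is compiled out unless PMU_SKETCH_EL1 is defined, so the
mask definitions can be checked on any host.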

> 
> > + *
> > + */
> > +static inline uint64_t
> > +rte_rdtsc(void)
> > +{
> > +	uint64_t tsc;
> > +
> > +	asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> > +	return tsc;
> > +}
> > +#endif
> > 
> >  static inline uint64_t
> >  rte_rdtsc_precise(void)
> > --
> > 2.5.5
> 
> Do you also plan to support performance monitor event counters?

No. This patch was inspired by the armv7 PMU scheme, which is already part
of DPDK. The sole reason to add this support is to catch any performance
regression through the app/test application. Other than that, I think the
existing cntvct_el0 based scheme is good enough for all the use cases.
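
To put the resolution difference in numbers, here is a small sketch that
converts counter ticks to nanoseconds. The helper name ticks_to_ns() is
hypothetical and not part of DPDK, and the 2 GHz figure in the comment is only
an assumed example CPU frequency. The 100 MHz figure is the upper bound
mentioned in the patch for cntvct_el0.

```c
#include <stdint.h>

#define NS_PER_SEC UINT64_C(1000000000)

/*
 * Nanoseconds (rounded down) covered by `ticks` of a counter running at
 * `hz` ticks per second. Returns 0 if `hz` is 0. The division is split
 * into whole seconds and a remainder so that large tick counts do not
 * overflow; this holds for frequencies below roughly 18 GHz.
 */
static inline uint64_t
ticks_to_ns(uint64_t ticks, uint64_t hz)
{
	if (hz == 0)
		return 0;

	uint64_t secs = ticks / hz;
	uint64_t rem = ticks % hz;

	return secs * NS_PER_SEC + (rem * NS_PER_SEC) / hz;
}

/*
 * Example: one tick of a 100 MHz counter covers 10 ns, while a cycle
 * counter at an assumed 2 GHz needs two ticks to cover 1 ns.
 */
```

So a code path of a few tens of nanoseconds spans only a handful of
cntvct_el0 ticks, which is why the PMU counter helps when looking for small
regressions in app/test.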

> 
> Regards,
> Nipun
> 

