[PATCH v4 1/4] eal: add generic support for reading PMU events

Morten Brørup mb at smartsharesystems.com
Thu Dec 15 09:22:10 CET 2022


> From: Morten Brørup [mailto:mb at smartsharesystems.com]
> Sent: Wednesday, 14 December 2022 11.41
> 
> +CC: Mattias, see my comment below about per-thread constructor for
> this
> 
> > From: Tomasz Duszynski [mailto:tduszynski at marvell.com]
> > Sent: Wednesday, 14 December 2022 10.39
> >
> > Hello Morten,
> >
> > Thanks for review. Answers inline.
> >
> > [...]
> >
> > > > +__rte_experimental
> > > > +static __rte_always_inline uint64_t
> > > > +rte_pmu_read(int index)
> 
> The index type can be changed from int to uint32_t. This also
> eliminates the "(index < 0" part of the comparison further below in
> this function.
> 
> > > > +{
> > > > +	int lcore_id = rte_lcore_id();
> > > > +	struct rte_pmu_event_group *group;
> > > > +	int ret;
> > > > +
> > > > +	if (!rte_pmu)
> > > > +		return 0;
> > > > +
> > > > +	group = &rte_pmu->group[lcore_id];
> > > > +	if (!group->enabled) {
> 
> Optimized: if (unlikely(!group->enabled)) {
> 
> > > > +		ret = rte_pmu_enable_group(lcore_id);
> > > > +		if (ret)
> > > > +			return 0;
> > > > +
> > > > +		group->enabled = true;
> > > > +	}
> > >
> > > Why is the group not enabled in the setup function,
> > rte_pmu_add_event(), instead of here, in the
> > > hot path?
> > >
> >
> > When this is executed for the very first time then cpu will have
> > obviously more work to do
> > but afterwards setup path is not taken hence much less cpu cycles are
> > required.
> >
> > Setup is executed by main lcore solely, before lcores are executed
> > hence some info passed to
> > SYS_perf_event_open ioctl() is missing, pid (via rte_gettid()) being
> an
> > example here.
> 
> OK. Thank you for the explanation. Since impossible at setup, it has to
> be done at runtime.
> 
> @Mattias: Another good example of something that would belong in per-
> thread constructors, as my suggested feature creep in [1].
> 
> [1]:
> http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87553@smarts
> erver.smartshare.dk/

I just realized that this initialization is per-lcore (not per thread), so you can use rte_lcore_callback_register() to register a per-lcore initialization function, and move rte_pmu_enable_group(lcore_id) there.

-Morten



More information about the dev mailing list