[PATCH v1 1/2] eal: add lcore busyness telemetry
Honnappa Nagarahalli
Honnappa.Nagarahalli at arm.com
Sun Jul 17 05:10:02 CEST 2022
<snip>
> Subject: RE: [PATCH v1 1/2] eal: add lcore busyness telemetry
>
> > From: Anatoly Burakov [mailto:anatoly.burakov at intel.com]
> > Sent: Friday, 15 July 2022 15.13
> >
> > Currently, there is no way to measure lcore busyness in a passive way,
> > without any modifications to the application. This patch adds a new
> > EAL API that will be able to passively track core busyness.
> >
> > The busyness is calculated by relying on the fact that most DPDK APIs
> > will poll for packets.
>
> This is an "alternative fact"! Only run-to-completion applications poll for RX.
> Pipelined applications do not poll for packets in every pipeline stage.
I guess you meant polling for packets from the NIC. Pipelined applications still need to receive packets from queues, so we could do a similar thing for the rte_ring APIs.
>
> > Empty polls can be counted as "idle", while non-empty polls can be
> > counted as busy. To measure lcore busyness, we simply call the
> > telemetry timestamping function with the number of polls a particular
> > code section has processed, and count the number of cycles we've spent
> > processing empty bursts. The more empty bursts we encounter, the fewer
> > cycles we spend in "busy" state, and the lower the reported core
> > busyness will be.
> >
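For readers following the thread, the accounting described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the struct, the function names, and the choice to charge each interval to the result of the previous poll are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-lcore state; not the structure used by the patch. */
struct busy_stats {
	uint64_t last_ts;     /* cycle count at the previous poll */
	uint64_t busy_cycles; /* cycles that followed a non-empty poll */
	uint64_t idle_cycles; /* cycles that followed an empty poll */
	unsigned int last_nb; /* items returned by the previous poll */
	int started;          /* set once the first poll has been seen */
};

/*
 * Record one poll that returned nb_items at cycle count now.
 * Assumption: the time between two polls is charged to the result of
 * the earlier poll, since that is the burst being processed.
 */
static void
busy_stats_poll(struct busy_stats *s, uint64_t now, unsigned int nb_items)
{
	if (s->started) {
		uint64_t delta = now - s->last_ts;

		if (s->last_nb > 0)
			s->busy_cycles += delta;
		else
			s->idle_cycles += delta;
	}
	s->started = 1;
	s->last_ts = now;
	s->last_nb = nb_items;
}

/* Busyness as an integer percentage of all accounted cycles. */
static unsigned int
busy_stats_percent(const struct busy_stats *s)
{
	uint64_t total = s->busy_cycles + s->idle_cycles;

	if (total == 0)
		return 0;
	return (unsigned int)(s->busy_cycles * 100 / total);
}
```

With one non-empty poll followed by three empty ones at equal intervals, this sketch reports 25% busy. It also shows the limitation raised in this thread: an lcore that never calls an instrumented poll function accumulates no cycles at all.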
> > In order for all of the above to work without modifications to the
> > application, the library code needs to be instrumented with calls to
> > the lcore telemetry busyness timestamping function. The following
> > parts of DPDK are instrumented with lcore telemetry calls:
> >
> > - All major driver APIs:
> >   - ethdev
> >   - cryptodev
> >   - compressdev
> >   - regexdev
> >   - bbdev
> >   - rawdev
> >   - eventdev
> >   - dmadev
> > - Some additional libraries:
> >   - ring
> >   - distributor
> >
> > To avoid performance impact from having lcore telemetry support, a
> > global variable is exported by EAL, and a call to timestamping
> > function is wrapped into a macro, so that whenever telemetry is
> > disabled, it only takes one additional branch and no function calls
> > are performed. It is also possible to disable it at compile time by
> > commenting out RTE_LCORE_BUSYNESS from build config.
>
> Since all of this can be completely disabled at build time, and thus has exactly
> zero performance impact, I will not object to this patch.
>
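The two levels of gating discussed above (build-time removal and a runtime flag costing one branch) could look roughly like the sketch below. All identifiers here are hypothetical stand-ins chosen for illustration; they are not the names exported by EAL or used in the patch.

```c
#include <stdint.h>

/* Stand-in for the build config option; comment out to compile the hook away. */
#define LCORE_BUSYNESS_SKETCH

/* Stand-in for the global flag that the patch says EAL exports. */
static int telemetry_enabled;

/* Counts calls so the effect of the gating can be observed. */
static unsigned int timestamp_calls;

/* Stand-in for the timestamping function; a real one would record cycles. */
static void
telemetry_timestamp(unsigned int nb_items)
{
	(void)nb_items;
	timestamp_calls++;
}

#ifdef LCORE_BUSYNESS_SKETCH
/* Enabled build: one branch on the flag, function call only when set. */
#define TELEMETRY_TIMESTAMP(nb) \
	do { \
		if (telemetry_enabled) \
			telemetry_timestamp(nb); \
	} while (0)
#else
/* Disabled build: the hook expands to nothing. */
#define TELEMETRY_TIMESTAMP(nb) do { } while (0)
#endif

/* Hypothetical instrumented poll function returning nb_available items. */
static unsigned int
poll_queue(unsigned int nb_available)
{
	unsigned int nb = nb_available;

	TELEMETRY_TIMESTAMP(nb);
	return nb;
}
```

In this sketch the disabled build leaves no trace in `poll_queue()`, which matches the zero-impact argument made above, while the enabled build pays for the flag check on every poll even when telemetry is switched off at runtime.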
> >
> > This patch also adds a telemetry endpoint to report lcore busyness, as
> > well as telemetry endpoints to enable/disable lcore telemetry.
> >
> > Signed-off-by: Kevin Laatz <kevin.laatz at intel.com>
> > Signed-off-by: Conor Walsh <conor.walsh at intel.com>
> > Signed-off-by: David Hunt <david.hunt at intel.com>
> > Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> > ---
> >
> > Notes:
> > We did a couple of quick smoke tests to see if this patch causes any
> > performance degradation, and it seemed to have none that we could
> > measure. Telemetry can be disabled at compile time via a config
> > option, while at runtime it can be disabled, seemingly at a cost of
> > one additional branch.
> >
> > That said, our benchmarking efforts were admittedly not very rigorous,
> > so comments welcome!
>
> This patch does not reflect lcore busyness, it reflects some sort of ingress
> activity level.
>
> All the considerations regarding non-intrusiveness and low overhead are
> good, but everything in this patch needs to be renamed to reflect what it truly
> does, so it is clear that pipelined applications cannot use this telemetry for
> measuring lcore busyness (except on the ingress pipeline stage).
>
> It's a shame that so much effort clearly has gone into this patch, and no one
> stopped to consider pipelined applications. :-(