[dpdk-dev] [PATCH v2 00/16] update and simplify telemetry library.

Thomas Monjalon thomas at monjalon.net
Fri Apr 10 16:51:03 CEST 2020


10/04/2020 16:39, Wiles, Keith:
> > On Apr 9, 2020, at 4:37 AM, Thomas Monjalon <thomas at monjalon.net> wrote:
> > 09/04/2020 11:19, Bruce Richardson:
> >> On Wed, Apr 08, 2020 at 08:03:26PM +0200, Thomas Monjalon wrote:
> >>> 08/04/2020 18:49, Ciara Power:
> >>>> This patchset extensively reworks the telemetry library, adding new
> >>>> functionality and simplifying much of the existing code, while
> >>>> maintaining backward compatibility.
> >>>> 
> >>>> This work is based on the previously sent RFC for a "process info"
> >>>> library: https://patchwork.dpdk.org/project/dpdk/list/?series=7741
> >>>> However, rather than creating a new library, this patchset takes
> >>>> that work and merges it into the existing telemetry library, as
> >>>> mentioned above.
> >>>> 
> >>>> The telemetry library as shipped in 19.11 is based upon the metrics
> >>>> library and outputs all statistics based on that as a source. However,
> >>>> this limits the telemetry output to only port-level statistics
> >>>> information, rather than allowing it to be used as a general scheme for
> >>>> telemetry information across all DPDK libraries.
> >>>> 
> >>>> With this patchset applied, rather than the telemetry library being
> >>>> responsible for pulling ethdev stats and pushing them into the metrics
> >>>> library for retrieval later, each library, e.g. ethdev, rawdev, and even
> >>>> the metrics library itself (for backwards compatibility), now handles its
> >>>> own stats.  Any library or app can register a callback function with
> >>>> telemetry, which will be called if requested by the client connected via
> >>>> the telemetry socket. The callback function in the library/app then
> >>>> formats its stats, or other data, into a JSON string, and returns it to
> >>>> telemetry to be sent to the client.
> >>> 
> >>> I think this is a global need in DPDK, and it is usually called RPC,
> >>> IPC or control messaging.
> >>> We had a similar need for multi-process communication, thus rte_mp IPC.
> >>> We also need a control channel for user configuration applications.
> >>> We also need to control some features like logging or tracing.
> >>> 
> >>> In my opinion, it is time to introduce a general control channel in DPDK.
> >>> The application must be in the loop of the control mechanism.
> >>> Making such a channel standard will ease application adoption.
> >>> 
> >>> Please read some comments here:
> >>> http://inbox.dpdk.org/dev/2580933.jp2sp48Hzj@xps/
> >>> 
> >> Hi Thomas,
> >> 
> >> I agree that having a single control mechanism or messaging mechanism in
> >> DPDK would be nice to have. However, I don't believe the plans for such a
> >> scheme should impact this patchset right now as the idea of a common
> >> channel was only first mooted about a week ago, and while there has been
> >> some email discussion about it, there is as yet no requirements list that
> >> I've seen, nobody actually doing coding work on it, no RFC and most
> >> importantly no timeline for creating and merging such into DPDK.
> > 
> > Yes, this is a new idea.
> > Throwing the idea in this "telemetry" thread and in "IF proxy" thread
> > is the first step before starting a dedicated thread to design
> > a generic mechanism.
> > 
> >> At present though, DPDK has a telemetry solution that works for the use case
> >> of ethdev stats and some power management info, but requires a more general
> >> solution to allow monitoring tools like PMDT to introspect DPDK, and also
> >> to provide statistics for other parts of DPDK such as cryptodev, eventdev,
> >> and other libraries, plus the application itself if the app so desires.
> > 
> > Doing rework on telemetry is similar to a general control mechanism.
> > Can we take this opportunity to work on what we believe to be a bigger
> > idea? It should be done anyway, so why push this temporary solution?
> > Sometimes we need a quick answer to an urgent problem.
> > But I don't think telemetry is currently in such a situation that
> > a rework in 20.05 is mandatory.
> 
> Updating telemetry to be more consumable and standardizing on a single method to get stats/info out of DPDK is a clean and simple solution. Starting over and creating yet another solution means we are pushing this support out again, and many customers are asking for this support now.
> 
> The current telemetry solution in this patch gives us a great starting point; going back to the drawing board is a waste of time IMO, and we need something now. To me this is an urgent problem we need to solve now, as I want to push PMDT, and if we keep pushing out this type of support then it will never be upstreamed.
> 
> In PMDT I believe I have resolved all of the tech board's concerns and am just waiting for this patch and a patch to PCM to push the code back to DPDK again.
> 
> So please let's not redesign this again.

I understand your concern.

I think we need to go to the drawing board,
and consider at least these 5 use cases:
	1/ multi-process IPC
	2/ telemetry
	3/ IF proxy
	4/ external user configuration
	5/ log/trace start/stop

Merging telemetry means we'll rework 1 and 2 later.
I am OK with merging telemetry in 20.05 if we can be sure
that there will be help, and no resistance, when reworking it
with a more general communication channel if required later.

We need a kind of community vote here. Please give +1 / -1.
Giving +1 means you will help when needed.
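
For anyone skimming the thread, the callback model described in the quoted
cover letter can be sketched roughly as below. This is a standalone
illustration, not the patchset's code: tel_register(), tel_handle(), the
tel_cb type, the "/demo/stats" command name and the demo counter are all
hypothetical names made up for the example, and a plain function call
stands in for the telemetry socket.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TEL_MAX_CMDS 16

/* Hypothetical callback type: format a JSON reply for `cmd` into `buf`.
 * Returns the number of bytes written, or -1 on error. */
typedef int (*tel_cb)(const char *cmd, char *buf, size_t len);

struct tel_entry {
	const char *cmd;
	tel_cb fn;
};

static struct tel_entry tel_table[TEL_MAX_CMDS];
static int tel_count;

/* A library or app registers a handler for one command string.
 * Returns 0 on success, -1 if the table is full or the command exists. */
int tel_register(const char *cmd, tel_cb fn)
{
	assert(cmd != NULL && fn != NULL);
	if (tel_count >= TEL_MAX_CMDS)
		return -1;
	for (int i = 0; i < tel_count; i++)
		if (strcmp(tel_table[i].cmd, cmd) == 0)
			return -1;
	tel_table[tel_count].cmd = cmd;
	tel_table[tel_count].fn = fn;
	tel_count++;
	return 0;
}

/* Called when a client requests `cmd`: look up the owner and let it
 * produce the reply. Returns -1 for an unknown command. */
int tel_handle(const char *cmd, char *buf, size_t len)
{
	for (int i = 0; i < tel_count; i++)
		if (strcmp(tel_table[i].cmd, cmd) == 0)
			return tel_table[i].fn(cmd, buf, len);
	return -1;
}

/* Example owner: a library that formats its own counter as JSON. */
static unsigned long demo_rx_packets = 42;

int demo_stats_cb(const char *cmd, char *buf, size_t len)
{
	int n = snprintf(buf, len, "{\"%s\": {\"rx_packets\": %lu}}",
			cmd, demo_rx_packets);
	/* Treat a truncated reply as an error rather than sending bad JSON. */
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

In this sketch a library would call tel_register("/demo/stats", demo_stats_cb)
once at init, and each client request becomes
tel_handle("/demo/stats", buf, sizeof(buf)). The dispatcher never needs to
know what the stats mean, which is the property that lets each library own
its output.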



