[dpdk-dev] [dpdk-techboard] DPDK ABI/API Stability

Burakov, Anatoly anatoly.burakov at intel.com
Mon Apr 8 17:49:43 CEST 2019


On 08-Apr-19 3:38 PM, David Marchand wrote:
> 
> 
> On Mon, Apr 8, 2019 at 4:03 PM Burakov, Anatoly
> <anatoly.burakov at intel.com> wrote:
> 
>     On 08-Apr-19 2:58 PM, David Marchand wrote:
>      > On Mon, Apr 8, 2019 at 3:39 PM Burakov, Anatoly
>      > <anatoly.burakov at intel.com> wrote:
>      >
>      >     As a concrete proposal, my number one dream would be to see
>      >     multiprocess gone. I also recall a desire for "DPDK to be more
>      >     lightweight", and I maintain that DPDK *cannot* be lightweight
>      >     if we are to support multiprocess - we can have one or the
>      >     other, but not both. However, realistically, I don't think
>      >     dropping multiprocess is ever going to happen - not only is it
>      >     too entrenched in DPDK use cases, it is actually quite useful
>      >     despite its flaws.
>      >
>      >
>      > Well, honestly, I'd like to hear about this.
>      > What are the real usecases for multi process support?
>      > Do we have even a single opensource project that uses it?
>      >
> 
>     I'm aware of a few closed source usages of multiprocess. I also think
>     current versions of collectd rely on a secondary process (there's been
>     a Telemetry API added to avoid that, but AFAIK the support for
>     Telemetry is not upstream in collectd yet), and so do/would any
>     dump-style applications - in fact, we ourselves include such
>     applications in our codebase (pdump, proc-info, etc.).
> 
> 
> Sorry, I don't want to hijack this thread, I can start a separate
> thread if people feel like it.
> If we go with stabilisation, we must be careful about which features
> we commit to supporting.
> 
> So about multiprocess, again, in those closed source projects you know 
> of, what are the usecases?
> 
> As for what we provide in dpdk (pdump, proc-info), referring to our own
> tools is not that convincing to me, as I don't use those tools.
> 
> I don't see why we could not achieve the same with a control thread
> running in the dpdk process and handling commands.
> It would be open to the outside via a more standard channel, like a UNIX 
> socket or something like this.
> If we need to declare a dynamic channel, it can be constructed as an 
> extension of the existing standard channel: we can open something like a 
> POSIX shm and push things in it.
> Was this explored?

There are certainly things that we can do that can make some aspects of 
multiprocess redundant. For example, for any kind of collectd-like 
scenario, the Telemetry API (or Keith's DFS, or...) could conceivably 
provide a better and more maintainable way of doing things.
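
To make the control-channel idea quoted above a bit more concrete, here 
is a rough sketch of one request/reply exchange over an AF_UNIX socket. 
This is plain POSIX, not the Telemetry API or any existing DPDK 
interface: struct app_stats, handle_command() and serve_one() are 
hypothetical names made up for illustration, and a real control thread 
would call serve_one() in a loop on an accepted connection.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical counters that the data path would update. */
struct app_stats {
    unsigned long rx_pkts;
    unsigned long tx_pkts;
};

/* Turn one text command into one text reply; returns snprintf()'s result. */
static int handle_command(const struct app_stats *st, const char *cmd,
                          char *reply, size_t len)
{
    assert(st != NULL && cmd != NULL && reply != NULL);
    if (strcmp(cmd, "stats") == 0)
        return snprintf(reply, len, "rx=%lu tx=%lu",
                        st->rx_pkts, st->tx_pkts);
    return snprintf(reply, len, "error: unknown command");
}

/*
 * Serve a single request on a connected AF_UNIX socket.
 * Returns 0 on success, -1 on EOF, I/O error or oversized reply.
 */
static int serve_one(int fd, const struct app_stats *st)
{
    char cmd[64], reply[128];
    ssize_t n = recv(fd, cmd, sizeof(cmd) - 1, 0);

    if (n <= 0)
        return -1;
    cmd[n] = '\0';

    int len = handle_command(st, cmd, reply, sizeof(reply));
    if (len < 0 || (size_t)len >= sizeof(reply))
        return -1;

    return send(fd, reply, (size_t)len, 0) == (ssize_t)len ? 0 : -1;
}
```

The command handling is kept separate from the socket I/O so that the 
same handler could sit behind a different transport.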

Our multiprocess also makes it easier to write pipeline/load-balancing 
type applications. For an example, look at our 
multiprocess/client-server sample. It demonstrates how, instead of 
writing one big monolithic application, one could write a number of 
smaller applications, each doing its own thing. It is of course 
possible to do the same without multiprocess, as evidenced by our sample 
applications such as load-balancer, distributor, ip-pipeline etc., but 
it is arguably easier to implement *real* applications as multiple 
processes, due to separation of concerns and a more focused codebase.

However, there are two use cases that I can think of that are either 
hard or outright impossible without our multiprocess APIs. The first 
one is dumping functionality. For example, dpdk_proc_info can display 
info from a currently-running or defunct process - list its 
memzones/mempools/etc. - basically, everything there is to know about 
the shared memory can be known that way. While this isn't a "real" use 
case, it is useful for debugging.
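
As a rough illustration of why this works even for a defunct process, 
here is a plain POSIX sketch in which one side records named regions in 
a file-backed shared mapping and another side maps the same file 
read-only and lists them. It is only an analogue, not how the EAL lays 
out its shared memory: struct zone_table and the table_*() and 
zone_add() helpers are hypothetical. In DPDK itself, the inspecting 
process would attach through the EAL as a secondary 
(--proc-type=secondary) rather than open files by hand.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_ZONES 8

/* Hypothetical metadata kept in shared memory by the "primary". */
struct zone {
    char name[32];
    size_t len;
};

struct zone_table {
    unsigned int count;
    struct zone zones[MAX_ZONES];
};

/* "Primary" side: create the file-backed table (zero-filled by ftruncate). */
static struct zone_table *table_create(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)sizeof(struct zone_table)) < 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(struct zone_table),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : p;
}

/* Record one named region; returns -1 when the table is full. */
static int zone_add(struct zone_table *t, const char *name, size_t len)
{
    assert(t != NULL && name != NULL);
    if (t->count >= MAX_ZONES)
        return -1;
    struct zone *z = &t->zones[t->count];
    snprintf(z->name, sizeof(z->name), "%s", name);
    z->len = len;
    t->count++;
    return 0;
}

/* "Inspector" side: read-only view; works even if the creator has exited. */
static const struct zone_table *table_attach(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, sizeof(struct zone_table),
                   PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}

/* List every recorded region, one per line. */
static void table_print(const struct zone_table *t, FILE *out)
{
    for (unsigned int i = 0; i < t->count; i++)
        fprintf(out, "%-31s %zu bytes\n", t->zones[i].name, t->zones[i].len);
}
```

The point is that the metadata lives outside any one process, so it 
outlives the process that wrote it.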

More importantly, our multiprocess model provides resilience. In the 
event of a crash, the entire application is not brought down - instead, 
only the crashed process goes down. It's not /perfect/ resilience, of 
course, and there are caveats (memory leaking, locks, etc.), but you do 
get /some/ resilience that way - your process went down, you spin up 
another secondary and you're back up and running again.
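
The "spin up another secondary" part can be sketched with nothing more 
than fork() and waitpid(). This is a minimal illustration under my own 
assumptions, not code from any DPDK application: supervise(), worker_fn 
and flaky_worker() are hypothetical names, and a real supervisor would 
exec a secondary process instead of calling a function in the child. It 
also does nothing about the caveats mentioned above (leaked memory, 
locks held by the dead process).

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker entry point; its return value is the exit status. */
typedef int (*worker_fn)(int attempt);

/*
 * Run `worker` in a child process and respawn it whenever it dies
 * abnormally (killed by a signal or non-zero exit status).
 * Returns the number of restarts that were needed, or -1 if fork/waitpid
 * fails or the worker still fails after `max_restarts` restarts.
 */
static int supervise(worker_fn worker, int max_restarts)
{
    assert(worker != NULL && max_restarts >= 0);

    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker(attempt)); /* child never returns to the caller */

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;         /* clean shutdown: stop supervising */
        /* otherwise the worker crashed or failed: loop and respawn it */
    }
    return -1;
}

/* Demo worker: simulates a crash on the first two attempts. */
static int flaky_worker(int attempt)
{
    if (attempt < 2)
        raise(SIGKILL);
    return 0;
}
```

Only the crashed child goes away here; the parent, and anything else 
sharing state with it, keeps running.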

The scenario described above is how most people (that I know of) appear 
to be using multiprocess - some kind of "crash-resilient" 
load-balancing/pipelining app.

> 
> 
> -- 
> David Marchand


-- 
Thanks,
Anatoly

