[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features

Neil Horman nhorman at tuxdriver.com
Fri Aug 1 22:43:52 CEST 2014

On Fri, Aug 01, 2014 at 12:22:22PM -0700, Bruce Richardson wrote:
> On Fri, Aug 01, 2014 at 11:06:29AM -0400, Neil Horman wrote:
> > On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote:
> > > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote:
> > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote:
> > > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote:
> > > > > > > > 2014-07-31 09:13, Neil Horman:
> > > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote:
> > > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > > > > > > > > > > Hey all-
> > > > > 
> > > > > With regards to the general approach for runtime detection of software
> > > > > functions, I wonder if something like this can be handled by the
> > > > > packaging system? Is it possible to ship out a set of shared libs
> > > > > compiled up for different instruction sets, and then at rpm install
> > > > > time, symlink the appropriate library? This would push the whole issue
> > > > > of detection of code paths outside of code, work across all our
> > > > > libraries and ensure each user got the best performance they could get
> > > > > from a binary?
> > > > > Has something like this been done before? The building of all the
> > > > > libraries could be scripted easy enough, just do multiple builds using
> > > > > different EXTRA_CFLAGS each time, and move and rename the .so's after
> > > > > each run.
> > > > > 
> > > > 
> > > > Sorry, I missed this in my last reply.
> > > > 
> > > > In answer to your question, the short version is that such a thing is roughly
> > > > possible from a packaging standpoint, but completely unworkable from a
> > > > distribution standpoint.  We could certainly build the dpdk multiple times and
> > > > rename all the shared objects to some variant name representative of the
> > > > optimizations we build in for certain cpu flags, but then we would be shipping X
> > > > versions of the dpdk, and any application (say OVS) that made use of the dpdk
> > > > would need to provide a version linked against each variant to be useful when
> > > > making a product, and each end user would need to manually select (or run a
> > > > script to select) which variant is most optimized for the system at hand.  It's
> > > > just not a reasonable way to package a library.
> > > 
> > > Sorry, perhaps I was not clear, having the user have to select the
> > > appropriate library was not what I was suggesting. Instead, I was
> > > suggesting that the rpm install "librte_pmd_ixgbe.so.generic",
> > > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx". Then the rpm
> > > post-install script would look at the cpuflags in cpuinfo and then
> > > symlink librte_pmd_ixgbe.so to the best-match version. That way the user
> > > only has to link against "librte_pmd_ixgbe.so" and depending on the
> > > system it's run on, the loader will automatically resolve the symbols
> > > from the appropriate instruction-set specific .so file.
> > > 
> > 
> > This is an absolute packaging nightmare, it will potentially break all sorts of
> > corner cases, and support processes.  To cite a few examples:
> > 
> > 1) Upgrade support - What if the minimum cpu requirements for dpdk are advanced
> > at some point in the future?  The above strategy has no way to know that a given
> > update has more advanced requirements than a previous update, and when the
> > update is installed, the previously linked library for the old base will
> > disappear, leaving broken applications behind.
> Firstly, I didn't know we could actually specify minimum cpu
> requirements for packaging, that is something that could be useful :-)
You misread my comment :).  I didn't say we could specify minimum cpu
requirements at packaging (you can't, beyond general arch), I said "what if the
dpdk's cpu requirements were raised?".  Completely different thing.  Currently
the default, lowest common denominator system that dpdk appears to build for is
core2 (as listed in the old default config).  What if at some point you raise
those requirements and decide that SSE4.2 really is required to achieve maximum
performance. Using the above strategy any system that doesn't meet the new
requirements will silently break on such an update.  That's not acceptable.

> Secondly, what is the normal case for handling something like this,
> where an upgrade has enhanced requirements compared to the previous
> version? Presumably you either need to prevent the upgrade from
> happening or else accept a broken app. Can the same mechanism not also
> be used to prevent upgrades using a multi-lib scheme?
The case for handling something like this is: Don't do it.  When you package
something for Fedora (or any distro), you provide an implicit guarantee that it
will run (or fail gracefully) on all supported systems.  You can add support for
systems as you go forward, but you can't deprecate support for systems within a
major release.  That is to say, if something runs on F20 now, it's got to keep
running on F20 for the lifetime of F20.  If it stops running, that's a
regression, the user opens a bug and you fix it.

The DPDK is way off the reservation in regards to this.  Application packages,
as a general rule don't build with specific cpu features in mind, because
performance, while important, isn't on the same scale as what you're trying to do
in the dpdk.  A process getting scheduled off the cpu while we handle an
interrupt wipes out any speedup gains made by any micro-optimizations, so there's
no point in doing so.  The DPDK is different, I understand that, but the
drawback is that it (the DPDK) needs to make optimizations that really aren't
considered particularly important to the rest of user space.  I'm trying to
opportunistically make the DPDK as fast as possible, but I need to do it in a
single binary that works on a lowest common denominator system.

> > 
> > 2) Debugging - It's going to be near impossible to support an application built
> > with a package put together this way, because you'll never be sure as to which
> > version of the library was running when the crash occurred.  You can figure it
> > out for certain, but for support/development people to need to remember to
> > figure this out is going to be a major turn off for them, and the result will be
> > that they simply won't use the dpdk.  It's anathema to the expectations of linux
> > user space.
> Sorry, I just don't see this as being any harder to support than
> multiple code paths for the same functionality. In fact, it will surely make
> debugging easier, since you only have the one code path, just compiled
> up in different ways.

Well, then by all means, become a Fedora packager, and you can take over the DPDK
maintenance there :).  Until then, you'll just have to trust me.  If you have
multiple optional code paths (especially if they're limited to isolated features)
it's manageable.  But regardless of how you look at it, building the same
source multiple times with different cpu support means completely different
binaries.  The assembly and optimization are just plain different.  They may be
close, but they're not the same, and they need to be QA-ed independently.  With
a single build and optional code paths, all the common code is executed no
matter what system you're running on, and it's always the same. Multiple builds
with different instruction support means that code that is identical at a source
level may well be significantly different at a binary level, and that's not
something I can sanely manage in a general purpose environment.

> > 3) QA - Building multiple versions of a library means needing to QA multiple
> > versions of a library.  If you have to have 4 builds to support different levels
> > of optimization, you've created a 4x increase in the amount of testing you need
> > to do to ensure consistent behavior.  You need to be aware of how many different
> > builds are available in the single rpm at all times, and find systems on which
> > to QA which will ensure that all of the builds get tested (as they are in fact,
> > unique builds).  While you may not hit all code paths in a single build, you
> > will at least test all the common paths.
> Again, the exact same QA conditions will also apply to an approach using
> multiple code paths bundled into the same library. Given a choice
> between one code path with multiple compiles, vs multiple code paths
> each compiled only once, the multiple code paths option leaves far
> greater scope for bugs, and when bugs do occur means that you always
> have to find out what specific hardware it was being run on. Using the
> exact same code multiply compiled, the vast, vast majority of bugs are
> going to occur across all platforms and systems so you should rarely
> need to ask what the specific platform being used is.

No, they won't (see above).  Enabling instructions lets the compiler
emit and optimize common paths differently, so identical source code will lead
to different binary code.  I need to have a single binary so that I know what
I'm working with when someone opens a bug.  I don't have that using a multiple
binary approach.  At least with multiple runtime paths (especially/specifically
with the run time paths we've been discussing, the ixgbe rx vector path and the
acl library, which are isolated), I know that, if I get a bug report and the
backtrace ends in either location, I know I'm specifically dealing with that
code.  With your multiple binary approach, if I get a crash in, say
rte_eal_init, I need to figure out if this crash happened in the sse3 compiled
binary, the sse4.2 compiled binary, the avx binary, the avx512 binary, or the
core2 binary.  You can say that's easy, but it's easy to say that when you're not
the one that has to support it.

> > 
> > The bottom line is that Distribution packaging is all about consistency and
> > commonality.  If you install something for an arch on multiple systems, it's the
> > same thing on each system, and it works in the same way, all the time.  This
> > strategy breaks that.  That's why we do run time checks for things.
> If you want to have the best tuned code running for each instruction
> set, then commonality and consistency goes out the window anyway,
So, this is perhaps where communication is breaking down.  I don't want to have the
best tuned code running for each instruction set.  What I want is for the dpdk
to run on a lowest common denominator platform, and be able to opportunistically
take advantage of accelerated code paths that require advanced cpu features.

Let's take the ixgbe code as an example. Note I didn't add any code paths there,
at all (in fact I didn't add any anywhere).  The ixgbe rx_burst method gets set
according to compile time configuration.  You can pick the bulk_alloc rx method,
or the vectorized rx method at compile time (or some others I think, but that's
not relevant). As it happened the vectorized rx path option had an implicit
dependency on SSE4.2.  Instead of requiring that all cpus that run the dpdk have
SSE4.2, I instead chose to move that compile time decision to a run time
decision, by building only the vectorized path with sse4.2 and only using it if
we see that the cpu supports sse4.2 at run time.  No new paths created, no new
support requirements, you're still supporting the same options upstream, the
only difference is I was able to include them both in a single binary.  That's
better for our end users because the single binary still works everywhere.
That's better for our QA group because, for whatever set of tests they perform,
they only need an sse4.2 enabled system to test the one isolated path for that
vector rx code.  The rest of their tests can be conducted once, on any system,
because the binary is exactly the same.  If we compile multiple binaries,
testing on one system doesn't mean we've tested all the code.

> because two different machines calling the same function are going to
> execute different sets of instructions. The decision then becomes:
But that's not at all what I wanted.  I want two different machines calling the
same function to execute the same instructions 99.9999% of the time.  The only
time I want to diverge from that is in isolated paths where we can take
advantage of a feature that we otherwise could not (i.e. the ixgbe and acl
code).  I look at it like the alternatives code in linux.  There are these
isolated areas where you have limited bits of code that at run time are
re-written to use available cpu features.  99.9% of the code is identical, but
in these little spots it's ok to diverge from similarity because they're isolated
and easily identifiable.

> a) whether you need multiple sets of instructions - if no then you pay
> with lack of performance
> b) how you get those multiple sets of instructions
> c) how you validate those multiple sets of instructions.
> As is clear by now :-), my preference by far is to have multiple sets of
> instructions come from a single code base, as less code means less
> maintenance, and above all, fewer bugs. If that can't be done, then we
> need to look carefully at each code path being added and do a
> cost-benefit analysis on it.

Yes, it's quite clear :), I think it's equally clear that I need a single binary,
and would like to opportunistically enhance it where possible without losing the
fact that its a single binary.

I suppose it's all somewhat moot at this point though.  The reduction to sse3 for
ixgbe seems agreeable to everyone, and it lets me preserve single binary builds
there.  I'm currently working on the ACL library; as you noted that's a tougher
nut to crack.  I think I'll have it done early next week (though I'm sure my
translation of the instruction set reference to C will need some thorough testing
:)).  I'll post it when it's ready.

> Regards,
> /Bruce
> > 
> > Neil
> > 
> > > > 
> > > > When packaging software, the only consideration given to code variance at package
> > > > time is architecture (x86/x86_64/ppc/s390/etc).  If you install a package for
> > > > a given architecture, it's expected to run on that architecture.  Optional
> > > > code paths are just that, optional, and executed based on run time tests.  It's a
> > > > requirement that we build for the lowest common denominator system that is
> > > > supported, and enable accelerative code paths optionally at run time when the
> > > > cpu indicates support for them.
> > > > 
> > > > Neil
> > > > 
> > > 
