[dpdk-dev] [PATCH v2 3/4] examples: example showing use of callbacks.

Neil Horman nhorman at tuxdriver.com
Tue Feb 17 17:08:10 CET 2015


On Tue, Feb 17, 2015 at 04:00:56PM +0000, Bruce Richardson wrote:
> On Tue, Feb 17, 2015 at 10:49:24AM -0500, Neil Horman wrote:
> > On Tue, Feb 17, 2015 at 01:50:58PM +0000, Bruce Richardson wrote:
> > > On Tue, Feb 17, 2015 at 02:28:02PM +0100, Olivier MATZ wrote:
> > > > Hi Bruce,
> > > > 
> > > > On 02/17/2015 01:25 PM, Bruce Richardson wrote:
> > > > >On Mon, Feb 16, 2015 at 06:34:37PM +0100, Thomas Monjalon wrote:
> > > > >>2015-02-16 15:16, Bruce Richardson:
> > > > >>>In this specific instance, given that the application does little else, there
> > > > >>>is no real advantage to using the callbacks - it's just to have a simple example
> > > > >>>of how they can be used.
> > > > >>>
> > > > >>>Where callbacks are really designed to be useful, is for extending or augmenting
> > > > >>>hardware capabilities. Taking the example of sequence numbers - to use the most
> > > > >>>trivial example - an application could be written to take advantage of sequence
> > > > >>>numbers written to packets by the hardware which received them. However, if such
> > > > >>>an application was to be used with a NIC which does not provide sequence numbering
> > > > >>>capability, for example, anything using ixgbe driver, the application writer has
> > > > >>>two choices - either modify his application code to check each packet for
> > > > >>>a sequence number in the data path, and add it there post-rx, or alternatively,
> > > > >>>to check the NIC capabilities at initialization time, and add a callback there
> > > > >>>at initialization, if the hardware does not support it. In the latter case,
> > > > >>>the main packet processing body of the application can be written as though
> > > > >>>hardware always has sequence numbering capability, safe in the knowledge that
> > > > >>>any hardware not supporting it will be back-filled by a software fallback at
> > > > >>>initialization-time.
> > > > >>>
> > > > >>>By the same token, we could also look to extend hardware capabilities. For
> > > > >>>different filtering or hashing capabilities, there can be limits in hardware
> > > > >>>which are far less than what we need to use in software. Again, callbacks will
> > > > >>>allow the data path to be written in a way that is oblivious to the underlying
> > > > >>>hardware limits, because software will transparently fill in the gaps.
> > > > >>>
> > > > >>>Hope this makes the use case clear.
> > > > >>
> > > > >>After thinking more about these callbacks, I realize these callbacks won't
> > > > >>help, as Olivier said.
> > > > >>
> > > > >>With callback,
> > > > >>1/ application checks device capability
> > > > >>2/ application provides hardware emulation as DPDK callback
> > > > >>3/ application forgets previous steps
> > > > >>4/ application calls DPDK Rx
> > > > >>5/ DPDK calls callback (without calling optimization)
> > > > >>
> > > > >>Without callback,
> > > > >>1/ application checks device capability
> > > > >>2/ application provides hardware emulation as internal function
> > > > >>3/ application sets an internal device-flag to enable this function
> > > > >>4/ application calls DPDK Rx
> > > > >>5/ application calls the hardware emulation if flag is set
> > > > >>
> > > > >>So the only difference is to keep persistent the device information in
> > > > >>the application instead of storing it as a function pointer in the
> > > > >>DPDK struct.
> > > > >>You can also be faster with this approach: at initialization time,
> > > > >>you can check that your NIC supports the feature and use a specific
> > > > >>mainloop that adds or not the sequence number without any runtime
> > > > >>test.
> > > > >
> > > > >That is assuming that all NICs are equal on your system. It's also assuming
> > > > >that you only have a single point in your application where you call RX or
> > > > >TX burst. In the case where you have a couple of different NICs on the system,
> > > > >or where you want to write an application to take advantage of capabilities of
> > > > >different NICs, the ability to resolve all these differences at initialization
> > > > >time is useful. The main packet handling code can be written with just the
> > > > >processing of packets in mind, rather than having to have a set of branches
> > > > >after each RX burst call, or before each TX burst call, to "smooth out" the
> > > > >different NIC capabilities.
> > > > >
> > > > >As for the option of maintaining different main loops for different NICs with
> > > > >different capabilities - that sounds like a maintenance nightmare to
> > > > >me, due to duplicated code! Callbacks are a far cleaner solution than that IMHO.
> > > > 
> > > > Why not just provide a function like this:
> > > > 
> > > >   rte_do_unsupported_stuff_by_software(m[], m_count, wanted_features,
> > > >   	dev_feature_flags)
> > > > 
> > > > This function can be called (or not) from the application mainloop.
> > > > You don't need to maintain several mainloops (for each device) as
> > > > the specific work will be done depending on the given flags. And the
> > > > applications that do not require these features (most applications?)
> > > > are not penalized at all.
> > > 
> > > Have you measured the performance hit due to this proposed change? In my tests
> > > it's very, very small, even for the fastest vectorized path. If performance is
> > > a real concern, I'm happy enough to have this as a compile-time option so that
> > > those who can't take the small performance hit can avoid it.
> > > 
> > How can you assert performance metrics on a patch like this?  The point of the
> > change is to allow a callback to an application-defined function, the contents
> > of which are effectively arbitrary.  Not saying that it's the wrong thing to do,
> > but you can't really claim performance is not impacted, because the details of
> > what's executed are outside your purview.
> > Neil
> >
> I think the performance hit being referenced is a hit due to the patch itself
> without any callbacks being in use. (That was certainly my assumption in replying)
> 
I figured it was, but that's still something of a misnomer.  Of course this
change on its own is negligible in its performance impact.  By itself, the
impact is that of a branch that is unlikely to be taken, which is to say almost
zero.  But that's not an actionable number, because the only time that
performance is attainable is if the user doesn't use the callbacks.  Since
you're posting a patch that invokes application-registered callbacks in a very
fast path, I think it's important to state very clearly that these callbacks
will have a significant performance impact that individual applications will
have to measure and be cognizant of.
Neil
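
To make the two costs concrete, here is a rough sketch of the pattern under
discussion. It is not the code from the patch: struct pkt, struct rx_queue,
driver_rx(), rx_burst() and add_seqn() are hypothetical stand-ins, and the
unlikely() macro assumes a GCC/Clang-style __builtin_expect. With no callback
registered, the only added work is the NULL test. Once a callback is
registered, the cost is whatever that function does for every burst.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not DPDK types or API. */
struct pkt { uint32_t seqn; };

typedef uint16_t (*rx_cb_fn)(struct pkt **pkts, uint16_t nb_pkts, void *arg);

struct rx_queue {
    rx_cb_fn cb;       /* NULL when the application registered nothing */
    void *cb_arg;      /* opaque state handed back to the callback */
    struct pkt *ring;  /* packets the fake "driver" hands out */
    uint16_t avail;    /* number of packets in ring */
};

/* Assumes a GCC/Clang-style builtin for the branch hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Fake driver receive: returns pointers to the first packets in the ring. */
static uint16_t driver_rx(struct rx_queue *q, struct pkt **pkts, uint16_t n)
{
    uint16_t cnt = n < q->avail ? n : q->avail;
    for (uint16_t i = 0; i < cnt; i++)
        pkts[i] = &q->ring[i];
    return cnt;
}

/* Rx entry point: one predictable branch when no callback is set. */
uint16_t rx_burst(struct rx_queue *q, struct pkt **pkts, uint16_t n)
{
    uint16_t nb = driver_rx(q, pkts, n);
    if (unlikely(q->cb != NULL))
        nb = q->cb(pkts, nb, q->cb_arg);  /* cost is application-defined */
    return nb;
}

/* Example callback: software fallback that stamps sequence numbers. */
uint16_t add_seqn(struct pkt **pkts, uint16_t nb_pkts, void *arg)
{
    uint32_t *next = arg;
    for (uint16_t i = 0; i < nb_pkts; i++)
        pkts[i]->seqn = (*next)++;
    return nb_pkts;
}
```

An application would set q.cb = add_seqn at initialization only for queues
whose hardware lacks sequence numbering. Any measurement taken with
q.cb == NULL says nothing about the cost once add_seqn, or something heavier,
is installed.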

> /Bruce
> 
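
For comparison, the alternative raised earlier in the thread (a helper that
the application calls itself from its main loop, driven by feature flags)
might look roughly like the sketch below. Everything here is an assumption
made for illustration: FEAT_SEQN, struct pkt, the file-scope counter and the
name do_unsupported_by_software() are invented, and only the argument order
follows the prototype quoted above.

```c
#include <stdint.h>

/* Hypothetical packet type and feature bit; not DPDK definitions. */
struct pkt { uint32_t seqn; };

#define FEAT_SEQN (1u << 0)

/* File-scope counter keeps the sketch short; real code would be per-port. */
static uint32_t next_seqn;

/*
 * Apply in software every wanted feature that the device does not provide.
 * The application calls this explicitly after its Rx burst.
 */
void do_unsupported_by_software(struct pkt **m, uint16_t m_count,
                                uint32_t wanted_features,
                                uint32_t dev_feature_flags)
{
    uint32_t missing = wanted_features & ~dev_feature_flags;

    if (missing & FEAT_SEQN) {
        for (uint16_t i = 0; i < m_count; i++)
            m[i]->seqn = next_seqn++;
    }
}
```

The work done per packet is the same as in a callback. The difference is
where the decision is stored: in flags the application keeps and passes in,
rather than in a function pointer held inside the Rx path.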

