[dpdk-dev] [RFC] ethdev: rte_eth_rx_burst() requirements for nb_pkts

Morten Brørup mb at smartsharesystems.com
Thu Aug 27 11:31:15 CEST 2020


> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> Sent: Thursday, August 27, 2020 11:10 AM
> 
> On Thu, Aug 27, 2020 at 10:40:11AM +0200, Morten Brørup wrote:
> > Jeff and Ethernet API maintainers Thomas, Ferruh and Andrew,
> >
> > I'm hijacking this patch thread to propose a small API modification
> that prevents unnecessarily performance degradations.
> >
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jeff Guo
> > > Sent: Thursday, August 27, 2020 9:55 AM
> > >
> > > The limitation of burst size in vector rx was removed, since it
> should
> > > retrieve as much received packets as possible. And also the
> scattered
> > > receive path should use a wrapper function to achieve the goal of
> > > burst maximizing.
> > >
> > > This patch set aims to maximize vector rx burst for for
> > > ixgbe/i40e/ice/iavf PMDs.
> > >
> >
> > Now I'm going to be pedantic and say that it still doesn't conform to
> the rte_eth_rx_burst() API, because the API does not specify any
> minimum requirement for nb_pkts.
> >
> > In theory, that could also be fixed in the driver by calling the non-
> vector function from the vector functions if nb_pkts is too small for
> the vector implementation.
> >
> > However, I think that calling rte_eth_rx_burst() with a small nb_pkts
> is silly and not in the spirit of DPDK, and introducing an additional
> comparison for a small nb_pkts in the driver vector functions would
> degrade their performance (only slightly, but anyway).
> >
> 
> Actually, I'd like to see a confirmed measurement showing a slowdown
> before
> we discard such an option. :-)

Good point!

> While I agree that using small bursts is
> not
> keeping with the design approach of DPDK of using large bursts to
> amortize
> costs and allow prefetching, there are cases where a user/app may want
> a
> small burst size, e.g. 4, for latency reasons, and we need a way to
> support
> that.
> 
I assume that calling rte_eth_rx_burst() with nb_pkts=32 returns 4 packets if only 4 packets are available, so you would need to be extremely latency sensitive to call it with a smaller nb_pkts. I guess that high frequency trading is the only real life scenario here.

> Since the path selection is dynamic, we need to either:
> a) provide a way for the user to specify that they will use smaller
> bursts
> and so that vector functions should not be used
> b) have the vector functions transparently fallback to the scalar ones
> if
> used with smaller bursts
> 
> Of these, option b) is simpler, and should be low cost since any check
> is
> just once per burst, and - assuming an app is written using the same
> request-size each time - should be entirely predictable after the first
> call.
> 
Why does everyone assume that DPDK applications are so simple that the branch predictor will cover the entire data path? I hear this argument over and over again, and by principle I disagree with it!

How about c): add rte_eth_rx() and rte_eth_tx() functions for receiving/transmitting a single packet. The ring library has such functions.

Optimized single-packet functions might even perform better than calling the burst functions with nb_pkts=1. Great for latency focused applications. :-)

> /Bruce
> 



More information about the dev mailing list