[dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching behavior

Yang, Zhiyong zhiyong.yang at intel.com
Sat Jan 21 05:07:16 CET 2017



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Friday, January 20, 2017 7:25 PM
> To: Andrew Rybchenko <arybchenko at solarflare.com>; Yang, Zhiyong
> <zhiyong.yang at intel.com>; dev at dpdk.org
> Cc: thomas.monjalon at 6wind.com; Richardson, Bruce
> <bruce.richardson at intel.com>
> Subject: RE: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching
> behavior
> 
> >
> > From: Andrew Rybchenko [mailto:arybchenko at solarflare.com]
> > Sent: Friday, January 20, 2017 10:26 AM
> > To: Yang, Zhiyong <zhiyong.yang at intel.com>; dev at dpdk.org
> > Cc: thomas.monjalon at 6wind.com; Richardson, Bruce
> > <bruce.richardson at intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev at intel.com>
> > Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD
> > batching behavior
> >
> > On 01/20/2017 12:51 PM, Zhiyong Yang wrote:
> > The rte_eth_tx_burst() function in rte_ethdev.h is invoked by DPDK
> > applications to transmit output packets on an output queue, as follows.
> >
> > static inline uint16_t
> > rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
> >                  struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
> >
> > Note: the fourth parameter, nb_pkts, is the number of packets to transmit.
> > The rte_eth_tx_burst() function returns the number of packets it actually
> > sent. A return value equal to *nb_pkts* means that all packets have been
> > sent, and this is likely to signify that other output packets could be
> > immediately transmitted again. Applications that implement a "send as many
> > packets to transmit as possible" policy can check this specific case and
> > keep invoking the rte_eth_tx_burst() function until a value less than
> > *nb_pkts* is returned.
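
A minimal sketch of that application-level policy might look as follows
(illustration only; the helper name send_all and the surrounding setup are
assumptions, not part of the DPDK API):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

/* Keep calling rte_eth_tx_burst() while it accepts the full request;
 * stop as soon as it returns fewer packets than were asked for. */
static inline uint16_t
send_all(uint8_t port_id, uint16_t queue_id,
         struct rte_mbuf **pkts, uint16_t nb_pkts)
{
        uint16_t sent = 0;

        while (sent < nb_pkts) {
                uint16_t req = nb_pkts - sent;
                uint16_t n = rte_eth_tx_burst(port_id, queue_id,
                                              &pkts[sent], req);

                sent += n;
                if (n < req) /* TX queue full, or a PMD per-call limit hit */
                        break;
        }
        return sent;
}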
> >
> > When you call TX only once via rte_eth_tx_burst(), you may get different
> > behaviors from different PMDs. One problem every DPDK user has to face is
> > that, whether necessary or not, they need to take this policy into account
> > at the application level when using any specific PMD to send packets. This
> > adds usage complexity and easily confuses DPDK users, since they have to
> > learn the TX limits of each specific PMD and handle its particular return
> > value: the number of packets transmitted successfully. Some PMD TX
> > functions can send at most 32 packets per invocation, some have a limit of
> > 64 packets per call, and others are implemented to send as many packets as
> > possible, etc. This easily leads to incorrect usage by DPDK users.
> >
> > This patch proposes to implement the above policy in the DPDK library in
> > order to simplify the application implementation and avoid incorrect
> > invocation as well. DPDK users then no longer need to consider the
> > implementation policy or write duplicated code at the application level
> > when sending packets. In addition, users don't need to know the
> > differences between specific PMD TX routines: they can transmit an
> > arbitrary number of packets when invoking the TX API rte_eth_tx_burst(),
> > then check the return value to get the number of packets actually sent.
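
With the proposed behavior, the loop sketched earlier would not be needed at
the application level; a single call would be enough (sketch, assuming
port_id, queue_id and pkts are already set up):

        uint16_t nb_tx = rte_eth_tx_burst(port_id, queue_id, pkts, nb_pkts);
        /* nb_tx < nb_pkts would then only mean the TX queue is full,
         * never that a PMD-specific per-call limit was hit. */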
> >
> > How can the policy be implemented in the DPDK library? Two solutions are
> > proposed below.
> >
> > Solution 1:
> > Implement wrapper functions that remove these limits for each specific
> > PMD, as i40e_xmit_pkts_simple and ixgbe_xmit_pkts_simple already do.
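
For reference, the wrapper pattern is roughly the following (a simplified,
hypothetical sketch: xmit_fixed_burst() and PMD_TX_MAX_BURST stand in for a
PMD's size-limited core TX routine and its per-call limit; they are not real
DPDK symbols):

#include <rte_branch_prediction.h>
#include <rte_common.h>
#include <rte_mbuf.h>

/* Placeholders for the sketch: the PMD's size-limited TX routine and limit. */
#define PMD_TX_MAX_BURST 32
static uint16_t xmit_fixed_burst(void *txq, struct rte_mbuf **pkts,
                                 uint16_t nb_pkts);

static uint16_t
pmd_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
{
        uint16_t nb_tx = 0;

        /* Fast path: the request already fits within the per-call limit. */
        if (likely(nb_pkts <= PMD_TX_MAX_BURST))
                return xmit_fixed_burst(tx_queue, tx_pkts, nb_pkts);

        /* Split an oversized request into chunks the core routine accepts. */
        while (nb_pkts) {
                uint16_t num = (uint16_t)RTE_MIN(nb_pkts, PMD_TX_MAX_BURST);
                uint16_t ret = xmit_fixed_burst(tx_queue, &tx_pkts[nb_tx], num);

                nb_tx = (uint16_t)(nb_tx + ret);
                nb_pkts = (uint16_t)(nb_pkts - ret);
                if (ret < num) /* TX ring full: stop early */
                        break;
        }
        return nb_tx;
}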
> >
> > > IMHO, this solution is a bit better since it:
> > > 1. Does not affect other PMDs at all
> > > 2. Could be a bit faster for the PMDs which require it, since there is
> > >    no indirect function call on each iteration
> > > 3. No ABI change
> 
> I would also prefer solution number 1, for the reasons outlined by Andrew
> above.
> Also, IMO the current limitations on the number of packets to TX in some
> Intel PMD TX routines are somewhat artificial:
> - they are not caused by any real HW limitations
> - avoiding them at the PMD level shouldn't cause any performance or
> functional degradation.
> So I don't see any good reason why, instead of fixing these limitations in
> our own PMDs, we are trying to push them to the upper (rte_ethdev) layer.
> 
> Konstantin

Solution 1 indeed has advantages, as Andrew and Konstantin said.

Zhiyong 

