[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

Zhou, Danny danny.zhou at intel.com
Mon Sep 15 21:11:37 CEST 2014


> -----Original Message-----
> From: John W. Linville [mailto:linville at tuxdriver.com]
> Sent: Tuesday, September 16, 2014 1:48 AM
> To: Neil Horman
> Cc: Zhou, Danny; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
> 
> On Mon, Sep 15, 2014 at 12:22:44PM -0400, Neil Horman wrote:
> > On Mon, Sep 15, 2014 at 03:43:07PM +0000, Zhou, Danny wrote:
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > Sent: Monday, September 15, 2014 11:10 PM
> > > > To: Zhou, Danny
> > > > Cc: John W. Linville; dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
> > > >
> > > > On Fri, Sep 12, 2014 at 08:35:47PM +0000, Zhou, Danny wrote:
> > > > > > -----Original Message-----
> > > > > > From: John W. Linville [mailto:linville at tuxdriver.com]
> > > > > > Sent: Saturday, September 13, 2014 2:54 AM
> > > > > > To: Zhou, Danny
> > > > > > Cc: dev at dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices
> > > > > >
> > > > > > On Fri, Sep 12, 2014 at 06:31:08PM +0000, Zhou, Danny wrote:
> > > > > > > I am concerned about its performance caused by too many
> > > > > > > memcpy() calls. Specifically, on the Rx side, the kernel NIC driver copies
> > > > > > > packets into an skb, then af_packet copies packets into the AF_PACKET buffer
> > > > > > > which is mapped to user space, and then those packets are copied
> > > > > > > into DPDK mbufs. Another 3 copies are needed on the Tx side. So to run a
> > > > > > > simple DPDK L2/L3 forwarding benchmark, each packet needs 6
> > > > > > > copies, which brings a significant negative performance impact. We
> > > > > > > had a bifurcated driver prototype that can do zero-copy and achieve
> > > > > > > native DPDK performance, but it depends on base driver and AF_PACKET
> > > > > > > code changes in the kernel; John R will be presenting it at the coming Linux
> > > > > > > Plumbers Conference. Once the kernel adopts it, the relevant PMD will be
> > > > > > > submitted to dpdk.org.
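
To make the Rx-side copies being counted here concrete, below is a minimal sketch of an AF_PACKET-style receive burst: it walks the mmap()ed TPACKET_V2 ring and copies each frame into an rte_mbuf. The queue structure and function names are illustrative only and are not taken from the posted patch.

#include <stdint.h>
#include <linux/if_packet.h>   /* struct tpacket2_hdr, TP_STATUS_* */
#include <rte_mbuf.h>
#include <rte_memcpy.h>
#include <rte_mempool.h>

struct pkt_rx_queue {                   /* illustrative per-queue state */
	uint8_t *ring;                  /* base of the mmap()ed RX ring */
	unsigned int frame_size;        /* tp_frame_size used at ring setup */
	unsigned int frame_num;         /* number of frames in the ring */
	unsigned int next;              /* next frame slot to inspect */
	struct rte_mempool *mb_pool;    /* mbuf pool for this queue */
};

static uint16_t
af_packet_rx_burst_sketch(struct pkt_rx_queue *q,
			  struct rte_mbuf **bufs, uint16_t nb_pkts)
{
	uint16_t num_rx = 0;

	while (num_rx < nb_pkts) {
		struct tpacket2_hdr *hdr = (struct tpacket2_hdr *)
			(q->ring + (size_t)q->next * q->frame_size);

		/* frame still owned by the kernel: nothing more to pull */
		if (!(hdr->tp_status & TP_STATUS_USER))
			break;

		struct rte_mbuf *m = rte_pktmbuf_alloc(q->mb_pool);
		if (m == NULL)
			break;

		/* the third RX copy: AF_PACKET ring frame -> DPDK mbuf */
		rte_memcpy(rte_pktmbuf_mtod(m, void *),
			   (uint8_t *)hdr + hdr->tp_mac, hdr->tp_len);
		rte_pktmbuf_pkt_len(m) = hdr->tp_len;
		rte_pktmbuf_data_len(m) = hdr->tp_len;

		/* hand the ring slot back to the kernel and advance */
		hdr->tp_status = TP_STATUS_KERNEL;
		q->next = (q->next + 1) % q->frame_num;
		bufs[num_rx++] = m;
	}
	return num_rx;
}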
> > > > > >
> > > > > > Admittedly, this is not as good a performer as most of the existing
> > > > > > PMDs.  It serves a different purpose, after all.  FWIW, you did
> > > > > > previously indicate that it performed better than the pcap-based PMD.
> > > > >
> > > > > Yes, slightly higher, but it makes no big difference.
> > > > >
> > > > Do you have numbers for this?  It seems to me faster is faster as long as it's
> > > > statistically significant.  Even if it's not, John's AF_PACKET PMD has the ability
> > > > to scale to multiple CPUs more easily than the pcap PMD, as it can make use of
> > > > the AF_PACKET fanout feature.
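
(For reference, the fanout feature being referred to is the PACKET_FANOUT socket option: each PMD queue opens its own AF_PACKET socket and joins the same fanout group, and the kernel then spreads incoming flows across those sockets by hash. A minimal, self-contained sketch of that setup, independent of the posted patch:)

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>          /* htons */
#include <net/if.h>             /* if_nametoindex */
#include <linux/if_ether.h>     /* ETH_P_ALL */
#include <linux/if_packet.h>    /* sockaddr_ll, PACKET_FANOUT */

/* Open one AF_PACKET socket bound to `ifname` and join fanout group
 * `group_id`.  Calling this once per queue/core lets the kernel hash
 * incoming flows across all member sockets (PACKET_FANOUT_HASH).
 * Requires CAP_NET_RAW. */
static int
open_fanout_socket(const char *ifname, uint16_t group_id)
{
	int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (fd < 0)
		return -1;

	struct sockaddr_ll addr;
	memset(&addr, 0, sizeof(addr));
	addr.sll_family = AF_PACKET;
	addr.sll_protocol = htons(ETH_P_ALL);
	addr.sll_ifindex = if_nametoindex(ifname);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}

	/* low 16 bits: group id, high bits: fanout mode */
	int fanout_arg = group_id | (PACKET_FANOUT_HASH << 16);
	if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT,
		       &fanout_arg, sizeof(fanout_arg)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

As I understand the posted patch, it sets up one such socket (with an mmap()ed TPACKET ring) per queue pair, which is where the multi-queue scaling the pcap PMD lacks comes from.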
> > >
> > > For 64B small packets, 1.35M pps with 1 queue.
> > Why did you only test with a single queue?  Multiqueue operation was one of the
> > big advantages of the AF_PACKET-based PMD.  I would expect a single-queue setup
> > to perform in a very similar fashion to the pcap PMD.
> >
> > > As both pcap and AF_PACKET PMDs depend on interrupt-based NIC kernel drivers, none of the DPDK
> > > performance optimization techniques are utilized. Why should DPDK adopt two similar and poorly
> > > performing PMDs which cannot demonstrate DPDK's key value of "high performance"?
> > Several reasons:
> > * "High performance" isn't always the key need for end users.  Consider
> > pre-hardware availablity development phase.
> >
> > * Better hardware modeling (consider AF_PACKETS multiqueue abiltiy)
> >
> > * Better scaling (pcap doesn't make use of the fanout features that AF_PACKET
> > does)
> >
> > * Space savings, Building the AF_PACKET pmd doesn't require the additional
> > building/storage of the pcap driver.
> 
> This would include not requiring a dependency on libpcap, if nothing else.

librte_pmd_pcap and librte_pmd_packet are both DPDK wrapper libraries, on top of the libpcap library and the AF_PACKET module respectively,
so they were not designed for high performance, which is entirely understandable. DPDK is moving toward opening up to a larger public of data center
consumers who do not care about very high performance, so from that angle it makes sense to me to adopt librte_pmd_packet.

> 
> > >
> > > >
> > > > > > I look forward to seeing the changes you mention -- they sound very
> > > > > > exciting.  But they will still require both networking core and
> > > > > > driver changes in the kernel.  And as I understand things today,
> > > > > > the userland code will still need at least some knowledge of specific
> > > > > > devices and how they lay out their packet descriptors, etc.  So while
> > > > > > those changes sound very promising, they will still have certain
> > > > > > drawbacks in common with the current situation.
> > > > >
> > > > > Yes, we would like the DPDK performance optimization techniques, such as huge pages, efficient rx/tx routines that manipulate
> > > > > device-specific packet descriptors, and the polling model, to still be usable. We have to trade off between performance and
> > > > > commonality. But we believe it will be much easier to develop a DPDK PMD for non-Intel NICs than to port entire kernel drivers
> > > > > to DPDK.
> > > > >
> > > >
> > > > Not sure how this relates; what you're describing is the feature Intel has been
> > > > working on to augment kernel drivers to provide better throughput via direct
> > > > hardware access to user space.  John's PMD provides ubiquitous functionality on all
> > > > hardware. I'm not sure how the desire for one implies the other isn't valuable.
> > > >
> > >
> > > Performance is the key value of DPDK, rather than commonality. But we are trying to improve the commonality
> > > of our solution to make it easily adopted by other NIC vendors.
> > >
> > That's completely irrelevant to the question at hand.  To go with your reasoning,
> > if performance is the key value of the DPDK, then you should remove all driver
> > support save for the most performant hardware you have.  By that same token,
> > you should deprecate the pcap driver in favor of this AF_PACKET driver, because
> > it has shown performance improvement.
> >
> > I'm being facetious, of course, but the facts remain: lack of superior
> > performance from one PMD to the next does not immediately obviate the need for
> > one PMD over another, as they quite likely address differing needs.  As you note,
> > the DPDK seeks performance as a key goal, but it's an open source project; there
> > are other needs from other users in play here.  The AF_PACKET PMD provides
> > superior performance on Linux platforms when hardware independence is required.
> > It differs from the pcap PMD in that it uses features that are only available on the
> > Linux platform, so it stands to reason we should have both.
> 
> IMHO, the biggest deficiency in DPDK is the lack of apps.  Let's face
> it, no one really cares about running l2fwd except for testing the
> drivers.  What people want is applications.  Providing a PMD to use
> while developing an app without requiring specific hardware seems like
> a win to me.  The pcap PMD addresses some of that, but it is more of
> a stop-gap or special purpose thing (like for playing back captures).
> 

That is not true for network middle boxes that solve L2/L3 packet processing problems (the main problem DPDK was born to solve),
but it might be true for data center or endpoint applications that primarily focus on L4-L7 packet processing, which
do not care very much about L2/L3 throughput and packet latency, as the system performance bottleneck lies in the L4-L7 routines.

> > > > > > It seems like the changes you mention will still need some sort of
> > > > > > AF_PACKET-based PMD driver.  Have you implemented that completely
> > > > > > separate from the code I already posted?  Or did you add that work
> > > > > > on top of mine?
> > > > > >
> > > > >
> > > > > For userland code, it certainly uses some of your code related to raw sockets, but highly modified. A layer will be added into
> > > > > the eth_dev library to do device probing and to support new socket options.
> > > > >
> > > >
> > > > Ok, but again, PMDs are independent and serve different needs.  If their use
> > > > is at all overlapping from a functional standpoint, take this one now, and
> > > > deprecate it when a better one comes along.  Though from your description it
> > > > seems like both have a valid place in the ecosystem.
> > > >
> > >
> > > I am ok with this approach, as long as this AF_PACKET PMD does not add extra maintenance effort. Thomas might make the call.
> > >
> > What extra maintainer effort do you think is required here that wouldn't be
> > required for any PMD?  To suggest that a given PMD shouldn't be included because
> > it would require additional effort to maintain holds it to a higher standard
> > than the PMDs already included.  I don't recall anyone asking if the i40e or
> > bonding PMDs would require additional effort before being integrated.
> 
> Right -- how much maintainer effort is put into the pcap driver
> these days?

I do not know the details, but I DO know the validation guys need to put a lot of effort into measuring its performance on different platforms.
An automated functional and performance test suite could probably help a lot.

> 
> John
> --
> John W. Linville		Someday the world will need a hero, and you
> linville at tuxdriver.com			might be all we have.  Be ready.

