[dpdk-dev] [RFC] Adding multiple device types to DPDK.
    Neil Horman 
    nhorman at tuxdriver.com
       
    Mon Apr  6 03:48:20 CEST 2015
    
    
  
On Sun, Apr 05, 2015 at 10:20:10PM +0000, Wiles, Keith wrote:
> 
> 
> On 4/5/15, 2:37 PM, "Neil Horman" <nhorman at tuxdriver.com> wrote:
> 
> >On Sat, Apr 04, 2015 at 03:16:08PM +0000, Wiles, Keith wrote:
> >> 
> >> 
> >> On 4/4/15, 8:11 AM, "Neil Horman" <nhorman at tuxdriver.com> wrote:
> >> 
> >> >On Fri, Apr 03, 2015 at 10:32:01PM +0000, Wiles, Keith wrote:
> >> >> Hi Neil,
> >> >> 
> >> >> On 4/3/15, 12:00 PM, "Neil Horman" <nhorman at tuxdriver.com> wrote:
> >> >> 
> >> >> >On Wed, Apr 01, 2015 at 12:44:54PM +0000, Wiles, Keith wrote:
> >> >> >> Hi all, (hoping format of the text is maintained)
> >> >> >> 
> >> >> >> Bruce and myself are submitting this RFC in hopes of providing
> >> >> >>discussion
> >> >> >> points for the idea. Please do not get carried away with the code
> >> >> >> included, it was to help everyone understand the proposal/RFC.
> >> >> >> 
> >> >> >> The RFC is to describe a proposed change we are looking to make to
> >> >>DPDK
> >> >> >>to
> >> >> >> add more device types. We would like to add in to DPDK the idea
> >>of a
> >> >> >> generic packet-device or ?pktdev?, which can be thought of as a
> >>thin
> >> >> >>layer
> >> >> >> for all device classes. For other device types such as
> >>potentially a
> >> >> >> ?cryptodev? or ?dpidev?. One of the main goals is to not effect
> >> >> >> performance and not require any current application to be
> >>modified.
> >> >>The
> >> >> >> pktdev layer is providing a light framework for developers to add
> >>a
> >> >> >>device
> >> >> >> to DPDK.
> >> >> >> 
> >> >> >> Reason for Change
> >> >> >> -----------------
> >> >> >> 
> >> >> >> The reason why we are looking to introduce these concepts to DPDK
> >> >>are:
> >> >> >> 
> >> >> >> * Expand the scope of DPDK so that it can provide APIs for more
> >>than
> >> >> >>just
> >> >> >> packet acquisition and transmission, but also provide APIs that
> >>can
> >> >>be
> >> >> >> used to work with other hardware and software offloads, such as
> >> >> >> cryptographic accelerators, or accelerated libraries for
> >> >>cryptographic
> >> >> >> functions. [The reason why both software and hardware are
> >>mentioned
> >> >>is
> >> >> >>so
> >> >> >> that the same APIs can be used whether or not a hardware
> >>accelerator
> >> >>is
> >> >> >> actually available].
> >> >> >> * Provide a minimal common basis for device abstraction in DPDK,
> >>that
> >> >> >>can
> >> >> >> be used to unify the different types of packet I/O devices already
> >> >> >> existing in DPDK. To this end, the ethdev APIs are a good starting
> >> >> >>point,
> >> >> >> but the ethdev library contains too many functions which are
> >> >> >>NIC-specific
> >> >> >> to be a general-purpose set of APIs across all devices.
> >> >> >>      Note: The idea was previously touched on here:
> >> >> >> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/13545
> >> >> >> 
> >> >> >> Description of Proposed Change
> >> >> >> ------------------------------
> >> >> >> 
> >> >> >> The basic idea behind "pktdev" is to abstract out a few common
> >> >>routines
> >> >> >> and structures/members of structures by starting with ethdev
> >> >>structures
> >> >> >>as
> >> >> >> a starting point, cut it down to little more than a few members in
> >> >>each
> >> >> >> structure then possible add just rx_burst and tx_burst. Then use
> >>the
> >> >> >> structures as a starting point for writing a device type.
> >>Currently
> >> >>we
> >> >> >> have the rx_burst/tx_burst routines moved to the pktdev and it see
> >> >>like
> >> >> >> move a couple more common functions maybe resaonable. It could be
> >>the
> >> >> >> Rx/Tx routines in pktdev should be left as is, but in the code
> >>below
> >> >>is
> >> >> >>a
> >> >> >> possible reason to abstract a few routines into a common set of
> >> >>files.
> >> >> >> 
> >> >> >> >From there, we have the ethdev type which adds in the existing
> >> >> >>functions
> >> >> >> specific to Ethernet devices, and also, for example, a cryptodev
> >> >>which
> >> >> >>may
> >> >> >> add in functions specific for cryptographic offload. As now, with
> >>the
> >> >> >> ethdev, the specific drivers provide concrete implementations of
> >>the
> >> >> >> functionality exposed by the interface. This hierarchy is shown in
> >> >>the
> >> >> >> diagram below, using the existing ethdev and ixgbe drivers as a
> >> >> >>reference,
> >> >> >> alongside a hypothetical cryptodev class and driver implementation
> >> >> >> (catchingly called) "X":
> >> >> >> 
> >> >> >>                     ,---------------------.
> >> >> >>                     | struct rte_pktdev   |
> >> >> >>                     +---------------------+
> >> >> >>                     | rte_pkt_rx_burst()  |
> >> >> >>             .-------| rte_pkt_tx_burst()  |-----------.
> >> >> >>             |       `---------------------'           |
> >> >> >>             |                                         |
> >> >> >>             |                                         |
> >> >> >>   ,-------------------------------.
> >> >>,------------------------------.
> >> >> >>   |    struct rte_ethdev          |    |      struct rte_cryptodev
> >> >> |
> >> >> >>   +-------------------------------+
> >> >>+------------------------------+
> >> >> >>   | rte_eth_dev_configure()       |    |
> >> >>rte_crypto_init_sym_session()|
> >> >> >>   | rte_eth_allmulticast_enable() |    |
> >> >>rte_crypto_del_sym_session() |
> >> >> >>   | rte_eth_filter_ctrl()         |    |
> >> >> |
> >> >> >>   `-------------------------------'
> >> >>`---------------.--------------'
> >> >> >>             |                                          |
> >> >> >>             |                                          |
> >> >> >>   ,---------'---------------------.
> >> >>,---------------'--------------.
> >> >> >>   |    struct rte_pmd_ixgbe       |    |      struct rte_pmd_X
> >> >> |
> >> >> >>   +-------------------------------+
> >> >>+------------------------------+
> >> >> >>   | .configure -> ixgbe_configure |    | .init_session ->
> >> >>X_init_ses()|
> >> >> >>   | .tx_burst  -> ixgbe_xmit_pkts |    | .tx_burst ->
> >> >>X_handle_pkts() |
> >> >> >>   `-------------------------------'
> >> >>`------------------------------'
> >> >> >> 
> >> >> >> We are not attempting to create a real class model here only
> >>looking
> >> >>at
> >> >> >> creating a very basic common set of APIs and structures for other
> >> >>device
> >> >> >> types.
> >> >> >> 
> >> >> >> In terms of code changes for this, we obviously need to add in new
> >> >> >> interface libraries for pktdev and cryptodev. The pktdev library
> >>can
> >> >> >> define a skeleton structure for the first few elements of the
> >>nested
> >> >> >> structures to ensure consistency. Each of the defines below
> >> >>illustrate
> >> >> >>the
> >> >> >> common members in device structures, which gives some basic
> >>structure
> >> >> >>the
> >> >> >> device framework. Each of the defines are placed at the top of the
> >> >> >>devices
> >> >> >> matching structures and allows the devices to contain common and
> >> >>private
> >> >> >> data. The pkdev structures overlay the first common set of members
> >> >>for
> >> >> >> each device type.
> >> >> >> 
> >> >> >
> >> >> >
> >> >> >Keith and I discussed this offline, and for the purposes of
> >> >>completeness
> >> >> >I'll
> >> >> >offer my rebuttal to this proposal here.
> >> >> >
> >> >> >In short, I don't think the segregation of the transmit and receive
> >> >> >routines
> >> >> >into their own separate structure (and ostensibly their own
> >> >>librte_pktdev
> >> >> >library) is particularly valuable.  While it does provide some
> >>minimal
> >> >> >code
> >> >> >savings when new device classes are introduced, the savings are not
> >> >> >significant
> >> >> >(approximlately 0.5kb per device class if the rte_ethdev generic tx
> >> >>and rx
> >> >> >routines are any sort of indicator).  It does however, come with
> >> >> >significant
> >> >> >costs in the sense that it binds a device class to using an I/O
> >>model
> >> >>(in
> >> >> >this
> >> >> >case packet based recieve and transmit) for which the device class
> >>may
> >> >> >not be
> >> >> >suited.
> >> >> 
> >> >> The only reason the we only have a 0.5Kb saving is you are only
> >>looking
> >> >>at
> >> >> moving Rx/Tx routines into pktdev, but what happens if we decide to
> >> >>move a
> >> >> number of common functions like start/stop and others, then you
> >>start to
> >> >> see a much bigger saving.
> >> >You're right of course, but lets just try the math here.  If we assume
> >> >that,
> >> >since all these libraries are just inlined redirection functions, we
> >>can
> >> >guess
> >> >that each one is about .25kb of code (5.kb/2 functions).  So if you add
> >> >two more
> >> >functions you're looking at 1kb of code savings.  Of course if you
> >>start
> >> >doing
> >> >so, you beg the two questions that I've been trying to pose here:
> >> >
> >> >1) How much 'common' API do you want to impose on a device class that
> >>may
> >> >not
> >> >be condusive to the common set
> >> >
> >> >2) At what point does your common API just become an rte_ethdev?
> >> 
> >> You are taking the code size to the extreme, which is not the real point
> >> here. I agree it does not matter in footprint savings as most DPDK
> >>systems
> >> are pretty big. The reason for the common code was to help remove some
> >>of
> >> the code required for the developer to write in the new device type. If
> >> the developer of the new device type writes his own set of Rx/Tx
> >>routines
> >> the community will determine if they are required or do the current APIs
> >> work or do we require them to support both.
> >> 
> >> Please try not getting hung up on the footprint savings and I am sorry
> >>if
> >> I was making this a huge point.
> >> 
> >Hung up?  I'm not the one who made a big deal out of code size reduction
> >being a
> >major benefit there, be it source or binary, its miniscule any way you
> >slice it.
> >I'm happy to drop this though if you are.
> >
> >> > 
> >> >> Do we need this saving, maybe not, but it does
> >> >I would say definately not, given that the DPDK houses entire API sets
> >> >that are
> >> >always-inline, and any savings from the above are easily eclipsed by
> >>the
> >> >fanout
> >> >that occurs from use in multiple call sites.  Lets be honest,
> >>performance
> >> >is the
> >> >DPDK's primary concern.  Code size is not a consideration.  Its only an
> >> >argument
> >> >here because it makes this proposal look favorable.  Its not a bad
> >>thing
> >> >mind
> >> >you, but if this proposal caused any code bloat, it wouldn't even be
> >> >mentioned.
> >> 
> >> In the current design the pktdev APIs are still inline routines only the
> >> debug routines are not.
> >> 
> >Reread my comments above please.  You seem to have misread what I wrote.
> >
> >> >
> >> >> provide a single call API rte_pkt_rx/tx_burst to use instead of the
> >> >> application having to make sure it is calling the correct device
> >>Rx/Tx
> >> >> routines.
> >> >Given that that is a compile time issue, I really don't see why that
> >>is a
> >> >big
> >> >deal.  Especially if you, as I suggest, just use the rte_ethdev as your
> >> >common
> >> >dataplane device model.  Then by virtue of the fact that they're all
> >> >rte_ethdevs, you get common API naming.
> >> >
> >> >> All that is required is passing in the device pointer and it is
> >> >> handled for the application. Bruce added some code below to that
> >>effect.
> >> >Yes, and you have that now, its an rte_ethdev.
> >> >
> >> >> >
> >> >> >To illustrate the difference in design ideas, currenty the dpdk data
> >> >> >pipeline
> >> >> >looks like this:
> >> >> >
> >> >> >+------------+   +----------+   +---------+
> >> >> >|            |   |          |   |         |
> >> >> >|  ARP       |   |  ethdev  |   |         |   +----------+
> >> >> >|  handler   +-->+  api     +-->+  PMD    +-->+ Wire     |
> >> >> >|            |   |          |   |         |   +----------+
> >> >> >|            |   |          |   |         |
> >> >> >+------------+   +----------+   +---------+
> >> >> 
> >> >> You did not add the crypto to this picture as it is in the picture
> >>below
> >> >> to make them the same.
> >> >> 
> >> >No I didn't, because it doesn't exist right now.  Thats what I mean by
> >> >'currently'.  See my description below after you added your drawing.
> >> >
> >> >> +-----+  +---------+  +---------+  +------+
> >> >> |     |  |         |  |         |  |      |  +------+
> >> >> | ARP +--+ ethdev  +--+ crypto  +--+ PMD  +--+ Wire |
> >> >> |     |  |         |  |         |  |      |  +------+
> >> >> +-----+  +---------+  +---------+  +------+
> >> >> 
> >> >> 
> >> >> >
> >> >> >
> >> >> >Where the ARP handler code is just some code that knows how to
> >>manage
> >> >>arp
> >> >> >requests and responses, and only transmits and receives frames
> >> >> >
> >> >> >Keiths idea would introduce this new pktdev handler structure and
> >>make
> >> >>the
> >> >> >dataplane pipeline look like this:
> >> >> >
> >> >> >+------------+ +------------+  +------------+  +--------+
> >> >> >|            | |            |  |            |  |        |
> >> >> >|  ARP       | | pktdev api |  | pktdev_api |  |        |
> >>+---------+
> >> >> >|  handler   +-+            +--+            +--+ PMD    +--+Wire
> >> |
> >> >> >|            | |            |  |            |  |        |
> >>+---------+
> >> >> >|            | |            |  |            |  |        |
> >> >> >+------------+ |            |  |            |  |        |
> >> >> >               |            |  |            |  +--------+
> >> >> >               |            |  |            |
> >> >> >               |            |  |            |
> >> >> >               |            |  |            |
> >> >> >               | rte_ethdev |  | rte_crypto |
> >> >> >               |            |  |            |
> >> >> >               |            |  |            |
> >> >> >               +------------+  +------------+
> >> >> 
> >> >> You are drawing this picture it appears trying to make the pktdev
> >> >>another
> >> >> function call layer when it is just a single macro in the rte_ethdev
> >> >Not trying to imply a new function call layer, just trying to
> >>illustrate
> >> >your
> >> >proposal, that you wish to make rte_ethdevs,and rte_cryptodevs all look
> >> >like
> >> >rte_pktdevs so that the dataplane has a common element for passing
> >>mbufs
> >> >around,
> >> >regardless of its true device class.  Not sure where you see an extra
> >> >function
> >> >call layer here.
> >> 
> >> Please see my comment above as I am not trying to make them all look the
> >> same you are taking my words to the extreme. We need some type of
> >> structure for adding more device types to DPDK and I am just trying to
> >> suggest a few common parts as a solution.
> >> 
> >You see that you're contradicting yourself here, right?  You state in the
> >same
> >paragraph that you are not trying to make all devices look the same, but
> >are
> >trying to find comment parts (the implication being that they will be
> >interacted
> >with in the same way).  I get that you're saying their only going to be
> >'partially' the same, where the overlapping structures are made common in
> >your
> >model, so that code can be written generically, but if thats your goal,
> >then the
> >generic code you propose can only be generic if it only interfaces to the
> >common
> >parts, which means, to the generic application code you envision, every
> >device
> >looks the same.  Thats my point
> >
> >Note I'm not opposed to generic code being able to be written for multiple
> >device types, I'm just asserting that that common device is the
> >rte_ethdev,
> >which we already have.  I'm proposing that dataplane elements are network
> >interfaces codified as rte_ethdevs, and crypto devices are codified as
> >some
> >other API, and should you want crypto functionality, you write a pmd
> >(interfaced
> >to using rte_ethdev), that makes use of the to-be-written crypto library.
> >
> >> Some parts of the structures should be common and possible a few common
> >> routines, yes a big part of each device will be different. The
> >>rte_ethdev
> >> structures is what we have today and was a good starting point, but a
> >>big
> >> part of the code will be different.
> >> 
> >No, they shouldn't.  Reuse existing data structures where possible.  But
> >don't
> >try to take device specific interfaces and data, and where you see
> >commonality,
> >force commonality.  Thats what you're trying to do here.  I'd challenge
> >you to
> >find another implementation of device driver interfaces where what you are
> >proposing is done.  It just doesn't make sense.
> >
> >> Should we allow someone to write a new device type that is completely
> >> different down to typedefs for variable types and all new structures, no
> >> we should not, giving them some guide lines by providing a set of
> >>routines
> >> and structures to start with, by all means lets do it.
> >> 
> >At what point did you read that I said we can't use common data
> >structures?  Of
> >course use them where it makes sense.  Want to use mbufs as a common data
> >type
> >for both API's, fine!  That makes sense.  But don't go trying to mash
> >device
> >specific data types and methods into a common structure for some sort of
> >imaginary savings.
> >
> >> >
> >> >> header pointing to the rte_pktdev tx_burst routine. No function
> >>function
> >> >> overhead as the macro in rte_ethdev changes the rte_eth_tx_burst to
> >> >> rte_pkt_tx_burst routine, which BTW is the same routine that was in
> >> >> rte_ethdev today. The pktdev and ethdev are calling into the PMD tx
> >> >> routine via the dev_ops function pointers structure. Which is also no
> >> >> extra over head.
> >> >> 
> >> >Never suggested it was extra overhead, where did you get that from?
> >>I'm
> >> >just
> >> >illustrating what it is you want to do.  If I didn't get it right I
> >> >apologize,
> >> >please clarify.
> >> 
> >> No need to clarify as we agree and maybe I read too much into your
> >> comments.
> >> >
> >> >> If you are calling the rte_pkt_tx_burst routine directly it just
> >>means
> >> >>you
> >> >> need to get the device pointer to pass instead of the port id value
> >>in
> >> >>the
> >> >> rte_pkt_tx_burst routine. The above turns into something like this:
> >> >> 
> >> >> +-----+  +---------+  +--------+  +-----+
> >> >> |     |  | ethdev  |  |        |  |     |  +------+
> >> >> | ARP +--+ map to  +--+ crypto +--+ PMD  +--+ Wire |
> >> >> |     |  | pktdev  |  |        |  |     |  +------+
> >> >> +-----+  +---------+  +--------+  +-----+
> >> >> 
> >> >> So the path of the data is the same only a macro does a simple
> >>rename of
> >> >> the call to rte_eth_tx_burst routine. If you call the pktdev routine
> >> >> directly then the macro is not used.
> >> >> 
> >> >> 
> >> >Let me quash this by stipulating
> >> >that you are absolutely correct, there is no additional overhead in
> >>your
> >> >design,
> >> >it should provide the exact same performance that an rte_ethdev
> >>currently
> >> >does.
> >> >I have no problem with that.  My point is: It provides the exact same
> >> >performance that an rte_ethdev does, so lets just use rte_ethdev if
> >>we're
> >> >trying
> >> >to model devices in the dataplane.
> >> 
> >> The ethdev model does not give us a different device type it just forces
> >> the new device type to look like an ethdev, this is what I am reading
> >>from
> >> your email here. I do not believe you are suggesting that ethdev look a
> >> like is the answer to a new device type.
> >> 
> >What!?  You need to re-read my last email, I'm not sure where you get the
> >above
> >from.  If you'll look back at the dataplane design, I proposed this:
> >
> >+------------+  +---------+ +---------+  +---------+  +--------+
> >|            |  |         | |         |  |         |  |        |
> >|            |  |ethdev   | | ipsec   |  |ethdev   +--+        |
> >| ARP handler+->+api      +-+ tunnel  +->+api      |  | PMD
> >|            |  |         | | PMD     |  |         |  |        |
> >|            |  |         | |         |  |         |  |        |
> >+------------+  +---------+ +---+-----+  +---------+  +--------+
> >                                |
> >                             +--+-----+
> >                             |        |
> >                             |crypto  |
> >                             |api     |
> >                             |        |
> >                             |        |
> >                             +--------+
> >
> >
> >I'm clearly proposing two device models here:
> >
> >1) A dataplane element model, the interface for which is an rte_ethdev.
> >This
> >allows one to implement multiple dataplane elements and functionality (in
> >the
> >example above an ipsec tunnel).  This is a common and well understood
> >device
> >model that many network stacks use (read: easy for end users to
> >understand)
> >
> >2) A crypto device model, implemented independently, reusing existing
> >data types
> >where possible (read: Appropriate for the device class and alternate use
> >cases)
> >
> >(1) uses (2) to implement ipsec functionality.  if an application just
> >wants to
> >do crypto on some arbitrary data, then the crypto api is better suited to
> >doing
> >so (read: not bound by the pktdev api you propose)
> >
> >2 device instances, 2 device models, each best suited for thier purpose.
> >
> >> The bonding model allows you to link some devices together and that is
> >> good in IMO. The bonding model is kind of like a pipeline model at the
> >> device layer. The pipeline style at the application layer could have
> >> provided the same feature as the bonding model just being done at two
> >> different layers in the system (application and device layer).
> >> 
> >Not sure what you're getting at here, but I'll keep reading...
> >
> >> Now the pipeline model is a bit more generic and may add some small
> >> overhead compared to the bonding model. If we require every device type
> >>to
> >> support a bonding style then we would need them all to follow some type
> >>of
> >> common structure, correct?
> >I'm not sure what you're driving at here, please clarify.  Are you trying
> >to
> >suggest that multiple pmds in the datapath require some pmd specific code
> >to
> >allow them to be stacked?  Yes, they currently do, which is less than
> >ideal, and
> >I would certainly support an API to enable ordering of pmds in a
> >datapath, but
> >thats outside the scope of this discussion I think, since we're talking
> >about
> >new device types here.
> >
> >Regarding your comment about multiple pmds requiring a common structure to
> >interface to one another in the datapath (like bonding currently does),
> >you're
> >absolutely right, thats an rte_ethdev.  Thats the way it works today, and
> >theres
> >no reason to change that.
> >
> >> >
> >> >> >
> >> >> >The idea being that now all devices in the dataplane are pktdev
> >>devices
> >> >> >and code
> >> >> >that transmits and receives frames only needs to know that a device
> >>can
> >> >> >transmit
> >> >> >and receive frames.  The crypto device in this chain is ostensibly
> >> >> >preforming
> >> >> >some sort of ipsec functionality so that arp frames are properly
> >> >> >encrypted and
> >> >> >encapsulated for sending via a tunnel.
> >> >> >
> >> >> >On the surface this seems reasonable, and in a sense it is.
> >>However,
> >> >>my
> >> >> >assertion is that we already have this functionality, and it is the
> >> >> >rte_ethdev
> >> >> >device.  To illustrate further, in my view  we can do the above
> >> >>already:
> >> >> >
> >> >> >+------------+  +---------+ +---------+  +---------+  +--------+
> >> >> >|            |  |         | |         |  |         |  |        |
> >> >> >|            |  |ethdev   | | ipsec   |  |ethdev   +--+        |
> >> >> >| ARP handler+->+api      +-+ tunnel  +->+api      |  | PMD
> >> >> >|            |  |         | | PMD     |  |         |  |        |
> >> >> >|            |  |         | |         |  |         |  |        |
> >> >> >+------------+  +---------+ +---+-----+  +---------+  +--------+
> >> >> >                                |
> >> >> >                             +--+-----+
> >> >> >                             |        |
> >> >> >                             |crypto  |
> >> >> >                             |api     |
> >> >> >                             |        |
> >> >> >                             |        |
> >> >> >                             +--------+
> >> >> >
> >> >> >Using the rte_ethdev we can already codify the ipsec functionailty
> >>as a
> >> >> >pmd that
> >> >> >registers an ethdev, and stack it with other pmds using methods
> >> >>simmilar
> >> >> >to what
> >> >> >the bonding pmd does (or via some other more generalized dataplane
> >> >> >indexing
> >> >> >function).  This still leaves us with the creation of the crypto
> >>api,
> >> >> >which is
> >> >> >adventageous because:
> >> >> 
> >> >> The proposal does not remove the bonding method and can still be
> >>used,
> >> >> correct?
> >> >Not sure I fully understand the question, but no, I'm not proposing any
> >> >specific
> >> >change to bonding here, just using it to illustrate that we have ways
> >> >currently
> >> >to stack PMD's on top of one another to create a multi-stage data
> >> >pipeline.
> >> 
> >> Look at the above comment for this one.
> >
> >Ditto.  I'm not making any comment about bonding, just using it as an
> >example
> >to show that its possible to stack ethdevs in a dataplane today.  The
> >method of
> >stacking can certainly be improved, but rte_ethdev is acting as a
> >perfectly good
> >interface for that purpose right now, no changes needed.
> >
> >> >
> >> >> I do not see the different here using the pktdev style routines or
> >>using
> >> >> ethdev routines.
> >> >Exactly my point!  There is no difference to the dataplane API, it
> >>remains
> >> >unchanged, just as in your proposal, except that there is no work
> >> >required to
> >> >create a pktdev device that doesn't change anything.  The advantage
> >>that
> >> >I'm
> >> >proposing here is that my model separates whatever crypto device class
> >> >API you want
> >> >to design from the dataplane (rte_ethdev) api.  Doing this lets you
> >> >create a
> >> >crypto API that is more appropriately suited to the crypto class of
> >> >device.  if
> >> >there is overlap with rte_ethdev, thats fine, you can create simmilar
> >> >functions
> >> >(even giving them identical names to the rte_ethdev apis), and that
> >>will
> >> >be
> >> >properly resolved at compile time.  But more importantly, where there
> >>is
> >> >not
> >> >overlap, you're not forced into creating an API that inherits the
> >>common
> >> >API
> >> >functions you propose, when they are not best suited to your device.
> >> 
> >> I am still not trying to suggest we have the cryptodev be a copy of the
> >> ethdev here. We need some type of structure and common guide lines for a
> >> new device, which pktdev is a very reasonable plus very light framework.
> >> 
> >
> >So, perhaps we're talking past one another here, because I'm not
> >suggesting that
> >cryptodev be a copy of ethdev either, and I thought that was crystal
> >clear in my
> >drawings above.  I'm suggesting that cryptodev be completely independent
> >of any
> >ethdev design (i.e. no overlap with ethdev as  codified in your pktdev
> >proposal).
> >
> >As for being lightweight, I don't see the relevance.  As we agreed above,
> >code
> >savings, either in binary our source format is going to be fairly
> >irrelevant,
> >and not the point of the discussion.  As for _needing_ commonality, I
> >disagree,
> >and cite other implementations in linux and bsd as evidence of that (the
> >device
> >models there, while reusing some datatypes, maintain independent method
> >pointer
> >structures).
> >
> >> I am a bit confused now it seems like you are suggesting a new device
> >> should be ethdev and then you seem to be stating it should be completely
> >> differentŠ
> >> 
> >I'm not, and if you would read my emails closely, I think you would see
> >that.
> >As I tried to state as clearly as I'm able above, I'm proposing two
> >devices: 
> >
> >1) A dataplane element model, the interface for which is an rte_ethdev.
> >This
> >allows one to implement multiple dataplane elements and functionality (in
> >the
> >example above an ipsec tunnel).  This is a common and well understood
> >device
> >model that many network stacks use (read: easy for end users to
> >understand)
> >
> >2) A crypto device model, implemented independently, reusing existing
> >data types
> >where possible (read: Appropriate for the device class and alternate use
> >cases)
> >
> >
> >
> >> Lets just agree some common parts are good between device types and some
> >> are just not. Making all devices look like an ethdev is just not going
> >>to
> >> fly and making a new device type into something completely different is
> >> not going to fly.
> >> 
> >No, lets not, because thats exactly what I'm arguing against.
> >Once again I'll try to re-iterate:
> >
> >I'm fine with reusing existing datatypes if they are applicable
> >
> >I would like to see a device model for cryptodev that implements its own
> >methods
> >that are condusive to the way that hardware works, and not bound to the
> >way
> >other device classes work
> >
> >I would like to see dataplane element functionality that implements other
> >network elements (like an ipsec tunnel), be implemented using the
> >rte_ethdev
> >interface, and have it implement its internals using the to-be-created
> >cryptodev
> >api.
> >
> >
> >> Then the only solution is to have somethings in common to help unify the
> >> designs, but still allow the developer to add APIs to that device which
> >> make the most sense.
> >> >
> >
> >Please tell me, why do you think that unify device models is the _only_
> >solution?  I've provided several examples that simply don't ever do that,
> >and
> >they are quite successfull
> >
> >> >> >
> >> >> >1) It is not constrained by the i/o model of the dataplane (it may
> >> >>include
> >> >> >packet based i/o, but can build on more rudimentary (and performant)
> >> >> >interfaces.
> >> >> >For instance, in addition to async block based i/o, a crypto device
> >>may
> >> >> >also
> >> >> >operate syncrhnously, meaning a call can be saved with each
> >>transaction
> >> >> >(2 calls
> >> >> >for a tx/rx vs one for an encrypt operation).
> >> >> >
> >> >> >2) It is not constrained by use case.  That is to say the API can be
> >> >> >constructed
> >> >> >for more natural use with other functions (for instance encryptions
> >>of
> >> >> >files on
> >> >> >disk or via a pipe to another process), which may not have any
> >>relation
> >> >> >to the
> >> >> >data plane of DPDK.
> >> >> >
> >> >> >Neil
> >> >> 
> >> >> Ok, you snipped the text of the email here an it makes the context
> >>wrong
> >> >I figured your illustration above was sufficient to make your point and
> >> >mine.
> >> >If your code tells a different story, I apologize.
> >> >
> >> >> without the rest of the code IMO. I will try to explain without the
> >>text
> >> >> that was omitted, but it would be best for anyone missing the
> >>original
> >> >> email to read it for more details. I know the formatting got messed
> >>up a
> >> >> bit :-(
> >> >> 
> >> >> http://dpdk.org/ml/archives/dev/2015-April/016124.html
> >> >> 
> >> >> 
> >> >> In the rest of the text it does show the points I wanted to make here
> >> >>and
> >> >> how little overhead it added.
> >> >> 
> >> >As above, I'm stipulating to the fact that there is not performance
> >> >overhead.
> >> >That is not, and was never my point.  My point was/is that we already
> >> >have an
> >> >API to do what you want to do.  The commonality of functions that you
> >> >seem to
> >> >focus on, is already provided by the fact that we have a common device
> >> >for the
> >> >dataplane (the rte_ethdev).  You're proposal doesn't buy us anything
> >> >beyond
> >> >that.  By your own assertion, your proposal:
> >> >
> >> >1) Its just as performant as an rte_ethdev
> >> >
> >> >2) It doesn't require any code change to applications
> >> >
> >> >Given that, and the fact that it does require that new device classes
> >> >inherit an
> >> 
> >> Please do not bring inheritance into this again I wish Bruce had never
> >> used that term :-( in hind site it is not a very good way to describe
> >>the
> >> design.
> >why not, its a perfectly accurate way to describe what you want to do
> >here.  You
> >want to extract elements from rte_ethdev that you feel are going to be
> >common to
> >other device types, place them in their own structure, and have other
> >device
> >types include that common structure, so as to be able to have those
> >methods be a
> >part of the new device.  In UML terms I suppose the relationship would be
> >a
> >"has-a" rather than a "is-a" but the inheritance description is equally
> >relevant
> >when writing in C.
> >
> >> It seems like it at a very high level, but it seems to drag
> >> everyone down to the bits and bytes way too much. We ever stated that
> >>all
> >> the APIs are common between ethdev (or are inherited) between a new
> >>device
> >> only a couple which seemed to make sense to me. The structures being a
> >>bit
> >> similar was my goal and to provide a good light framework.
> >>  
> >Again with the light framework.  We're not talking about code savings
> >anymore.
> >Theres nothing more or less light about just using an rte_ethdev to
> >implement a
> >nework element.  Separate out the network element from the actual crypto
> >functionality.
> >
> >As for not _having_ to use the common structure in all device types, I
> >agree,
> >you don't have to mandate its use, but doesn't that beg the question, why
> >do this
> >then?  If theres not any guaranteed functionality, why do this at all?
> >
> >
> >> >API set that may not be relevant to said device class, my one and only
> >> >question
> >> >is: whats the advantage?
> >> 
> >> I have been trying to layout the advantages:
> >>  1) - we provide a very light framework for the developer of a new
> >>device
> >> type to start with in his design
> >We agreed at the start of this thread that 'lightness' wasn't relevant.
> >Or do
> >you mean to describe something other than code size savings by the term 
> >'light'?
> >
> >>  2) - we look at making some routines common between devices if it make
> >> sense, some non-performance APIs do make sense to be common, but the
> >> structures need to be common in some way IMO
> >
> >I can't parse this sentance.  You're propsal at the start of this thread
> >was to create common structures between device classes.  How is "we look 
> >at
> >making some routines common between devices if it makes sense" an 
> >avantage to
> >your proposal, which is exactly that?  Its like saying changing code is 
> >good,
> >because we get to change code.
> >
> >>  3) - we make a few structures look similar using pktdev structures as 
> >>an
> >> overlay to allow for a clean way to pass around these different device
> >> types
> >
> >This I actually agree with, but it goes back to your first point.  I would
> >argue that passing around rte_ethdevs is just as, if not more useful than
> >passing around pktdevs, because it more fully describes an ethernet 
> >interface.
> >As you note, the pktdev interface is 'lighter' which I take as a 
> >euphamism for
> >'smaller or having less code'.  As we've discussed above however, the 
> >savings
> >are minimal and not paricularly relevant to this debate, so why not just 
> >use
> >rte_ethdev to pass around network element functionality and design a 
> >crypto
> >device interfae independently (I'll remind you here that I'm proposing two
> >devices here, an rte_ethdev and a cryptodev).
> >
> >>  4) - adopt standard APIs or translate those APIs into something similar
> >> for DPDK (e.g. Linux kernel crypto APIs)
> >> 
> >What standard API's have you seen that make common API methods between 
> >device
> >types?  Can you provide an example?
> >
> >> >You claim that common API naming makes application
> >> 
> >> I never claimed that common API naming make applications common, but 
> >>only
> >> a few common constructs or API semantics could help.
> >> 
> >What do you see as the difference?  How, specifically does creating common
> >device model mehtods help an application?  I was under the impression 
> >that, in
> >so doing, you could allow some generic application code to be written 
> >that could
> >interface to either a real ethernet device, or to a crypto device.  What 
> >other
> >advantage do you see?
> >
> >
> >> >code generic, but we have that today.  If you want to do IPsec with a
> >> >crypto
> >> >offload devie, write a pmd that interfaces to the data pipeline via the
> >> >rte_ethdev API and uses whatever crypto offload device API you want to
> >> >construct
> >> >to do the crypto work.  That way, if a given application wants to use 
> >>the
> >> >crypto
> >> 
> >> I am trying to suggest the crypto device can have other APIs better 
> >>suited
> >> to how we want to use that device, again we can have a few common items
> >> between ethdev and cryptodev, but it does not make sense to use all of
> >> ethdev.
> >> 
> >
> >Ok, so this is a good thought.  I think what I hear you saying above is 
> >that you
> >could envision a crypto device having two api sets, yes?  One, a raw API 
> >that
> >applications could use to talk to crypto devices directly, and two, a 
> >network
> >oriented API that can transmit and recieve network frames, right?
> >
> >If thats the case, I'm fine with that.  But if thats interesting to you, 
> >why
> >can't the network interface just be an rte_ethdev that you register like 
> >other
> >pmd's do?
> >
> >> I am getting confused here as you seem to be suggesting we use ethdev
> >> model, but I am sure you are not, correct?
> >> 
> >Correct, two device models, for two devices.  An ipsec device, 
> >implemented as a
> >pmd that uses an rte_ethdev to interface to the network dataplane, and a 
> >crypto
> >device model that the aforementioned pmd interfaces to to encrypt and 
> >decrypt
> >data to implement ipsec functionality.
> >
> >> >hardware for something non-network related (say disk/file encryption), 
> >>it
> >> >can
> >> >use the crypto device layer in a way that is not packet/network 
> >>oriented
> >> >more
> >> >easily, at which point having a common API is completely irrelevant, as
> >> >the
> >> >application code is being specifically written to talk to a crypto 
> >>device.
> >> >
> >> >
> >> >
> >> >
> >> >> Lets just say we do not move the TX/RX routines from rte_ethdev into
> >> >> rte_pktdev and only have a header file with a few structures and 
> >>macros
> >> >>to
> >> >> help define some common parts between each of the new device types 
> >>being
> >> >> added. I could see that as an option, but I do not see the big issues
> >> >>you
> >> >> are pointing out here.
> >> >> 
> >> >My big issue is that I see no point in creating a common API.
> >> 
> >> I am not trying to create a common API between all devices, it only 
> >>seemed
> >> like a reasonable design to make Rx/Tx look similar. Also making some of
> >> the structures look similar is a good thing IMO.
> >> 
> >Ok, you're not trying to make them look identical, I get that.  But you 
> >are
> >trying to get them to try to look identical in terms of how they do data 
> >i/o.
> >I'm suggesting/asserting that such an interface isn't appropriate to all
> >devices.  Even that little bit isn't worth making common, because it won't
> >always be used.
> >
> >> >You're focused on finding a common set of API's that apply to all 
> >>device
> >> >classes, and I'm asserting that there is no safe set of common API
> >> >routines.
> >> >While it may be possible to shoehorn multiple device classes into the
> >> >same i/o
> >> >API model, you are doing yourself a disservice by mandating that.
> >> 
> >> I feel like you are taking the extreme view here again and that was not 
> >>my
> >> intent to make everything the same only where it makes sense. A ethdev 
> >>and
> >> cryptodev are two different animals, but they do have some similarities
> >> and I want to make sure we investigate these similarities, right?
> >> 
> >Yes, they are two different animals, and while they may have some 
> >similarities,
> >I'm suggesting that its not worth codifying those similarities in code.  
> >crypto
> >devices don't send and recieve packets.  They might operate on block 
> >data, so
> >you might use mbufs in both apis, but you don't transmit data to a crypto 
> >dev
> >and you don't receive data from a crypto device.  You could do that I 
> >suppose,
> >but there are better ways, and existing api's show us the way to do that.
> >Remember from above, I'm suggesting two device models here.  A crypto 
> >device
> >that just does cryptography in the fastest/best way possible, and a 
> >optional
> >network element (an ipsec tunnel in our example), that uses the crypto 
> >device to
> >preform its function, but uses an rte_ethdev to transmit and receive 
> >network
> >data.
> >
> >
> >> >
> >> >Put another way, ipsec devices and crypto devices are separate device
> >> >classes
> >> >and deserve to be modeled differently.  An IPsec device is a network
> >> >element
> >> >that is very nicely modeled as an ethernet interface, for which you 
> >>have
> >> >a very
> >> >robust API (the rte_ethdev).  Crypto offload devices are not network
> >> >elements,
> >> >and can preform i/o in a mutltitude of ways (synchronous, asynchronous,
> >> >scatter/gather, etc), and have many non ethernet API facets.  The 
> >>latter
> >> >deserves to have its own API design, not to be forced to inherit part 
> >>of
> >> >an
> >> >ethernet device (rte_pktdev) and be forced to work that way.
> >> >
> >> >> You did have some great comments about how crypto is used and the 
> >>APIs
> >> >> from the Linux Kernel crypto model is proven and I do not disagree.
> >> >> 
> >> >> In my email to my own email I tried to point our we could add 
> >>something
> >> >Yeah, sorry about the broken thread, the in-reply-to field got somehow
> >> >html-mangled.  Not sure how that happened as I use mutt as my MUA.
> >> >
> >> >> very similar to Linux Kernel Crypto API and it would be a model most
> >> >>would
> >> >> be able to understand. Creating my own crypto API is not my goal, 
> >>but to
> >> >> use standards where it makes the most sense to DPDK.
> >> >> 
> >> >> The email link above is the email I suggested the Linux Kernel Crypto
> >> >>API
> >> >> would be reasonable.
> >> >> 
> >> >And thats good.  It doesn't need to be the linux kernel crypto API by 
> >>any
> >> >stretch, and I don't expect it to be.  My take away from review of the
> >> >linux
> >> >crypto api is that, while it uses some simmilar notions to parts of the
> >> >network
> >> >API (scatter gather i/o, optional async operation, completion queues), 
> >>it
> >> >doesn't create any sort of common API structure between it and the 
> >>network
> >> >subsystem, because they're separate devices.  Even though crypto is
> >> >largely used
> >> >in the network datapath for various ipsec tunnels, crypto still uses 
> >>its
> >> >own API
> >> >methods and data types because its not exclusive to the use of
> >> >networking, and
> >> >doing so allows for more flexible use outside the datapath.  Thats what
> >> >I'm
> >> >trying to get at here.  That users might want to use crypto outside of 
> >>the
> >> >network path, and they should have an API that is well suited to that
> >> >task.
> >> 
> >> We agree here, I only want some similarities between devices not an 
> >>exact
> >> common API. 
> >> 
> >> I am afraid I am not following you now. At one point it seems like all 
> >>of
> >> the devices must be completely different then in another place ethdev is
> >> best way to go as a common API. I think we are saying the same thing 
> >>here
> >> just from a different point of view and I have tried to understand your
> >> view point. I have even relaxed my design goals to meet you part way. It
> >> seems like we agree and disagree here, which does not make a lot of 
> >>sense
> >> to me at this point.
> >> 
> >
> >I'll repeat what I've said above several times now:
> >
> >I'd like to see a crypto device api that is developed independently.  It 
> >can use
> >common data types if its suitable to do so, but shouldn't be forced to use
> >simmilar methods from existing device types.  The API here should be 
> >optimized
> >to best reflect the behavior of the hardware
> >
> >If you would like to use the cryptodevice as part of a network dataplane
> >element, you should create a software pmd to register a rte_ethdev.  
> >Internally
> >that pmd should use the crypto dev api to do its encryption work.
> >
> >That way your 'common' device api in the dataplane is the rte_ethdev.  The
> >cryptodev api is then independent of, and optimized for, the cryptodev 
> >hardware.
> >
> >> Maybe we need to get into a room and talk F2F to make sure we are both
> >> being understood here. Maybe we can then make sure we are both being 
> >>heard
> >> and come to some type of solution for DPDK. Maybe the community meeting 
> >>is
> >> a good place, but I feel it is going to be too big and too short to
> >> discuss this correctly. If you are up to a conf call or I can travel to
> >> you if you feel a F2F is better.
> >> 
> >I think we're on opposite coasts, so I don't think a face to face is 
> >going to
> >happen just for this, and that excludes other interests from this 
> >discussion,
> >which I would really like to hear.
> >
> >I've re-read this thread, and I have a suggestion:
> >
> >Implement it.
> >
> >We're clearly approaching this from different angles.  You seem to place 
> >alot of
> >value on the benefits of commonality between device types, while I 
> >clearly dont.
> >I've got several examples in support of my view, but in fairness, the 
> >dpdk isn't
> >the linux or bsd kernel, so while I don't think so, its possible this
> >commonality provides some level of value.  Likewise, I may show that 
> >creating
> >this common structure doesn't provide any real added value.  
> >Additionally, we've
> >both focused on the commonality of devices in this thread (which was the
> >intent), but neither of us have spent any time actually designing the 
> >crypto
> >API, which I think will shed alot of light on the problem.
> >
> >I would suggest choosing a crypto device, maybe the ixp4 crypto engine, 
> >since it
> >has an existing open source implementation, and using it to flesh out 
> >crypto
> >API design.  If you implement it independently of the dpdk (i.e. as part 
> >of its
> >own library with its own API), then we take a step back and look at it, 
> >we can
> >identify the true api overlap and consider the advantages and limitation 
> >of
> >consolidation.
> >
> >Sound like an idea?
> 
> Instead of replying to all of your comments above (sorry) I will finish up 
> the PoC code and then we can poke at the patches.
> 
> I have created a PoC around the code fragments in these emails, but I did 
> not include them as patches. The Quick Assist code on 01.org is being used 
> for the PoC and someone else is doing that work, but I can look at the 
> non-packet API from a top level to be added to the PoC.
> 
> https://01.org/packet-processing/intel%C2%AE-quickassist-technology-drivers
> -and-patches
> 
That sounds like a reasonable approach.  When you have them ready, please reply
to the thread and I'll take a look at it. 
Best
Neil
> 
> Regards,
> ++Keith
> 
> 
    
    
More information about the dev
mailing list