[dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation

Ananyev, Konstantin konstantin.ananyev at intel.com
Tue Apr 21 10:40:05 CEST 2015


> -----Original Message-----
> From: Richardson, Bruce
> Sent: Monday, April 20, 2015 4:03 PM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org; Wiles, Keith
> Subject: Re: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation
> 
> On Mon, Apr 20, 2015 at 12:26:43PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Friday, April 17, 2015 4:17 PM
> > > To: dev at dpdk.org; Wiles, Keith
> > > Subject: [dpdk-dev] [RFC PATCH 1/4] Add example pktdev implementation
> > >
> > > This commit demonstrates what a minimal API for all packet handling
> > > types would look like. It simply provides the necessary parts for
> > > receiving and transmiting packets, and is based off the ethdev
> > > implementation.
> > > ---
> > >  config/common_bsdapp           |   5 ++
> > >  config/common_linuxapp         |   5 ++
> > >  lib/Makefile                   |   1 +
> > >  lib/librte_pktdev/Makefile     |  56 ++++++++++++++++
> > >  lib/librte_pktdev/rte_pktdev.c |  35 ++++++++++
> > >  lib/librte_pktdev/rte_pktdev.h | 144 +++++++++++++++++++++++++++++++++++++++++
> > >  6 files changed, 246 insertions(+)
> > >  create mode 100644 lib/librte_pktdev/Makefile
> > >  create mode 100644 lib/librte_pktdev/rte_pktdev.c
> > >  create mode 100644 lib/librte_pktdev/rte_pktdev.h
> > >
> > > diff --git a/config/common_bsdapp b/config/common_bsdapp
> > > index 8ff4dc2..d2b932c 100644
> > > --- a/config/common_bsdapp
> > > +++ b/config/common_bsdapp
> > > @@ -132,6 +132,11 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
> > >  CONFIG_RTE_LIBRTE_KVARGS=y
> > >
> > >  #
> > > +# Compile generic packet handling device library
> > > +#
> > > +CONFIG_RTE_LIBRTE_PKTDEV=y
> > > +
> > > +#
> > >  # Compile generic ethernet library
> > >  #
> > >  CONFIG_RTE_LIBRTE_ETHER=y
> > > diff --git a/config/common_linuxapp b/config/common_linuxapp
> > > index 09a58ac..5bda416 100644
> > > --- a/config/common_linuxapp
> > > +++ b/config/common_linuxapp
> > > @@ -129,6 +129,11 @@ CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
> > >  CONFIG_RTE_LIBRTE_KVARGS=y
> > >
> > >  #
> > > +# Compile generic packet handling device library
> > > +#
> > > +CONFIG_RTE_LIBRTE_PKTDEV=y
> > > +
> > > +#
> > >  # Compile generic ethernet library
> > >  #
> > >  CONFIG_RTE_LIBRTE_ETHER=y
> > > diff --git a/lib/Makefile b/lib/Makefile
> > > index d94355d..4db5ee0 100644
> > > --- a/lib/Makefile
> > > +++ b/lib/Makefile
> > > @@ -32,6 +32,7 @@
> > >  include $(RTE_SDK)/mk/rte.vars.mk
> > >
> > >  DIRS-y += librte_compat
> > > +DIRS-$(CONFIG_RTE_LIBRTE_PKTDEV) += librte_pktdev
> > >  DIRS-$(CONFIG_RTE_LIBRTE_EAL) += librte_eal
> > >  DIRS-$(CONFIG_RTE_LIBRTE_MALLOC) += librte_malloc
> > >  DIRS-$(CONFIG_RTE_LIBRTE_RING) += librte_ring
> > > diff --git a/lib/librte_pktdev/Makefile b/lib/librte_pktdev/Makefile
> > > new file mode 100644
> > > index 0000000..2d3b3a1
> > > --- /dev/null
> > > +++ b/lib/librte_pktdev/Makefile
> > > @@ -0,0 +1,56 @@
> > > +#   BSD LICENSE
> > > +#
> > > +#   Copyright(c) 2015 Intel Corporation. All rights reserved.
> > > +#   All rights reserved.
> > > +#
> > > +#   Redistribution and use in source and binary forms, with or without
> > > +#   modification, are permitted provided that the following conditions
> > > +#   are met:
> > > +#
> > > +#     * Redistributions of source code must retain the above copyright
> > > +#       notice, this list of conditions and the following disclaimer.
> > > +#     * Redistributions in binary form must reproduce the above copyright
> > > +#       notice, this list of conditions and the following disclaimer in
> > > +#       the documentation and/or other materials provided with the
> > > +#       distribution.
> > > +#     * Neither the name of Intel Corporation nor the names of its
> > > +#       contributors may be used to endorse or promote products derived
> > > +#       from this software without specific prior written permission.
> > > +#
> > > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > +
> > > +include $(RTE_SDK)/mk/rte.vars.mk
> > > +
> > > +#
> > > +# library name
> > > +#
> > > +LIB = libpktdev.a
> > > +
> > > +CFLAGS += -O3
> > > +CFLAGS += $(WERROR_FLAGS)
> > > +
> > > +EXPORT_MAP := rte_pktdev_version.map
> > > +
> > > +LIBABIVER := 1
> > > +
> > > +SRCS-y += rte_pktdev.c
> > > +
> > > +#
> > > +# Export include files
> > > +#
> > > +SYMLINK-y-include += rte_pktdev.h
> > > +
> > > +# this lib depends upon no others:
> > > +DEPDIRS-y +=
> > > +
> > > +include $(RTE_SDK)/mk/rte.lib.mk
> > > diff --git a/lib/librte_pktdev/rte_pktdev.c b/lib/librte_pktdev/rte_pktdev.c
> > > new file mode 100644
> > > index 0000000..4c32d86
> > > --- /dev/null
> > > +++ b/lib/librte_pktdev/rte_pktdev.c
> > > @@ -0,0 +1,36 @@
> > > +/*-
> > > + *   BSD LICENSE
> > > + *
> > > + *   Copyright(c) 2015 Intel Corporation. All rights reserved.
> > > + *   All rights reserved.
> > > + *
> > > + *   Redistribution and use in source and binary forms, with or without
> > > + *   modification, are permitted provided that the following conditions
> > > + *   are met:
> > > + *
> > > + *     * Redistributions of source code must retain the above copyright
> > > + *       notice, this list of conditions and the following disclaimer.
> > > + *     * Redistributions in binary form must reproduce the above copyright
> > > + *       notice, this list of conditions and the following disclaimer in
> > > + *       the documentation and/or other materials provided with the
> > > + *       distribution.
> > > + *     * Neither the name of Intel Corporation nor the names of its
> > > + *       contributors may be used to endorse or promote products derived
> > > + *       from this software without specific prior written permission.
> > > + *
> > > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > + */
> > > +
> > > +#include "rte_pktdev.h"
> > > +
> > > +/* For future use */
> > > diff --git a/lib/librte_pktdev/rte_pktdev.h b/lib/librte_pktdev/rte_pktdev.h
> > > new file mode 100644
> > > index 0000000..8a5699a
> > > --- /dev/null
> > > +++ b/lib/librte_pktdev/rte_pktdev.h
> > > @@ -0,0 +1,144 @@
> > > +/*-
> > > + *   BSD LICENSE
> > > + *
> > > + *   Copyright(c) 2015 Intel Corporation. All rights reserved.
> > > + *   All rights reserved.
> > > + *
> > > + *   Redistribution and use in source and binary forms, with or without
> > > + *   modification, are permitted provided that the following conditions
> > > + *   are met:
> > > + *
> > > + *     * Redistributions of source code must retain the above copyright
> > > + *       notice, this list of conditions and the following disclaimer.
> > > + *     * Redistributions in binary form must reproduce the above copyright
> > > + *       notice, this list of conditions and the following disclaimer in
> > > + *       the documentation and/or other materials provided with the
> > > + *       distribution.
> > > + *     * Neither the name of Intel Corporation nor the names of its
> > > + *       contributors may be used to endorse or promote products derived
> > > + *       from this software without specific prior written permission.
> > > + *
> > > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > > + */
> > > +
> > > +#ifndef _RTE_PKTDEV_H_
> > > +#define _RTE_PKTDEV_H_
> > > +
> > > +#include <stdint.h>
> > > +
> > > +/**
> > > + * @file
> > > + *
> > > + * RTE Packet Processing Device API
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +/* forward definition of mbuf structure. We don't need full mbuf header here */
> > > +struct rte_mbuf;
> > > +
> > > +#define RTE_PKT_NAME_MAX_LEN (32)
> > > +
> > > +typedef uint16_t (*pkt_rx_burst_t)(void *rxq,
> > > +				   struct rte_mbuf **rx_pkts,
> > > +				   uint16_t nb_pkts);
> > > +/**< @internal Retrieve packets from a queue of a device. */
> > > +
> > > +typedef uint16_t (*pkt_tx_burst_t)(void *txq,
> > > +				   struct rte_mbuf **tx_pkts,
> > > +				   uint16_t nb_pkts);
> > > +/**< @internal Send packets on a queue of a device. */
> > > +
> > > +#define RTE_PKT_DEV_HDR(structname) struct { \
> > > +	pkt_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */ \
> > > +	pkt_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */ \
> > > +	struct structname ## _data *data;  /**< Pointer to device data */ \
> > > +}
> > > +
> > > +#define RTE_PKT_DEV_DATA_HDR struct { \
> > > +	char name[RTE_PKT_NAME_MAX_LEN]; /**< Unique identifier name */ \
> > > +\
> > > +	void **rx_queues; /**< Array of pointers to RX queues. */ \
> > > +	void **tx_queues; /**< Array of pointers to TX queues. */ \
> > > +	uint16_t nb_rx_queues; /**< Number of RX queues. */ \
> > > +	uint16_t nb_tx_queues; /**< Number of TX queues. */ \
> > > +}
> > > +
> > > +struct rte_pkt_dev {
> > > +	RTE_PKT_DEV_HDR(rte_pkt_dev);
> > > +};
> > > +
> > > +struct rte_pkt_dev_data {
> > > +	RTE_PKT_DEV_DATA_HDR;
> > > +};
> > > +
> > > +/**
> > > + *
> > > + * Retrieve a burst of input packets from a receive queue of a
> > > + * device. The retrieved packets are stored in *rte_mbuf* structures whose
> > > + * pointers are supplied in the *rx_pkts* array.
> > > + *
> > > + * @param dev
> > > + *   The device to be polled for packets
> > > + * @param queue_id
> > > + *   The index of the receive queue from which to retrieve input packets.
> > > + * @param rx_pkts
> > > + *   The address of an array of pointers to *rte_mbuf* structures that
> > > + *   must be large enough to store *nb_pkts* pointers in it.
> > > + * @param nb_pkts
> > > + *   The maximum number of packets to retrieve.
> > > + * @return
> > > + *   The number of packets actually retrieved, which is the number
> > > + *   of pointers to *rte_mbuf* structures effectively supplied to the
> > > + *   *rx_pkts* array.
> > > + */
> > > +static inline uint16_t
> > > +rte_pkt_rx_burst(struct rte_pkt_dev *dev, uint16_t queue_id,
> > > +		 struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> > > +{
> > > +	return (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
> > > +			rx_pkts, nb_pkts);
> > > +}
> > > +
> > > +/**
> > > + * Send a burst of output packets on a transmit queue of a device.
> > > + *
> > > + * @param dev
> > > + *   The device to be given the packets.
> > > + * @param queue_id
> > > + *   The index of the queue through which output packets must be sent.
> > > + * @param tx_pkts
> > > + *   The address of an array of *nb_pkts* pointers to *rte_mbuf* structures
> > > + *   which contain the output packets.
> > > + * @param nb_pkts
> > > + *   The maximum number of packets to transmit.
> > > + * @return
> > > + *   The number of output packets actually stored in transmit descriptors of
> > > + *   the transmit ring. The return value can be less than the value of the
> > > + *   *tx_pkts* parameter when the transmit ring is full or has been filled up.
> > > + */
> > > +static inline uint16_t
> > > +rte_pkt_tx_burst(struct rte_pkt_dev *dev, uint16_t queue_id,
> > > +		 struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> > > +{
> > > +	return (*dev->tx_pkt_burst)(dev->data->tx_queues[queue_id], tx_pkts, nb_pkts);
> > > +}
> >
> > That one looks much more lightweight, then Keith one :)
> > I have a question here:
> > Why are you guys so confident, that all foreseeable devices would fit into current eth_dev rx_burst/tx_burst API?
> > As I understand, QAT devices have HW request/response ring pairs, so the idea probably is to make tx_burst()
> > populate an request ring and rx_burst() to read from response ring, right?
> > Though right now rte_mbuf contains a lot of flags and data fields that are specific for ethdev, and probably have no sense for crypto dev.
> > From other side, for QAT devices, I suppose you'll need crypto specific flags and data:
> > encrypt/decrypt, source and destination buffer addresses, some cipher specific data, etc.
> > Wonder do you plan to fit all that into current rte_mbuf structure, or do you plan to have some superset structure,
> > that would consist of rte_mbuf plus some extra stuff, or ... ?
> > Konstantin
> >
> >
> 
> From my point of view, the purpose of a pktdev API to to define a common API
> for devices that read/write (i.e. RX/TX) mbufs. If other devices don't fit that
> model, then they don't use it. However, we already have a number of different
> device types that already read/write mbufs, and there are likely more in future,
> so it would be nice to have a common API that can be used across those types so
> the data path can be source-neutral when processing mbufs. So far, we have
> three proposals on how that might be achieved. :-)

Ok, probably I misunderstood previous RFCs about pktdev.
So pktdev is not going to be a generic abstraction for any foreseeable device types: eth, crypto, compress, dpi, etc?
It would represent only subset of dev types: real eth devices, plus SW emulated ones 
that can mimic rx_burst/tx_burst, but don't support any other eth specific ops (kni, pmd_ring)?
Basically what we support right now?
And for future device types to support (crypto, etc) there would be something else? 

> 
> As for the crypto, there is still a lot of investigation work to be done here,
> and Keith is probably better placed than me to comment on it. But if you look at any
> application using DPDK for packet processing, the mbuf is still going to be at
> the heart of it, so whatever the actual crypto API is like, it's going to have
> to have to fit into an "mbuf world" anyway.
> If the crypto-dev APIs don't work
> with mbufs, then it's pushing work onto the app (or higher level libs) to extract
> the meaningful fields from the mbuf to pass to the dev.

As I understand application (or some upper layer lib) would have to do that work anyway: 
RX packet, find and read related to this packet crypto data, and pass all that info to the crypto dev, right?
I can hardly imagine, that in reality you always would be able to do:

n = rx_burst(eth_dev, pkts, n);
n = tx_burst(crypto_dev, pkts, n);

without any processing of incoming packets in between.
The difference is only how that information will be passed into crypto device:
Would you need to to update an mbuf, or fill some other structure.  

> Therefore, in a DPDK
> library, I would think it better to have the dev work with mbufs directly, as
> the rest of DPDK does.

I wouldn't say that all DPDK libs accepts data as mbufs only.
Only those where it makes sense: PMDs, ip_frag/reassembly, pipeline fw.
Let say rte_hash and rte_acl accept input data in a neutral way.
You wouldn't expect rte_hash_lookup() or rte_acl_classify() to accept pointer(s) to the mbuf.
if you'll have a SW implemented encrypt()/decrypt() routines, you probably wouldn't require an mbuf as input parameter either.
Wouldn't it be more plausible to create a new struct rte_crypto_buf (or something) to pass information to/from crypto devices.
Then we can provide for it an ability to attach/detach to rte_mbuf(s) buf_addr(s).
That way we can probably make crypto_buf to attach to 2 mbufs at once (source and destination).

Konstantin

> For the extra data, my suggestion would be to have the
> mbuf store a pointer to a crypto context, rather than trying to store it directly
> in the mbuf itself - there's no way we'd fit a crypto key in there! - and
> the context would be shared among a set of packets rather than per-mbuf.
> 
> /Bruce



More information about the dev mailing list