[dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others

Dumitrescu, Cristian cristian.dumitrescu at intel.com
Fri Sep 8 19:08:19 CEST 2017

> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Friday, August 11, 2017 1:49 PM
> To: dev at dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; Yigit, Ferruh
> <ferruh.yigit at intel.com>; thomas at monjalon.net
> Subject: [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and
> others
> The SoftNIC PMD is intended to provide SW fall-back options for specific
> ethdev APIs in a generic way to the NICs not supporting those features.
> Currently, the only implemented ethdev API is Traffic Management (TM),
> but other ethdev APIs such as rte_flow, traffic metering & policing, etc
> can be easily implemented.
> Overview:
> * Generic: The SoftNIC PMD works with any "hard" PMD that implements
> the
>   ethdev API. It does not change the "hard" PMD in any way.
> * Creation: For any given "hard" ethdev port, the user can decide to
>   create an associated "soft" ethdev port to drive the "hard" port. The
>   "soft" port is a virtual device that can be created at app start-up
>   through EAL vdev arg or later through the virtual device API.
> * Configuration: The app explicitly decides which features are to be
>   enabled on the "soft" port and which features are still to be used from
>   the "hard" port. The app continues to explicitly configure both the
>   "hard" and the "soft" ports after the creation of the "soft" port.
> * RX/TX: The app reads packets from/writes packets to the "soft" port
>   instead of the "hard" port. The RX and TX queues of the "soft" port are
>   thread safe, as any ethdev.
> * Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
>   so the run function of the "soft" port has to be executed by the CPU in
>   order to get packets moving between "hard" port and the app.
> * Meets the NFV vision: The app should be (almost) agnostic about the NIC
>   implementation (different vendors/models, HW-SW mix), the app should
> not
>   require changes to use different NICs, the app should use the same API
>   for all NICs. If a NIC does not implement a specific feature, the HW
>   should be augmented with SW to meet the functionality while still
>   preserving the same API.
> Traffic Management SW fall-back overview:
> * Implements the ethdev traffic management API (rte_tm.h).
> * Based on the existing librte_sched DPDK library.
> Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
> feature with default settings:
>           --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
> Q1: Why generic name, if only TM is supported (for now)?
> A1: The intention is to have SoftNIC PMD implement many other (all?)
>     ethdev APIs under a single "ideal" ethdev, hence the generic name.
>     The initial motivation is TM API, but the mechanism is generic and can
>     be used for many other ethdev APIs. Somebody looking to provide SW
>     fall-back for other ethdev API is likely to end up inventing the same,
>     hence it would be good to consolidate all under a single PMD and have
>     the user explicitly enable/disable the features it needs for each
>     "soft" device.
> Q2: Are there any performance requirements for SoftNIC?
> A2: Yes, performance should be great/decent for every feature, otherwise
>     the SW fall-back is unusable, thus useless.
> Q3: Why not change the "hard" device (and keep a single device) instead of
>     creating a new "soft" device (and thus having two devices)?
> A3: This is not possible with the current librte_ether ethdev
>     implementation. The ethdev->dev_ops are defined as constant structure,
>     so it cannot be changed per device (nor per PMD). The new ops also
>     need memory space to store their context data structures, which
>     requires updating the ethdev->data->dev_private of the existing
>     device; at best, maybe a resize of ethdev->data->dev_private could be
>     done, assuming that librte_ether will introduce a way to find out its
>     size, but this cannot be done while device is running. Other side
>     effects might exist, as the changes are very intrusive, plus it likely
>     needs more changes in librte_ether.
> Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
>     devices which do not support the specific feature? If the device
>     supports the capability, let's call its dev_ops, otherwise call the
>     SW fall-back dev_ops.
> A4: First, similar reasons to Q&A3. This fixes the need to change
>     ethdev->dev_ops of the device, but it does not do anything to fix the
>     other significant issue of where to store the context data structures
>     needed by the SW fall-back functions (which, in this approach, are
>     called implicitly by librte_ether).
>     Second, the SW fall-back options should not be restricted arbitrarily
>     by the librte_ether library, the decision should belong to the app.
>     For example, the TM SW fall-back should not be limited to only
>     librte_sched, which (like any SW fall-back) is limited to a specific
>     hierarchy and feature set, it cannot do any possible hierarchy. If
>     alternatives exist, the one to use should be picked by the app, not by
>     the ethdev layer.
> Q5: Why is the app required to continue to configure both the "hard" and
>     the "soft" devices even after the "soft" device has been created? Why
>     not hiding the "hard" device under the "soft" device and have the
>     "soft" device configure the "hard" device under the hood?
> A5: This was the approach tried in the V2 of this patch set (overlay
>     "soft" device taking over the configuration of the underlay "hard"
>     device) and eventually dropped due to increased complexity of having
>     to keep the configuration of two distinct devices in sync with
>     librte_ether implementation that is not friendly towards such
>     approach. Basically, each ethdev API call for the overlay device
>     needs to configure the overlay device, invoke the same configuration
>     with possibly modified parameters for the underlay device, then resume
>     the configuration of overlay device, turning this into a device
>     emulation project.
>     V2 minuses: increased complexity (deal with two devices at same time);
>     need to implement every ethdev API, even those not needed for the
> scope
>     of SW fall-back; intrusive; sometimes have to silently take decisions
>     that should be left to the app.
>     V3 pluses: lower complexity (only one device); only need to implement
>     those APIs that are in scope of the SW fall-back; non-intrusive (deal
>     with "hard" device through ethdev API); app decisions taken by the app
>     in an explicit way.
> Q6: Why expose the SW fall-back in a PMD and not in a SW library?
> A6: The SW fall-back for an ethdev API has to implement that specific
>     ethdev API, (hence expose an ethdev object through a PMD), as opposed
>     to providing a different API. This approach allows the app to use the
>     same API (NFV vision). For example, we already have a library for TM
>     SW fall-back (librte_sched) that can be called directly by the apps
>     that need to call it outside of ethdev context (use-cases exist), but
>     an app that works with TM-aware NICs through the ethdev TM API would
>     have to be changed significantly in order to work with different
>     TM-agnostic NICs through the librte_sched API.
> Q7: Why have all the SW fall-backs in a single PMD? Why not develop
>     the SW fall-back for each different ethdev API in a separate PMD, then
>     create a chain of "soft" devices for each "hard" device? Potentially,
>     this results in smaller size PMDs that are easier to maintain.
> A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
>     1. All the existing PMDs for HW NICs implement a lot of features under
>        the same PMD, so there is no reason for single PMD approach to break
>        code modularity. See the V3 code, a lot of care has been taken for
>        code modularity.
>     2. We should avoid the proliferation of SW PMDs.
>     3. A single device should be handled by a single PMD.
>     4. People are used with feature-rich PMDs, not with single-feature
>        PMDs, so we change of mindset?
>     5. [Configuration nightmare] A chain of "soft" devices attached to
>        single "hard" device requires the app to be aware that the N "soft"
>        devices in the chain plus the "hard" device refer to the same HW
>        device, and which device should be invoked to configure which
>        feature. Also the length of the chain and functionality of each
>        link is different for each HW device. This breaks the requirement
>        of preserving the same API while working with different NICs (NFV).
>        This most likely results in a configuration nightmare, nobody is
>        going to seriously use this.
>     6. [Feature inter-dependecy] Sometimes different features need to be
>        configured and executed together (e.g. share the same set of
>        resources, are inter-dependent, etc), so it is better and more
>        performant to do them in the same ethdev/PMD.
>     7. [Code duplication] There is a lot of duplication in the
>        configuration code for the chain of ethdevs approach. The ethdev
>        dev_configure, rx_queue_setup, tx_queue_setup API functions have to
>        be implemented per device, and they become meaningless/inconsistent
>        with the chain approach.
>     8. [Data structure duplication] The per device data structures have to
>        be duplicated and read repeatedly for each "soft" ethdev. The
>        ethdev device, dev_private, data, per RX/TX queue data structures
>        have to be replicated per "soft" device. They have to be re-read for
>        each stage, so the same cache misses are now multiplied with the
>        number of stages in the chain.
>     9. [rte_ring proliferation] Thread safety requirements for ethdev
>        RX/TXqueues require an rte_ring to be used for every RX/TX queue
>        of each "soft" ethdev. This rte_ring proliferation unnecessarily
>        increases the memory footprint and lowers performance, especially
>        when each "soft" ethdev ends up on a different CPU core (ping-pong
>        of cache lines).
>     10.[Meta-data proliferation] A chain of ethdevs is likely to result
>        in proliferation of meta-data that has to be passed between the
>        ethdevs (e.g. policing needs the output of flow classification),
>        which results in more cache line ping-pong between cores, hence
>        performance drops.
> Cristian Dumitrescu (4):
> Jasvinder Singh (4):
>   net/softnic: add softnic PMD
>   net/softnic: add traffic management support
>   net/softnic: add TM capabilities ops
>   net/softnic: add TM hierarchy related ops
>  MAINTAINERS                                     |    5 +
>  config/common_base                              |    5 +
>  drivers/net/Makefile                            |    5 +
>  drivers/net/softnic/Makefile                    |   57 +
>  drivers/net/softnic/rte_eth_softnic.c           |  867 ++++++
>  drivers/net/softnic/rte_eth_softnic.h           |   70 +
>  drivers/net/softnic/rte_eth_softnic_internals.h |  291 ++
>  drivers/net/softnic/rte_eth_softnic_tm.c        | 3446
> +++++++++++++++++++++++
>  drivers/net/softnic/rte_eth_softnic_version.map |    7 +
>  mk/rte.app.mk                                   |    5 +-
>  10 files changed, 4757 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/softnic/Makefile
>  create mode 100644 drivers/net/softnic/rte_eth_softnic.c
>  create mode 100644 drivers/net/softnic/rte_eth_softnic.h
>  create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
>  create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
>  create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
> --
> 2.9.3

Ping Thomas and Stephen.

You guys previously had some comments looking to suggest the approach of augmenting the "hard" device as opposed to creating "soft" device for the "hard" device. We tried to address these comments in the Q& A list below (see Q&As 3 and 4).

Have your comments been addressed or do we need to talk more on them?


More information about the dev mailing list