[dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others

Singh, Jasvinder jasvinder.singh at intel.com
Sat Sep 23 00:07:50 CEST 2017

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas at monjalon.net]
> Sent: Wednesday, September 20, 2017 4:36 PM
> To: Singh, Jasvinder <jasvinder.singh at intel.com>; Dumitrescu, Cristian
> <cristian.dumitrescu at intel.com>
> Cc: dev at dpdk.org; Yigit, Ferruh <ferruh.yigit at intel.com>
> Subject: Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for
> traffic mgmt and others
> Hi,
> 18/09/2017 11:10, Jasvinder Singh:
> > The SoftNIC PMD is intended to provide SW fall-back options for
> > specific ethdev APIs in a generic way to the NICs not supporting those
> features.
> I agree it is important to have a solution in DPDK to better manage SW
> fallbacks. One question is to know whether we can implement and maintain
> many solutions. We probably must choose only one solution.
> I have not read the code. I am just interested in the design for now.
> I think it is a smart idea but maybe less convenient than calling fallback from
> ethdev API glue code. My opinion has not changed since v1.

IMHO, Calling fallback from ethdev API glue code suffers from scalability issue. Let's assume the scenario of having another
sw fallback implementation for TM or its specific feature is available. What will be the approach when we already have something glued in TM API? 
The softnic could be considered as a placeholder for adding and enabling more features at any granularity in addition
to having complete TM feature.  

> Thanks for the detailed explanations. Let's discuss below.
> [...]
> > * RX/TX: The app reads packets from/writes packets to the "soft" port
> >   instead of the "hard" port. The RX and TX queues of the "soft" port are
> >   thread safe, as any ethdev.
> "thread safe as any ethdev"?
> I would say the ethdev queues are not thread safe.

[Jasvinder] Agree.

> [...]
> > * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> >   implementation (different vendors/models, HW-SW mix), the app should
> not
> >   require changes to use different NICs, the app should use the same API
> >   for all NICs. If a NIC does not implement a specific feature, the HW
> >   should be augmented with SW to meet the functionality while still
> >   preserving the same API.
> This goal could also be achieved by adding the SW capability to the API.
> After getting capabilities of a hardware, the app could set the capability of
> some driver features to "SW fallback".
> So the capability would become a tristate:
> 	- not supported
> 	- HW supported
> 	- SW supported
> The unique API goal is failed if we must manage two ports, the HW port for
> some features and the softnic port for other features.
> You explain it in A5 below.

[Jasvinder]  TM API is agnostic to underlying implementation and allows applications to implement
solution on SW, HW or hybrid of HW and SW at any granularity, and on any number of devices depending
upon the availability of features. No Restriction. Thus, managing and configuring number of devices (physical and virtual) using high
level api is at the disposal of application level framework. When softnic device is enabled, application sends and receives packet
from soft device instead of hard device as soft device implements the features missing in hard device. It deosn't mean that softnic
device should hide the hard device. However, it doesn't restrict application to communicate directly with hard device. If desired, application
can bypass the softnic device and send tx packet straight to hard device through the queues not used by soft device.

> [...]
> > Example: Create "soft" port for "hard" port "0000:04:00.1", enable the
> > TM feature with default settings:
> >           --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
> So the app will use only the vdev net_softnic0 which will forward packets to
> 0000:04:00.1?
> Can we say in this example that net_softnic0 owns 0000:04:00.1?
> Probably not, because the config of the HW must be done separately (cf. Q5).
> See my "ownership proposal":
> 	http://dpdk.org/ml/archives/dev/2017-September/074656.html
> The issue I see in this example is that we must define how to enable every
> features. It should be equivalent to defining the ethdev capabilities.
> In this example, the option soft_tm=on is probably not enough fine-grain.
> We could support some parts of TM API in HW and other parts in SW.
[Jasvinder] - This is one instance where the complete hierarchical scheduler is presented as software fallback. But the
approach doesn't restrict to add more features (at any granularity) to softnic and enable them altogether
naming arguments during device creation.

> [...]
> > Q3: Why not change the "hard" device (and keep a single device) instead of
> >     creating a new "soft" device (and thus having two devices)?
> > A3: This is not possible with the current librte_ether ethdev
> >     implementation. The ethdev->dev_ops are defined as constant structure,
> >     so it cannot be changed per device (nor per PMD). The new ops also
> >     need memory space to store their context data structures, which
> >     requires updating the ethdev->data->dev_private of the existing
> >     device; at best, maybe a resize of ethdev->data->dev_private could be
> >     done, assuming that librte_ether will introduce a way to find out its
> >     size, but this cannot be done while device is running. Other side
> >     effects might exist, as the changes are very intrusive, plus it likely
> >     needs more changes in librte_ether.
> Q3 is about calling SW fallback from the driver code, right?
> We must not implement fallbacks in drivers because it would hide it to the
> application.
> If a feature is not available in hardware, the application can choose to bypass
> this feature or integrate the fallback in its own workflow.

[Jasvinder]: Naturally, if hard device has the TM feature, then TM specific ops which are invoked API function are implemented by its pmd.
Similar approach has been followed in sw fallback solution as softnic port that complements the hard device.
> > Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> >     devices which do not support the specific feature? If the device
> >     supports the capability, let's call its dev_ops, otherwise call the
> >     SW fall-back dev_ops.
> > A4: First, similar reasons to Q&A3. This fixes the need to change
> >     ethdev->dev_ops of the device, but it does not do anything to fix the
> >     other significant issue of where to store the context data structures
> >     needed by the SW fall-back functions (which, in this approach, are
> >     called implicitly by librte_ether).
> >     Second, the SW fall-back options should not be restricted arbitrarily
> >     by the librte_ether library, the decision should belong to the app.
> >     For example, the TM SW fall-back should not be limited to only
> >     librte_sched, which (like any SW fall-back) is limited to a specific
> >     hierarchy and feature set, it cannot do any possible hierarchy. If
> >     alternatives exist, the one to use should be picked by the app, not by
> >     the ethdev layer.
> Q4 is about calling SW callback from the API glue code, right?
> We could summarize Q3/Q4 as "it could be done but we propose another
> way".
> I think we must consider the pros and cons of both approaches from a user
> perspective.
> I agree the application must decide which fallback to use.
> We could propose one fallback in ethdev which can be enabled explicitly (see
> my tristate capabilities proposal above).

[Jasvinder] As explained above as well, the approach of sticking sw solution with API will
create issue of scalability. How two different SW solution will coexist or for instance N software solutions.
That's why the softnic (virtual device) is proposed as an alternative which could be extended
to include and enable features.
> > Q5: Why is the app required to continue to configure both the "hard" and
> >     the "soft" devices even after the "soft" device has been created? Why
> >     not hiding the "hard" device under the "soft" device and have the
> >     "soft" device configure the "hard" device under the hood?
> > A5: This was the approach tried in the V2 of this patch set (overlay
> >     "soft" device taking over the configuration of the underlay "hard"
> >     device) and eventually dropped due to increased complexity of having
> >     to keep the configuration of two distinct devices in sync with
> >     librte_ether implementation that is not friendly towards such
> >     approach. Basically, each ethdev API call for the overlay device
> >     needs to configure the overlay device, invoke the same configuration
> >     with possibly modified parameters for the underlay device, then resume
> >     the configuration of overlay device, turning this into a device
> >     emulation project.
> >     V2 minuses: increased complexity (deal with two devices at same time);
> >     need to implement every ethdev API, even those not needed for the
> scope
> >     of SW fall-back; intrusive; sometimes have to silently take decisions
> >     that should be left to the app.
> >     V3 pluses: lower complexity (only one device); only need to implement
> >     those APIs that are in scope of the SW fall-back; non-intrusive (deal
> >     with "hard" device through ethdev API); app decisions taken by the app
> >     in an explicit way.
> I think it is breaking what you call the NFV vision in several places.

[Jasvinder] Mentioning nfv vision is about hiding heterogeneous implementation
(HW,SW, HW-SW hybrid) under the abstraction layer provided by TM API
instead of restricting app to use API for specific port.
> [...]
> >     9. [rte_ring proliferation] Thread safety requirements for ethdev
> >        RX/TXqueues require an rte_ring to be used for every RX/TX queue
> >        of each "soft" ethdev. This rte_ring proliferation unnecessarily
> >        increases the memory footprint and lowers performance, especially
> >        when each "soft" ethdev ends up on a different CPU core (ping-pong
> >        of cache lines).
> I am curious to understand why you consider thread safety as a requirement
> for queues. No need to reply here, the question is already asked at the
> beginning of this email ;)

More information about the dev mailing list