[dpdk-dev] [RFC] Accelerator API to chain packet processing functions

Stephen Hemminger stephen at networkplumber.org
Fri Feb 7 21:34:00 CET 2020

On Fri, 7 Feb 2020 19:48:17 +0530
Jerin Jacob <jerinjacobk at gmail.com> wrote:

> On Fri, Feb 7, 2020 at 6:08 PM Coyle, David <david.coyle at intel.com> wrote:
> >
> > Hi Jerin, see below  
> 
> Hi David,
> 
> > >
> > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <david.coyle at intel.com>
> > > wrote:
> > >  
> 
> > >
> > > There is a risk in drafting an API that is meant for HW before any such HW exists,
> > > because there could be inefficiency in the metadata and fast-path API for
> > > both models.
> > > For example, in the case of a CPU-based scheme, it will be pure overhead to
> > > emulate the "queue" (the enqueue and dequeue) for the sake of abstraction
> > > where CPU works better in the synchronous model and I have doubt that the
> > > session-based scheme will work for HW or not as both difference  HW needs
> > > to work hand in hand(IOMMU aspects for two PCI device)  
> >
> > [DC] I understand what you are saying about the overhead of emulating the "sw queue" but this same model is already used in many of the existing device PMDs.
> > In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for compression, the enqueue/dequeue in the PMD is emulated through an rte_ring which is very efficient.
> > The accelerator API will use the existing device PMDs so keeping the same model seems like a sensible approach.  
> 
> In this release, we added CPU crypto support in cryptodev to support
> the synchronous model to fix the overhead.
> 
> >
> > From an application's point of view, this abstraction of the underlying device type is important for usability and maintainability -  the application doesn't need to know
> > the device type as such and therefore doesn't need to make different API calls.
> >
> > The enqueue/dequeue type API was also used with QAT in mind. While QAT HW doesn't support these xform chains at the moment, it could potentially do so in the future.
> > As a side note, as part of the work of adding the accelerator API, the QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where the Crypto
> > is done on QAT HW and the CRC will be done in SW, most likely through a call to the optimized rte_net_crc library. This will give a consistent API for the DOCSIS-MAC data-plane
> > pipeline prototype we have developed, which uses both AESNI-MB and QAT for benchmarks.
> >
> > We will take your feedback on the enqueue/dequeue approach for SW devices into consideration though during development.
> >
> > Finally, I'm unsure what you mean by this line:
> >
> >         "I have doubt that the session-based scheme will work for HW or not as both difference  HW needs to work hand in hand(IOMMU aspects for two PCI device)"
> >
> > What do you mean by different HW working "hand in hand" and "two PCI device"?
> > The intention is that 1 HW device (or its PMD) would have to support the accel xform chain
> 
> I was thinking, it will be N PCIe devices that create the chain. Each
> distinct PCI device does the fixed-function and chains them together.
> 
> I do understand the usage of QAT HW and CRC in SW.
> So if I understand it correctly, in rte_security we are combining
> rte_ethdev and rte_cryptodev. With this spec, we are trying to
> combine rte_cryptodev and rte_compressdev. So it looks good to me. My only
> remaining concern is the name of this API: "accelerator" is too generic a
> name. IMO, like rte_security, we may need to give a more meaningful name
> for the use case where cryptodev and compressdev can work together.

Having an API that could be used by parallel hardware does make sense,
but DPDK already has multiple packet processing infrastructure pieces.

I would rather DPDK converge on one widely used, robust, and tested packet
processing method than stay with the current "choose your poison or roll
your own" situation. The proposed graph seems to be the best so far.
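
For readers following the enqueue/dequeue versus synchronous discussion quoted
above, here is a minimal sketch of the two call models. It is plain C with no
DPDK dependency: every name in it (sketch_op, sketch_ring, sketch_process and
so on) is hypothetical, the XOR stands in for real crypto/CRC work, and the
ring is a simplified single-producer/single-consumer stand-in for rte_ring,
not its implementation. The sketch does the work at enqueue time; a real PMD
may well do it elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is DPDK API. */

#define SKETCH_RING_SIZE 8u /* must be a power of two */

struct sketch_op {
    uint32_t data;   /* stand-in for a packet payload */
    uint32_t result; /* filled in by the "device" */
};

/* Simplified single-producer/single-consumer ring of op pointers. */
struct sketch_ring {
    struct sketch_op *slots[SKETCH_RING_SIZE];
    uint32_t head; /* count of ops ever enqueued */
    uint32_t tail; /* count of ops ever dequeued */
};

/* The actual work: a trivial transform standing in for crypto/CRC. */
static void sketch_process(struct sketch_op *op)
{
    op->result = op->data ^ 0xA5A5A5A5u;
}

/* Synchronous model: process the burst in place and return. */
static unsigned sketch_process_sync(struct sketch_op **ops, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        sketch_process(ops[i]);
    return n;
}

/* Queue model, step 1: process each op, then park it in the ring. */
static unsigned sketch_enqueue(struct sketch_ring *r,
                               struct sketch_op **ops, unsigned n)
{
    unsigned i;

    assert((SKETCH_RING_SIZE & (SKETCH_RING_SIZE - 1u)) == 0);
    for (i = 0; i < n; i++) {
        if (r->head - r->tail == SKETCH_RING_SIZE)
            break; /* ring full: caller must retry the rest */
        sketch_process(ops[i]);
        r->slots[r->head++ & (SKETCH_RING_SIZE - 1u)] = ops[i];
    }
    return i;
}

/* Queue model, step 2: pull completed ops back out in FIFO order. */
static unsigned sketch_dequeue(struct sketch_ring *r,
                               struct sketch_op **ops, unsigned n)
{
    unsigned i;

    for (i = 0; i < n && r->tail != r->head; i++)
        ops[i] = r->slots[r->tail++ & (SKETCH_RING_SIZE - 1u)];
    return i;
}
```

Both paths produce the same result for the same input. The queue model adds
a second call plus a pointer store and load per op, in exchange for giving
the application one call shape regardless of whether the device is SW or HW.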
