[dpdk-dev] [RFC] Accelerator API to chain packet processing functions

Jerin Jacob jerinjacobk at gmail.com
Fri Feb 7 15:18:17 CET 2020

On Fri, Feb 7, 2020 at 6:08 PM Coyle, David <david.coyle at intel.com> wrote:
> Hi Jerin, see below

Hi David,

> > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <david.coyle at intel.com> wrote:
> >
> > There is a risk in drafting an API that is meant for HW before any such HW
> > exists, because the metadata and fast-path API could end up inefficient for
> > both models.
> > For example, in the CPU-based scheme it is pure overhead to emulate the
> > "queue" (the enqueue and dequeue) for the sake of abstraction, since the CPU
> > works better in the synchronous model. I also doubt whether the
> > session-based scheme will work for HW, as the different HW devices need
> > to work hand in hand (IOMMU aspects for two PCI devices).
> [DC] I understand what you are saying about the overhead of emulating the "sw queue" but this same model is already used in many of the existing device PMDs.
> In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for compression, the enqueue/dequeue in the PMD is emulated through an rte_ring which is very efficient.
> The accelerator API will use the existing device PMDs so keeping the same model seems like a sensible approach.

In this release, we added CPU crypto support to cryptodev, which provides
a synchronous model precisely to avoid that overhead.
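
To make the difference concrete, here is a rough sketch in plain C with no
DPDK dependency. Every sketch_* name is hypothetical and none of this is the
cryptodev API. The ring is a simplified, single-threaded stand-in for
rte_ring, and the "work" is just a byte inversion. It only illustrates that
the queue-emulated path needs an extra park-and-fetch step that the
synchronous path does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u /* power of two, so indices can be masked */

/* Hypothetical operation: a buffer plus a completion flag. */
struct sketch_op {
    uint8_t *data;
    size_t len;
    int done;
};

/* Hypothetical stand-in for the ring inside a software PMD. */
struct sketch_ring {
    void *slots[SKETCH_RING_SIZE];
    uint32_t head; /* next slot to write */
    uint32_t tail; /* next slot to read */
};

/* Placeholder work: invert every byte in place (not real crypto). */
static void sketch_process(struct sketch_op *op)
{
    for (size_t i = 0; i < op->len; i++)
        op->data[i] = (uint8_t)~op->data[i];
    op->done = 1;
}

/* Queue-emulated model: do the work, then park the op in the ring. */
static unsigned sketch_enqueue_burst(struct sketch_ring *r,
                                     struct sketch_op **ops, unsigned n)
{
    unsigned i;

    assert(r != NULL && ops != NULL);
    for (i = 0; i < n; i++) {
        if (r->head - r->tail == SKETCH_RING_SIZE)
            break; /* ring full: caller must retry the rest */
        sketch_process(ops[i]);
        r->slots[r->head++ & (SKETCH_RING_SIZE - 1)] = ops[i];
    }
    return i;
}

/* Queue-emulated model: a second call is needed to get the ops back. */
static unsigned sketch_dequeue_burst(struct sketch_ring *r,
                                     struct sketch_op **ops, unsigned n)
{
    unsigned i;

    assert(r != NULL && ops != NULL);
    for (i = 0; i < n && r->tail != r->head; i++)
        ops[i] = r->slots[r->tail++ & (SKETCH_RING_SIZE - 1)];
    return i;
}

/* Synchronous model: the ops are complete when the call returns. */
static unsigned sketch_process_sync(struct sketch_op **ops, unsigned n)
{
    assert(ops != NULL);
    for (unsigned i = 0; i < n; i++)
        sketch_process(ops[i]);
    return n;
}
```

Both paths produce the same bytes. The queue-emulated one simply pays for
the ring writes, the ring reads, and the second call.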

> From an application's point of view, this abstraction of the underlying device type is important for usability and maintainability -  the application doesn't need to know
> the device type as such and therefore doesn't need to make different API calls.
> The enqueue/dequeue type API was also used with QAT in mind. While QAT HW doesn't support these xform chains at the moment, it could potentially do so in the future.
> As a side note, as part of the work of adding the accelerator API, the QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where the Crypto
> is done on QAT HW and the CRC will be done in SW, most likely through a call to the optimized rte_net_crc library. This will give a consistent API for the DOCSIS-MAC data-plane
> pipeline prototype we have developed, which uses both AESNI-MB and QAT for benchmarks.
> We will take your feedback on the enqueue/dequeue approach for SW devices into consideration though during development.
> Finally, I'm unsure what you mean by this line:
>         "I have doubt that the session-based scheme will work for HW or not as both difference  HW needs to work hand in hand(IOMMU aspects for two PCI device)"
> What do you mean by different HW working "hand in hand" and "two PCI device"?
> The intention is that 1 HW device (or its PMD) would have to support the accel xform chain.

I was thinking it would be N PCIe devices that create the chain: each
distinct PCI device performs one fixed function, and they are chained together.

I do understand the usage of QAT HW with the CRC in SW.
So, if I understand it correctly, in rte_security we are combining
rte_ethdev and rte_cryptodev. With this spec, we are trying to combine
rte_cryptodev and rte_compressdev. So it looks good to me. My only
remaining concern is the name of this API: "accelerator" is too generic a
name. IMO, like rte_security, we may need to give a more meaningful name
to the use case where cryptodev and compressdev can work together.
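
For reference, the single-device chaining discussed above could look roughly
like the sketch below. It is an assumption-laden illustration, not the
proposed API: the sketch_* types are hypothetical, the `next` pointer is
loosely modelled on how rte_crypto_sym_xform entries are linked, the "cipher"
is a placeholder XOR, and the CRC is a plain bitwise CRC-32 rather than the
optimized rte_net_crc code. The stage order is simply the order of the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stage types for a chained transform. */
enum sketch_xform_type {
    SKETCH_XFORM_CIPHER,
    SKETCH_XFORM_CRC
};

/* Hypothetical xform node; stages are linked through 'next'. */
struct sketch_xform {
    enum sketch_xform_type type;
    struct sketch_xform *next;
    uint8_t key; /* cipher stage only; placeholder XOR key */
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t sketch_crc32(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * Walk the chain in list order on behalf of one "device".
 * Returns the number of stages applied; a CRC stage stores its
 * result in *crc_out.
 */
static int sketch_chain_process(const struct sketch_xform *chain,
                                uint8_t *data, size_t len,
                                uint32_t *crc_out)
{
    int stages = 0;

    assert(data != NULL && crc_out != NULL);
    for (const struct sketch_xform *x = chain; x != NULL; x = x->next) {
        switch (x->type) {
        case SKETCH_XFORM_CIPHER:
            for (size_t i = 0; i < len; i++)
                data[i] ^= x->key; /* stand-in for the HW crypto step */
            break;
        case SKETCH_XFORM_CRC:
            *crc_out = sketch_crc32(data, len); /* the SW CRC step */
            break;
        }
        stages++;
    }
    return stages;
}
```

The application would build the chain once and hand it to one device, and
whether a given stage runs in HW or SW stays hidden behind that device's PMD.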
