[dpdk-dev] [RFC] Accelerator API to chain packet processing functions

Doherty, Declan declan.doherty at intel.com
Thu Feb 13 12:50:37 CET 2020


On 07/02/2020 2:18 PM, Jerin Jacob wrote:
> On Fri, Feb 7, 2020 at 6:08 PM Coyle, David <david.coyle at intel.com> wrote:
>>
>> Hi Jerin, see below
> 
> Hi David,
> 
>>>
>>> On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <david.coyle at intel.com>
>>> wrote:
>>>
> 
>>>
>>> There is a risk in drafting an API that is meant for HW before any such HW exists,
>>> because there could be inefficiency in the metadata and fast path API for
>>> both models.
>>> For example, in the case of a CPU-based scheme, it will be pure overhead to
>>> emulate the "queue" (the enqueue and dequeue) for the sake of abstraction
>>> where CPU works better in the synchronous model and I have doubt that the
>>> session-based scheme will work for HW or not as both difference  HW needs
>>> to work hand in hand(IOMMU aspects for two PCI device)
>>
>> [DC] I understand what you are saying about the overhead of emulating the "sw queue" but this same model is already used in many of the existing device PMDs.
>> In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for compression, the enqueue/dequeue in the PMD is emulated through an rte_ring which is very efficient.
>> The accelerator API will use the existing device PMDs so keeping the same model seems like a sensible approach.
> 
> In this release, we added CPU crypto support in cryptodev to support
> the synchronous model to fix the overhead.
> 
>>
>>  From an application's point of view, this abstraction of the underlying device type is important for usability and maintainability -  the application doesn't need to know
>> the device type as such and therefore doesn't need to make different API calls.
>>
>> The enqueue/dequeue type API was also used with QAT in mind. While QAT HW doesn't support these xform chains at the moment, it could potentially do so in the future.
>> As a side note, as part of the work of adding the accelerator API, the QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where the Crypto
>> is done on QAT HW and the CRC will be done in SW, most likely through a call to the optimized rte_net_crc library. This will give a consistent API for the DOCSIS-MAC data-plane
>> pipeline prototype we have developed, which uses both AESNI-MB and QAT for benchmarks.
>>
>> We will take your feedback on the enqueue/dequeue approach for SW devices into consideration though during development.
>>
>> Finally, I'm unsure what you mean by this line:
>>
>>          "I have doubt that the session-based scheme will work for HW or not as both difference  HW needs to work hand in hand(IOMMU aspects for two PCI device)"
>>
>> What do you mean by different HW working "hand in hand" and "two PCI device"?
>> The intention is that 1 HW device (or its PMD) would have to support the accel xform chain
> 
> I was thinking, it will be N PCIe devices that create the chain. Each
> distinct PCI device does the fixed-function and chains them together.
> 

The case we were looking at is more focused on a single discrete
(multi-function) device (from the perspective of the host) providing a
number of transforms (operations) in a single pass, rather than the case
of N discrete hardware devices (from the perspective of the host)
chained together to achieve the same set of transforms.
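
To make the single-pass idea concrete, here is a rough sketch only. All
type and function names below are hypothetical and are not part of the
RFC or of any existing DPDK API, and the XOR step is just a stand-in for
a real cipher. It shows one "device" walking a chain of transforms over
a buffer in a single pass, in the spirit of the DOCSIS Crypto-CRC chain
(CRC over the plaintext, then cipher):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transform types; illustrative only, not from the RFC. */
enum xform_type { XFORM_CRC32, XFORM_XOR_CIPHER };

/* One link in a hypothetical transform chain. */
struct xform {
    enum xform_type type;
    uint8_t key;              /* used only by XFORM_XOR_CIPHER */
    const struct xform *next; /* next transform, or NULL at the end */
};

struct chain_result {
    uint32_t crc;             /* final CRC-32, if the chain has a CRC step */
};

/* Bitwise CRC-32 update (reflected polynomial 0xEDB88320). */
static uint32_t crc32_update(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    return crc;
}

/*
 * Process the buffer in a single pass: every byte visits each transform
 * in chain order before the next byte is touched. A CRC step placed
 * before the cipher step therefore sees the plaintext byte.
 */
static void chain_process(const struct xform *chain, uint8_t *buf,
                          size_t len, struct chain_result *res)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL && res != NULL);

    for (size_t i = 0; i < len; i++) {
        for (const struct xform *x = chain; x != NULL; x = x->next) {
            switch (x->type) {
            case XFORM_CRC32:
                crc = crc32_update(crc, buf[i]);
                break;
            case XFORM_XOR_CIPHER:
                buf[i] ^= x->key; /* placeholder for a real cipher */
                break;
            }
        }
    }
    res->crc = crc ^ 0xFFFFFFFFu;
}
```

The application only ever sees one device and one call per buffer; how
the transforms are split between HW and SW behind that call (for
example, crypto on QAT and CRC in SW) would be up to the PMD.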


> I do understand the usage of QAT HW and CRC in SW.
> So if I understand it correctly, in rte_security, we are combining
> rte_ethdev and rte_cryptodev. With this spec, we are trying to
> combine rte_cryptodev and rte_compressdev. So it looks good to me. My only
> remaining concern is the name of this API: "accelerator" is too generic a
> name. IMO, like rte_security, we may need to give a more meaningful name
> for the use case where cryptodev and compressdev can work together.
> 


