[dpdk-dev] [RFC 0/4] cpu-crypto API choices

Ananyev, Konstantin konstantin.ananyev at intel.com
Mon Nov 18 12:57:07 CET 2019


Hi Jerin,

Thanks for input, my answers inline.
Other guys - please provide your input.
Thanks
Konstantin

> > Originally both SW and HW crypto PMDs use rte_crypot_op based API to
> > process the crypto workload asynchronously. This way provides uniformity
> > to both PMD types, but also introduce unnecessary performance penalty to
> > SW PMDs that have to "simulate" HW async behavior
> > (crypto-ops enqueue/dequeue, HW addresses computations,
> > storing/dereferencing user provided data (mbuf) for each crypto-op,
> > etc).
> >
> > The aim is to introduce a new optional API for SW crypto-devices
> > to perform crypto processing in a synchronous manner.
> > As summarized by Akhil, we need a synchronous API to perform crypto
> > operations on raw data using SW PMDs, that provides:
> >  - no crypto-ops.
> >  - avoid using mbufs inside this API, use raw data buffers instead.
> >  - no separate enqueue-dequeue, only single process() API for data path.
> >  - input data buffers should be grouped by session,
> >    i.e. each process() call takes one session and group of input buffers
> >    that  belong to that session.
> >  - All parameters that are constant accross session, should be stored
> >    inside the session itself and reused by all incoming data buffers.
> >
> > While there seems no controversy about need of such functionality,
> > there seems to be no agreement on what would be the best API for that.
> > So I am requesting for TB input on that matter.
> >
> > Series structure:
> > - patch #1 - intorduce basic data structures to be used by sync API
> >   (no controversy here, I hope ..)
> >   [RFC 1/4] cpu-crypto: Introduce basic data structures
> > - patch #2 - Intel initial approach for new API (via rte_security)
> >   [RFC 2/4] security: introduce cpu-crypto API
> > - patch #3 - approach that reuses existing rte_cryptodev API as much as
> >   possible
> >   [RFC 3/4] cryptodev: introduce cpu-crypto API
> > - patch #4 - approach via introducing new session data structure and API
> >   [RFC 4/4] cryptodev: introduce rte_crypto_cpu_sym_session API
> >
> > Patches 2,3,4 are mutually exclusive,
> > and we probably have to choose which one to go forward with.
> > I put some explanations in each of the patches, hopefully that will help
> > to  understand pros and cons of each one.
> >
> > Akhil strongly supports #3, AFAIK mainly because it allows PMDs to
> > reuse existing API and minimize API level changes.
> > My favorite is #4, #2 is less preferable but ok too.
> > #3 seems problematic to me by the reasons I outlined in #4 patch
> > description.
> >
> > Please provide your opinion.
> 
> I spend some time on the proposal and I agree that sync API is needed
> and it makes sense to remove queue emulation and allocating/freeing
> the crypto_ops
> in case of sync API.
> 
> # I would prefer to not duplicate the session. If the newly added
> fields are for optimization
> then those can be applicable for HW too. For example, if we consider,
> offset to be
> constant for one session HW PMD will be able to leverage this. ref:
> rte_crypto_aead_xfrom::cpu_crypto:offset

It might, but right for async API we pass this info in crypto_op instead.
So if I get you right your preference is sort of #3 approach
that reuses existing rte_cryptodev API as much as possible:
reuse existing rte_cryptodev_sym structure with new sync process() API?
 
> # I would prefer to not duplicate ops parameters, instead of the
> existing rte_crypto_ops  can be updated.
> I see that most members introduced in rte_crypto_sym_vec &
> rte_crypto_vec are already existing in rte_crypto_op.

rte_crypto_ops is way too generic/excessive.
Filling/reading it seems one of the main slowdowns that  we trying to
avoid in new API. 

> 
> Also, since we are agreeing that the ops for SYNC API can be from
> stack/one time allocated, the size shouldn't matter.

I can be on stack, but it means user will still have to fill them
and PMD will have to read/process/overwrite them. 
 
> I understand that this would cause ABI breakage, but for this release,
> we can work together and add some reserved fields
> that we can implement later. I believe that's the reason why you want
> to introduce new structures. I think that will bloat
> the existing crypto lib.

It will increase the lib code, but I don't think it will be significant.
Honestly, I think messing with crypto_op and other existing structures
might have much more negative effect. 
 
> If I understand it correctly, this will be used in conjunction with
> IXGBE to handle fragmented IPsec traffic. If that's the fundamental
> reasoning, then there is an alternate path possible.

No, it's just one of the use-case.
Pretty important, but not the only one.
The main reason - current cryptodev API (crypto_op based) is suboptimal for SW based PMDs.
We wasting too many cycles to pretend that it is a lookaside device underneath.
I think makes more sense to admit that it is SW based and exploit it nature,
instead of trying to hide it.

> Currently, the  issue is, rte_security doesn't define the treatment for fragmented
> packets. Maybe let's define it and then a similar CPU crypto
> processing can be done inside the PMD. By creating an internal
> function in S/W PMDs and calling it from the inline crypto enabled eth
> PMDs, fragmentation support for inline crypto devices can
> be achieved. This way the application would look clean. All the
> fragmentation related configuration (no of fragmentation contexts
> needed,
> reassembly timeout etc) need to be added in rte_security library and
> the result for that operation will come as dynamic fields in the mbuf.
> 
> Just my 2c.
> 
 



More information about the dev mailing list