[dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API

Ananyev, Konstantin konstantin.ananyev at intel.com
Mon Oct 21 15:47:53 CEST 2019
Previous message: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
Next message: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Akhil,

 
> Added my comments inline with your draft.
> >
> >
> > Hi Akhil,
> >
> > > > BTW, to be honest, I don't consider current rte_cryptodev_sym_session
> > > > construct for multiple device_ids:
> > > > __extension__ struct {
> > > >                 void *data;
> > > >                 uint16_t refcnt;
> > > >         } sess_data[0];
> > > >         /**< Driver specific session material, variable size */
> > > >
> > > Yes I also feel the same. I was also not in favor of this when it was introduced.
> > > Please go ahead and remove this. I have no issues with that.
> >
> > If you are not happy with that structure, and admit there are issues with it,
> > why do you push for reusing it for cpu-crypto API?
> > Why  not to take step back, take into account current drawbacks
> > and define something that (hopefully) would suite us better?
> > Again new API will be experimental for some time, so we'll
> > have some opportunity to see does it works and if not fix it.
> 
> [Akhil] This structure is serving some use case which is agreed upon in the
> Community, we cannot just remove a feature altogether.

I understand that, but we don't suggest to remove anything that already here.
We are talking about extending existing/adding new API.  
All our debates around how much we can reuse from existing one and what new
needs to be added.

> Rather it is Intel's  Use case only.
> 
> >
> > About removing data[] from existing rte_cryptodev_sym_session -
> > Personally would like to do that, but the change seems to be too massive.
> > Definitely not ready for such effort right now.
> >
> 
> [snip]..
> 
> >
> > Ok, then my suggestion:
> > Let's at least write down all points about crypto-dev approach where we
> > disagree and then probably try to resolve them one by one....
> > If we fail to make an agreement/progress in next week or so,
> > (and no more reviews from the community)
> > will have bring that subject to TB meeting to decide.
> > Sounds fair to you?
> Agreed
> >
> > List is below.
> > Please add/correct me, if I missed something.
> >
> > Konstantin
> 
> Before going into comparison, we should define the requirement as well.

Good point.

> What I understood from the patchset,
> "You need a synchronous API to perform crypto operations on raw data using SW PMDs"
> So,
> - no crypto-ops,
> - no separate enq-deq, only single process API for data path
> - Do not need any value addition to the session parameters.
>   (You would need some parameters from the crypto-op which
>    Are constant per session and since you wont use crypto-op,
>    You need some place to store that)

Yes, this is correct, I think.

> 
> Now as per your mail, the comparison
> 1. extra input parameters to create/init rte_(cpu)_sym_session.
> 
> Will leverage existing 6B gap inside rte_crypto_*_xform between 'algo' and 'key' fields.
> New fields will be optional and would be used by PMD only when cpu-crypto session is requested.
> For lksd-crypto session PMD is free to ignore these fields.
> No ABI breakage is required.
> 
> [Akhil] Agreed, no issues.
> 
> 2. cpu-crypto create/init.
>     a) Our suggestion - introduce new API for that:
>         - rte_crypto_cpu_sym_init() that would init completely opaque  rte_crypto_cpu_sym_session.
>         - struct rte_crypto_cpu_sym_session_ops {(*process)(...); (*clear); /*whatever else we'll need *'};
>         - rte_crypto_cpu_sym_get_ops(const struct rte_crypto_sym_xform *xforms)
>           that would return const struct rte_crypto_cpu_sym_session_ops *based on input xforms.
> 	Advantages:
> 	1)  totally opaque data structure (no ABI breakages in future), PMD writer is totally free
> 	     with it format and contents.
> 
> [Akhil] It will have breakage at some point till we don't hit the union size.

Not sure, what union you are talking about?

> Rather I don't suspect there will be more parameters added.
> Or do we really care about the ABI breakage when the argument is about
> the correct place to add a piece of code or do we really agree to add code
> anywhere just to avoid that breakage.

I am talking about maintaining it in future.
if your struct is not seen externally, no chances to introduce ABI breakage. 

> 
> 	2) each session entity is self-contained, user doesn't need to bring along dev_id etc.
> 	    dev_id is needed  only at init stage, after that user will use session ops to perform
> 	    all operations on that session (process(), clear(), etc.).
> 
> [Akhil] There is nothing called as session ops in current DPDK.

True, but it doesn't mean we can't/shouldn't have it.

> What you are proposing
> is a new concept which doesn't have any extra benefit, rather it is adding complexity
> to have two different code paths for session create.
> 
> 
> 	3) User can decide does he wants to store ops[] pointer on a per session basis,
> 	    or on a per group of same sessions, or...
> 
> [Akhil] Will the user really care which process API should be called from the PMD.
> Rather it should be driver's responsibility to store that in the session private data
> which would be opaque to the user. As per my suggestion same process function can
> be added to multiple sessions or a single session can be managed inside the PMD.

In that case we either need to have a function per session (stored internally),
or make decision (branches) at run-time.
But as I said in other mail - I am ok to add small shim structure here:
either rte_crypto_cpu_sym_session { void *ses; struct rte_crypto_cpu_sym_session_ops ops; }
or rte_crypto_cpu_sym_session { void *ses; struct rte_crypto_cpu_sym_session_ops *ops; } 
And merge rte_crypto_cpu_sym_init() and rte_crypto_cpu_sym_get_ops() into one (init).

> 
> 
> 	4) No mandatory mempools for private sessions. User can allocate memory for cpu-crypto
> 	    session whenever he likes.
> 
> [Akhil] you mean session private data? 

Yes.

> You would need that memory anyways, user will be
> allocating that already.  You do not need to manage that.

What I am saying - right now user has no choice but to allocate it via mempool.
Which is probably not the best options for all cases.

> 
> 	Disadvantages:
> 	5) Extra changes in control path
> 	6) User has to store session_ops pointer explicitly.
> 
> [Akhil] More disadvantages:
> - All supporting PMDs will need to maintain TWO types of session for the
> same crypto processing. Suppose a fix or a new feature(or algo) is added, PMD owner
> will need to add code in both the session create APIs. Hence more maintenance and
> error prone.

I think majority of code for both paths will be common, plus even we'll reuse current sym_session_init() -
changes in PMD session_init() code will be unavoidable. 
But yes, it will be new entry in devops, that PMD will have to support.
Ok to add it as 7) to the list.

> - Stacks which will be using these new APIs also need to maintain two
> code path for the same processing while doing session initialization
> for sync and async

That's the same as #5 above, I think.

> 
> 
>      b) Your suggestion - reuse existing rte_cryptodev_sym_session_init() and existing rte_cryptodev_sym_session
>       structure.
> 	Advantages:
> 	1) allows to reuse same struct and init/create/clear() functions.
> 	    Probably less changes in control path.
> 	Disadvantages:
> 	2) rte_cryptodev_sym_session. sess_data[] is indexed by driver_id, which means that
> 	    we can't use the same rte_cryptodev_sym_session to hold private sessions pointers
> 	    for both sync and async mode  for the same device.
>                    So the only option we have - make PMD devops->sym_session_configure()
> 	    always create a session that can work in both cpu and lksd modes.
> 	    For some implementations that would probably mean that under the hood  PMD would create
> 	    2 different session structs (sync/async) and then use one or another depending on from what API been called.
> 	    Seems doable, but ...:
>                    - will contradict with statement from 1:
> 	      " New fields will be optional and would be used by PMD only when cpu-crypto session is requested."
>                       Now it becomes mandatory for all apps to specify cpu-crypto related parameters too,
> 	       even if they don't plan to use that mode - i.e. behavior change, existing app change.
>                      - might cause extra space overhead.
> 
> [Akhil] It will not contradict with #1, you will only have few checks in the session init PMD
> Which support this mode, find appropriate values and set the appropriate process() in it.
> User should be able to call, legacy enq-deq as well as the new process() without any issue.
> User would be at runtime will be able to change the datapath.
> So this is not a disadvantage, it would be additional flexibility for the user.

Ok, but that's what I am saying - if PMD would *always* have to create a session that can handle
both modes (sync/async), then user would *always* have to provide parameters for both modes too.
Otherwise if let say user didn't setup sync specific parameters at all, what PMD should do?
  - return with error?
  - init session that can be used with async path only?
My current assumption is #1.
If #2, then how user will be able to distinguish is that session valid for both modes, or only for one? 


> 
> 
> 	3) not possible to store device (not driver) specific data within the session, but I think it is not really needed right now.
> 	    So probably minor compared to 2.b.2.
> 
> [Akhil] So lets omit this for current discussion. And I hope we can find some way to deal with it.

I don't think there is an easy way to fix that with existing API.

> 
> 
> Actually #3 follows from #2, but decided to have them separated.
> 
> 3. process() parameters/behavior
>     a) Our suggestion: user stores ptr to session ops (or to (*process) itself) and just does:
>         session_ops->process(sess, ...);
> 	Advantages:
> 	1) fastest possible execution path
> 	2) no need to carry on dev_id for data-path
> 
> [Akhil] I don't see any overhead of carrying dev id, at least it would be inline with the
> current DPDK methodology.

If we'll add process() into rte_cryptodev itself (same as we have enqueue_burst/dequeue_burst),
then it will be an ABI breakage.
Also there are discussions to get rid of that approach completely:
http://mails.dpdk.org/archives/dev/2019-September/144674.html
So I am not sure this is a recommended way these days.

> What you are suggesting is a new way to get the things done without much benefit.

Would help with ABI stability plus better performance, isn't it enough?

> Also I don't see any performance difference as crypto workload is heavier than
> Code cycles, so that wont matter.

It depends.
Suppose function call costs you ~30 cycles.
If you have burst of big packets (let say crypto for each will take ~2K cycles) that belong
to the same session, then yes you wouldn't notice these extra 30 cycles at all.
If you have burst of small packets (let say crypto for each will take ~300 cycles)  each
belongs to different session, then it will cost you ~10% extra.

> So IMO, there is no advantage in your suggestion as well.
> 
> 
> 	Disadvantages:
> 	3) user has to carry on session_ops pointer explicitly
>     b) Your suggestion: add  (*cpu_process) inside rte_cryptodev_ops and then:
>         rte_crypto_cpu_sym_process(uint8_t dev_id, rte_cryptodev_sym_session  *sess, /*data parameters*/) {...
>                      rte_cryptodevs[dev_id].dev_ops->cpu_process(ses, ...);
>                       /*and then inside PMD specifc process: */
>                      pmd_private_session = sess->sess_data[this_pmd_driver_id].data;
>                      /* and then most likely either */
>                      pmd_private_session->process(pmd_private_session, ...);
>                      /* or jump based on session/input data */
> 	Advantages:
> 	1) don't see any...
> 	Disadvantages:
> 	2) User has to carry on dev_id inside data-path
> 	3) Extra level of indirection (plus data dependency) - both for data and instructions.
> 	    Possible slowdown compared to a) (not measured).
> 
> Having said all this, if the disagreements cannot be resolved, you can go for a pmd API specific
> to your PMDs,

I don't think it is good idea.
PMD specific API is sort of deprecated path, also there is no clean way to use it within the libraries.

> because as per my understanding the solution doesn't look scalable to other PMDs.
> Your approach is aligned only to Intel , will not benefit others like openssl which is used by all
> vendors.

I feel quite opposite, from my perspective majority of SW backed PMDs will benefit from it.
And I don't see anything Intel specific in my proposals above. 
About openssl PMD: I am not an expert here, but looking at the code, I think it will fit really well.
Look yourself at its internal functions: process_openssl_auth_op/process_openssl_crypto_op,
I think they doing exactly the same - they use sync API underneath, and they are session based
(AFAIK you don't need any device/queue data, everything that needed for crypto/auth is stored inside session).

Konstantin
Previous message: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
Next message: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the dev mailing list