[dpdk-dev] [dpdk-dev v4 1/4] cryptodev: add symmetric crypto	data-path APIs
    Zhang, Roy Fan 
    roy.fan.zhang at intel.com
       
    Wed Jul  8 17:09:02 CEST 2020
    
    
  
Hi Akhil,
Thanks for the comments!
> -----Original Message-----
> From: Akhil Goyal <akhil.goyal at nxp.com>
> Sent: Tuesday, July 7, 2020 9:37 PM
> To: Zhang, Roy Fan <roy.fan.zhang at intel.com>; dev at dpdk.org;
> anoobj at marvell.com; asomalap at amd.com; ruifeng.wang at arm.com;
> Nagadheeraj Rottela <rnagadheeraj at marvell.com>; Michael Shamis
> <michaelsh at marvell.com>; Ankur Dwivedi <adwivedi at marvell.com>; Jay
> Zhou <jianjay.zhou at huawei.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch at intel.com>; Hemant Agrawal
> <hemant.agrawal at nxp.com>
> Cc: Trahe, Fiona <fiona.trahe at intel.com>; Bronowski, PiotrX
> <piotrx.bronowski at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; Thomas Monjalon
> <thomas at monjalon.net>
> Subject: RE: [dpdk-dev v4 1/4] cryptodev: add symmetric crypto data-path
> APIs
> 
> Hi Fan,
> >
> > Hi Akhil,
> >
> > ...
> > > >
> > > > As you may have seen the structure definition is hW centric with the
> > > > IOVA addresses all over. Also as you will from the patch series the
> > > operation is
> > > > Per operation basis instead of operating in a burst. The external
> application
> > > > may sooner know when a specific enqueue is failed.
> > >
> > > You may also need to save a virtual address as well. As some hardware
> are
> > > able to
> > > Convert virtual to physical addresses on it's own giving a performance
> > > improvement.
> > >
> > > I do not see an issue in using enqueue burst with burst size=1 , but since
> you
> > > are doing
> > > Optimizations, none of the hardware can perform well with burst = 1, I
> think
> > > it is always
> > > Greater than 1.
> >
> > Shall I update the rte_crypto_sym_vec as the following - so the 2 problems
> can
> > be
> > resolved?
> >
> > struct rte_crypto_sym_vec {
> > 	/** array of SGL vectors */
> > 	struct rte_crypto_sgl *sgl;
> > 	union {
> > 		/* Supposed to be used with CPU crypto API call. */
> > 		struct {
> > 			/** array of pointers to IV */
> > 			void **iv;
> > 			/** array of pointers to AAD */
> > 			void **aad;
> > 			/** array of pointers to digest */
> > 			void **digest;
> > 			/**
> > 			 * array of statuses for each operation:
> > 			 *  - 0 on success
> > 			 *  - errno on error
> > 			 */
> > 			int32_t *status;
> > 		};
> >
> > 		/* Supposed to be used with HW crypto API call. */
> > 		struct {
> > 			/** array of pointers to IV */
> > 			struct rte_crypto_vec *iv_hw;
> > 			/** array of pointers to AAD */
> > 			struct rte_crypto_vec *aad_hw;
> > 			/** array of pointers to Digest */
> > 			struct rte_crypto_vec *digest_hw;
> > 		};
> >
> > 	};
> > 	/** number of operations to perform */
> > 	uint32_t num;
> > };
> 
> Yes something of that sort can work.
> 
Will change it in v5.
...
> >
> > > Have you done some profiling with using rte_crypto_op instead of this
> new
> > > struct?
> > >
> > Yes, the code are actually upstreamed in VPP
> > https://gerrit.fd.io/r/c/vpp/+/18036, please try out. If you
> > have a look at the
> > enqueue/dequeue functions you should see the struggle we had to
> translate
> > ops, and creating a second software ring to make sure we only dequeue a
> > frame of data. Lucky VPP has space to store mbufs otherwise the perf will
> > be even worse.
> What is the performance gap do you see after making m_src and m_dst as
> Raw buffers?
> 
Converting other projects data structure (such as VPP crypto op) into DPDK
cryptodev operation introduces some performance degradation.
> 
> There may be some cases where a limited amount of control pkts can be sent
> Which may be session less. We cannot ask people to use a different data
> path
> For such traffic. So we may need to support that too.
> 
Here is our proposal to the enqueue-dequeue API
typedef uint32_t (*cryptodev_sym_hw_dp_crypto_enqueue_dequeue_t)
	(struct rte_cryptodev *dev, uint16_t qp_id,
	struct rte_cryptodev_sym_session *sess,
	union rte_crypto_sym_ofs ofs, struct rte_crypto_sym_vec *vec,
	void **opaque, int *enqueued_num,
	rte_cryptodev_get_dequeue_count_t get_dequeue_count,
	rte_cryptodev_post_dequeue_t post_dequeue,
	uint32_t flags);
So the idea is a single API that does enqueue and/or dequeue combined.
If the user wants to do enqueue she/he should have 
RTE_CRYPTO_HW_DP_FF_DO_ENQUEUE in the flag, or 
RTE_CRYPTO_HW_DP_FF_DO_DEQUEUE if dequeue is expected to be done.
Opaque could be single pointer or an array, also specified by the flags, so
If the user wants to do same as cryptodev_enqueue they can stores the
Crypto ops into opaque_in, and the dequeued opaque will be stored in
Opaque_out. There are 2 function pointers 
rte_cryptodev_get_dequeue_count_t: return the number of jobs to dequeue,
which helps if the user wants to know the dequeue count from first
dequeued opaque data, or just return a fixed number same as cryptodev
enqueue/dequeue usage.
rte_cryptodev_post_dequeue_t: user provided function to operate post
dequeue, good to write status to user specified data structure (opaque).
To enable sessionless we may add the union number to replace sess. The
union is either a session pointer or xform pointer, may be specified by the
flag too. 
You may ask why using a single function pointer for both enqueue and
dequeue, instead 2 separate functions... I only intended to squeeze it into
rte_cryptodev_ops to combine it with cryptodev_sym_cpu_crypto_process_t
as a union, without expanding rte_cryptodev_ops size.
struct rte_cryptodev_ops {
	...
	cryptodev_asym_free_session_t asym_session_clear;
	/**< Clear a Crypto sessions private data. */
	union {
		cryptodev_sym_cpu_crypto_process_t sym_cpu_process;
		/**< process input data synchronously (cpu-crypto). */
		cryptodev_sym_hw_crypto_enqueue_dequeue_t sym_hw_enq_deq;
		/**< Get HW crypto data-path call back functions and data */
	};
};
> > Rte_security is built on top of mbuf and ethdev and is not intended to
> "mbuf-
> > independent"
> > applications either.
Again we can replacing sess to the union of 
union rte_cryptodev_hw_dp_ctx {
	struct rte_cryptodev_sym_session *crypto_sess;
	struct rte_crypto_sym_xform *xform;
	struct rte_security_session *sec_sess;
};
> 
> Rte_security for lookaside protocol can be mbuf independent and NXP may
> Support it in future especially in case of PDCP.
> 
> Regards,
> Akhil
What do you think?
Regards,
Fan
    
    
More information about the dev
mailing list