[dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
arkadiuszx.kusztal at intel.com
Mon May 15 13:52:08 CEST 2017
Sorry for the delayed answer.
As for QAT:
2) Allocates memory for qat PMD op cookie pointer: 16384 bytes
There will be as many cookie pointers as there are nb_descriptors (so in this case 2048).
Each cookie struct will take 704 bytes: 256 bytes are needed for buffers in the in-place operation and 256 for out-of-place, each with 16 more bytes, and each is padded to 320 bytes by the 64B alignment constraint. Then (320 + 320 + 16) = 656 bytes is aligned up to 704 bytes.
So it will be 704 * 2048 = 1441792B.
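The alignment arithmetic above can be sketched in plain C. This is only an illustration of the numbers in this message, not the PMD's own code: align_up(), qat_cookie_size() and qat_cookie_pool_bytes() are hypothetical helper names, and the 64-byte cache line is the alignment constraint quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Size of one op cookie, following the arithmetic in the message above. */
static size_t qat_cookie_size(void)
{
    const size_t cache_line = 64;

    /* 256 bytes of buffers plus 16 bytes, padded to 64B: 272 -> 320 */
    size_t list = align_up(256 + 16, cache_line);

    /* in-place list + out-of-place list + 16 bytes, padded to 64B */
    return align_up(list + list + 16, cache_line);
}

/* Memory for all cookies of one queue pair (one cookie per descriptor). */
static size_t qat_cookie_pool_bytes(size_t nb_descriptors)
{
    return qat_cookie_size() * nb_descriptors;
}
```

Calling qat_cookie_pool_bytes(2048) gives the figure for the 2048-descriptor queue pair discussed in this thread.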
The qat_session size should be 568 bytes.
From: Chinmaya Dwibedy [mailto:ckdwibedy at gmail.com]
Sent: Wednesday, May 10, 2017 10:36 AM
To: Trahe, Fiona <fiona.trahe at intel.com>
Cc: users at dpdk.org; Kusztal, ArkadiuszX <arkadiuszx.kusztal at intel.com>
Subject: Re: [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
Thanks a lot for your valuable feedback. I reviewed the code once again and worked out the memory requirements for each crypto device. Kindly review the figures below and feel free to point out anything that is wrong.
AESNI SW Crypto Device (dpdk 17.02)
During initialization of AESNI vdev (via rte_eal_vdev_init (), called by DPDK application)
1) Allocates memzone for cryptodev data structure: 128 bytes.
2) Allocates memzone for cryptodev device private data: 12 bytes
During configuration of a device (via rte_cryptodev_configure ())
1) Allocates memzone for queue_pairs meta data: 8 bytes.
2) Allocates memory required for session mempool: 2048*848= 1736704 bytes
(Note: Size of element: 848 bytes and Number of elements: 2048, Number of queue pairs: 1 and Number of sessions: 2048)
During queue pair setup (via rte_cryptodev_queue_pair_setup ())
1) Allocates memzone for queue pair data structure: 52928 bytes.
2) Allocates memory for ring (to place processed operations on): 52928 bytes
Total memory required per AESNI vdev: 1842708 bytes (1.757MB)
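The total above can be re-derived by adding up the listed allocations, and the result can feed a rough hugepage estimate. The following is a sketch only: sum_bytes(), aesni_vdev_bytes() and hugepages_needed() are hypothetical helpers, the hugepage size is whatever the caller passes in, and the page count is a lower bound that ignores memzone alignment, mempool per-object overhead and any other EAL allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add up a list of allocation sizes in bytes. */
static uint64_t sum_bytes(const uint64_t *sizes, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += sizes[i];
    return total;
}

/* Per-vdev total from the figures listed above (1 queue pair, 2048 sessions). */
static uint64_t aesni_vdev_bytes(void)
{
    const uint64_t sizes[] = {
        128,        /* cryptodev data structure */
        12,         /* device private data */
        8,          /* queue_pairs meta data */
        2048 * 848, /* session mempool: elements * element size */
        52928,      /* queue pair data structure */
        52928,      /* ring for processed operations */
    };
    return sum_bytes(sizes, sizeof(sizes) / sizeof(sizes[0]));
}

/* Hugepages needed for nb_devs devices, rounded up (lower bound only). */
static uint64_t hugepages_needed(uint64_t per_dev_bytes, uint64_t nb_devs,
                                 uint64_t hugepage_bytes)
{
    assert(hugepage_bytes != 0);
    return (per_dev_bytes * nb_devs + hugepage_bytes - 1) / hugepage_bytes;
}
```

For example, hugepages_needed(aesni_vdev_bytes(), 24, 2 * 1024 * 1024) estimates the 2 MB pages needed for the 24 vdevs mentioned in the original question, assuming 2 MB hugepages are in use.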
QAT HW Crypto Device (dpdk 17.02)
During initialization of QAT device (via rte_cryptodev_pci_probe(); QAT devices are discovered during the EAL PCI probe, which is executed at DPDK initialization)
1) Allocates memzone for cryptodev data structure: 128 bytes.
2) Allocates memzone for cryptodev device private data: 80 bytes
During configuration of a device (via rte_cryptodev_configure ())
1) Allocates memzone for queue_pairs meta data: 8 bytes.
2) Allocates memory required for session mempool: 2048*592= 1212416 bytes
(Note: Size of element: 592 bytes and Number of elements: 2048, Number of queue pairs: 1 and Number of sessions: 2048)
During setting up a queue pair (via rte_cryptodev_queue_pair_setup ())
1) Allocates memzone for queue pair data structure: 320 bytes.
2) Allocates memory for qat PMD op cookie pointer: 16384 bytes.
3) Allocates memory for Tx queue: 262144 bytes.
4) Allocates memory for Rx queue: 65536 bytes
Total memory required per QAT device: 1557016 bytes (1.485MB)
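The QAT figure can be re-derived the same way, by adding the eight allocations listed above. In this sketch the op cookie memory is left as a parameter, because the listing uses 16384 bytes while the reply at the top of this thread gives 704 * 2048 bytes for the cookies. qat_dev_bytes() is a hypothetical helper, not a DPDK function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sum of the QAT allocations listed above for DPDK 17.02 with one queue
 * pair, 2048 descriptors and 2048 sessions. cookie_bytes is a parameter
 * because the thread gives two different figures for the op cookies.
 */
static uint64_t qat_dev_bytes(uint64_t cookie_bytes)
{
    assert(cookie_bytes > 0);

    uint64_t total = 0;
    total += 128;                  /* cryptodev data structure */
    total += 80;                   /* device private data */
    total += 8;                    /* queue_pairs meta data */
    total += UINT64_C(2048) * 592; /* session mempool */
    total += 320;                  /* queue pair data structure */
    total += cookie_bytes;         /* op cookies */
    total += 262144;               /* Tx queue */
    total += 65536;                /* Rx queue */
    return total;
}
```

qat_dev_bytes(16384) adds up the listing as written, and qat_dev_bytes(704 * 2048) shows how much the total grows with the larger cookie figure.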
On Tue, May 9, 2017 at 9:22 PM, Trahe, Fiona <fiona.trahe at intel.com> wrote:
> -----Original Message-----
> From: users [mailto:users-bounces at dpdk.org] On Behalf Of Chinmaya Dwibedy
> Sent: Monday, May 8, 2017 2:54 PM
> To: users at dpdk.org
> Subject: Re: [dpdk-users] Memory requirements for crypto devices (QAT and
> AESNI) (using DPDK-17.02)
> Can anyone please respond to this email ? Thank you in advance for your
> suggestion and time.
> On Fri, May 5, 2017 at 6:20 PM, Chinmaya Dwibedy <ckdwibedy at gmail.com> wrote:
> > Hi All,
> > We are using DPDK-17.02. We use crypto via hardware (QAT) and software
> > acceleration (AESNI). There is a one-to-one mapping between crypto device and
> > worker core. What are the memory requirements for the following?
> > 1) Creation of one physical Crypto device.
> > 2) Creation of one AESNI MB virtual Crypto device.
> > Thereafter we configure a device with the default number of queue pairs to
> > set up for the device as shown below.
> > #define CDEV_MP_CACHE_SZ 64
> > rte_cryptodev_info_get(cdev_id, &info);
> > dev_conf.nb_queue_pairs = info.max_nb_queue_pairs;
> > dev_conf.session_mp.nb_objs = info.sym.max_nb_sessions;
> > dev_conf.socket_id = SOCKET_ID_ANY;
> > dev_conf.session_mp.cache_size = CDEV_MP_CACHE_SZ;
> > rte_cryptodev_configure (cdev_id, &dev_conf);
> > How do we calculate the minimum memory required to configure each HW and each
> > SW crypto device? Then we allocate and set up a receive queue pair for a
> > device as follows. As of now we use one queue per device and number of
> > descriptors per queue pair is set to 2k. If we increase the number of
> > descriptors, will it improve the performance in terms of throughput?
The QAT device can serve only a certain number of requests in parallel
which is far smaller than 2k. So increasing number of descriptors
won't speed up throughput. In fact 2k is probably excessive and could
lead to longer latency if the queue is being filled up.
I would suggest trying values of 1k, 512 and 256; if you see no
reduction in throughput you can use a smaller queue and save some memory.
The optimal size partly depends on how bursty your traffic is.
> > #define CDEV_MP_NB_OBJS 2048
> > qp_conf.nb_descriptors = CDEV_MP_NB_OBJS;
> > rte_cryptodev_queue_pair_setup (cdev_id, 0, &qp_conf, dev_conf.socket_id)
[Fiona] Memory for each QAT queue pair (max 2 sym qps per QAT device).
TX queue = qp_conf.nb_descriptors * 128 bytes
RX queue = qp_conf.nb_descriptors * 32 bytes
op cookies (used for sgl meta-data) = qp_conf.nb_descriptors * 264 bytes
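The per-queue-pair formulas above translate directly into a small estimator. This is a sketch using only the multipliers stated in this message (128, 32 and 264 bytes per descriptor); struct qat_qp_mem and qat_qp_estimate() are hypothetical names for illustration and are not part of the DPDK API.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated memory for one QAT queue pair, in bytes. */
struct qat_qp_mem {
    uint64_t tx;      /* TX queue */
    uint64_t rx;      /* RX queue */
    uint64_t cookies; /* op cookies (sgl meta-data) */
    uint64_t total;   /* sum of the three */
};

/* Apply the per-descriptor sizes quoted above to nb_descriptors. */
static struct qat_qp_mem qat_qp_estimate(uint32_t nb_descriptors)
{
    assert(nb_descriptors > 0);

    struct qat_qp_mem m;
    m.tx = (uint64_t)nb_descriptors * 128;
    m.rx = (uint64_t)nb_descriptors * 32;
    m.cookies = (uint64_t)nb_descriptors * 264;
    m.total = m.tx + m.rx + m.cookies;
    return m;
}
```

Comparing qat_qp_estimate(2048).total with the results for 1024, 512 and 256 shows how much memory a smaller queue would save; all three terms scale linearly with nb_descriptors.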
op mempool size is totally up to the user and is not bound to any device or PMD.
Session mempool is per device (though this will change in 17.08)
QAT session struct is 576 bytes long + memory for
bpi_ctx and inst pointers.
The number of sessions in the pool is passed in to rte_cryptodev_configure().
This should be <= max_nb_sessions for that device which can be queried using
> > We create a session for symmetric cryptographic operations per IPsec
> > Security association. What is the memory required to hold session data
> > structure?
> > The intent behind this is to calculate the memory requirements in advance
> > (before EAL initialization) and based upon the available memory, figure out
> > how many crypto devices (note: our application initializes AESNI vdev
> > without using EAL command line option) can be initialized? Say there are 24
> > worker cores and we need 24 crypto AESNI vdevs, but there is not enough
> > hugepage memory to create 24 crypto AESNI vdevs. In that case, we will
> > allocate more hugepages, then call rte_eal_init() and expect it to
> > succeed.
> > Thank you in advance for your suggestion and time.
> > Regards,
> > Chinmaya