[dpdk-users] Mbuf pool/ring size question
mason+dpdk at steelypip.org
Thu Jul 26 22:09:20 CEST 2018
I've got a question about mbuf pool and ring sizes - DPDK 17.02 PMD.
I've got a pipelined application running with RSS on a Cavium CN83XX.
It's 40GE, 4 RSS queues wide with a pipeline 3 stages deep, and the 12
worker cores are isolated with isolcpus so that only DPDK runs on each
of them. There are two SP/SC rte_rings per RSS queue for communication
between the pipeline stages. The rings are 1024 deep, and the mbuf pool
holds 16K-1 mbufs with a per-lcore cache of 512.
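To put the numbers above side by side, here is a rough worst-case
count of how many mbufs can sit outside the common pool at once. It is
only a sketch under assumptions: the 512 RX/TX descriptors per queue
are a guess (the real values aren't stated above), the struct and
function names are made up for illustration, and the 1.5x factor is my
understanding of the mempool cache flush threshold in this DPDK
generation, not something I have verified on this platform.

```c
#include <assert.h>

/* Hypothetical helper types/names; nothing here is a DPDK API. */
struct mbuf_budget {
    unsigned rx_queues, rx_desc;  /* each RX descriptor pins one mbuf   */
    unsigned tx_queues, tx_desc;  /* each TX descriptor may pin one too */
    unsigned rings, ring_size;    /* inter-stage rte_rings              */
    unsigned lcores, cache_size;  /* per-lcore mempool caches           */
};

/* ASSUMPTION: a per-lcore cache can grow to 1.5x its nominal size
 * before it is flushed back to the common pool. */
static unsigned cache_worst_case(unsigned cache_size)
{
    return cache_size + cache_size / 2;
}

/* Upper bound on mbufs held by descriptors, rings and caches. */
static unsigned mbufs_in_flight_worst_case(const struct mbuf_budget *b)
{
    assert(b->ring_size > 0);
    return b->rx_queues * b->rx_desc
         + b->tx_queues * b->tx_desc
         + b->rings * (b->ring_size - 1)  /* a ring of N holds N-1 entries */
         + b->lcores * cache_worst_case(b->cache_size);
}
```

With 4 RX and 4 TX queues of 512 descriptors (assumed), 8 rings of
1024, and 12 lcores with a 512 cache, this comes to 21496, which is
more than the 16383 mbufs in the pool. It is a pessimistic bound: a
per-lcore cache is only filled by that lcore's own alloc/free calls, so
passing a smaller lcore count is fair when some cores never touch the
pool.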
Performance is generally good - 40G in and 40G out with 1M flows of
512 byte packets, EXCEPT for intermittent drops on the order of a few
dozen to a few hundred packets/second. I did some timing measurements
and found that sometimes a packet can take much longer to get through
the pipeline, despite being identical (except for destination address)
and taking an identical(ish) code path - sometimes two to three orders
of magnitude longer.
I tried measuring where the extra time was going, but pretty much
everything I tried perturbed the system, so I wasn't easily able to
get a clear answer. One of my suspicions is the per-lcore mbuf cache
flush/fill, since the rx and tx are being done by different cores. Is
there a more efficient way to manage the mbuf pool in this case than
rte_pktmbuf_pool_create? Some cores don't allocate or free mbufs, so
I'm also curious if I'm losing mbufs to the caches on those cores.
Since I have memory to burn I figured I could absorb any glitches by
increasing the RX/TX descriptor pool, mbuf pool, and ring sizes,
allowing more packets to be buffered during the glitches. This didn't
help, which I guess makes sense if my issue is lock contention on the
mbuf cache, which I can't make larger. Almost all of the DPDK
examples and applications I could find use roughly the same parameters
- 128-512 buffer descriptors, 4-16K mbuf pool, 1K ring sizes, etc. It
seems that there are diminishing returns for increasing much beyond
these values. Why is that?
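For scale, the amount of stall a given buffer depth can absorb at this
traffic rate can be estimated as below. This is a back-of-the-envelope
sketch: it assumes the 512 bytes is the full Ethernet frame including
FCS, adds the standard 20 bytes of preamble and inter-frame gap per
frame, and assumes RSS spreads the load evenly over the 4 queues. The
function names are made up for illustration.

```c
#include <assert.h>

/* Frames per second on a saturated link. Each frame costs an extra
 * 20 bytes on the wire (8 preamble/SFD + 12 inter-frame gap). */
static double line_rate_pps(double link_bps, unsigned frame_bytes)
{
    assert(frame_bytes > 0);
    return link_bps / ((frame_bytes + 20) * 8.0);
}

/* Microseconds of traffic that `slots` buffered packets represent
 * on a queue receiving `queue_pps` packets per second. */
static double absorb_us(unsigned slots, double queue_pps)
{
    assert(queue_pps > 0.0);
    return slots / queue_pps * 1e6;
}
```

For example, line_rate_pps(40e9, 512) is roughly 9.4 Mpps, or about
2.35 Mpps per RSS queue, and absorb_us(1023, 2.35e6) is a little over
400 microseconds. Under these assumptions a full 1024-deep ring covers
well under a millisecond of stall, so doubling or quadrupling the depth
only buys a few hundred extra microseconds each time.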