Unexpected behavior when using mbuf pool with external buffers
Michał Niciejewski
michal.niciejewski at codilime.com
Tue Jan 18 14:41:18 CET 2022
Hi,
Based on the materials you provided, I found that the IOTLB is the
bottleneck, not the TLB. I am DMA-mapping the memory, so if I
understand correctly, only the IOMMU is involved here. Below are some
outputs from the pcm tool:
10mpps, first run
IOTLB Hit - 25 M
IOTLB Miss - 10 M
15mpps
IOTLB Hit - 28 M
IOTLB Miss - 20 M
10mpps, second run
IOTLB Hit - 23 M
IOTLB Miss - 18 M
I also tested the same scenario on another, more powerful server, and
the results differ greatly:
10mpps, first run
IOTLB Hit - 36 M
IOTLB Miss - 644 K
25mpps (here I had to send more packets before drops appeared)
IOTLB Hit - 71 M
IOTLB Miss - 3860 K
10mpps, second run
IOTLB Hit - 36 M
IOTLB Miss - 1047 K
So the problems with mempool fragmentation are visible there too, but they
are not as painful as on the first server. It looks like the IOMMU on the
first server performs much worse than the one on the second. I disabled the
IOMMU and used physical addresses to create an external buffer mempool
(https://gist.github.com/tropuq/55c334bf3a2ab86b89a0b59e42b8af08), and
that solved the performance issues.
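
To make the comparison between the two servers easier to read, the pcm counts above can be turned into miss ratios. This is only a small sketch: the numbers are copied from the outputs in this mail, while the labels "server 1" and "server 2" and the helper names are made up for illustration.

```python
# IOTLB hit/miss counts in millions, copied from the pcm outputs above.
# "server 1" / "server 2" are illustrative labels, not names from pcm.
SAMPLES = {
    "server 1": [
        ("10 Mpps, first run", 25.0, 10.0),
        ("15 Mpps", 28.0, 20.0),
        ("10 Mpps, second run", 23.0, 18.0),
    ],
    "server 2": [
        ("10 Mpps, first run", 36.0, 0.644),
        ("25 Mpps", 71.0, 3.860),
        ("10 Mpps, second run", 36.0, 1.047),
    ],
}


def miss_ratio(hits, misses):
    """Fraction of IOTLB lookups that missed."""
    total = hits + misses
    return misses / total if total else 0.0


def summarize(samples):
    """Map each server to a list of (run label, miss ratio) pairs."""
    return {
        server: [(label, miss_ratio(h, m)) for label, h, m in runs]
        for server, runs in samples.items()
    }


if __name__ == "__main__":
    for server, rows in summarize(SAMPLES).items():
        for label, ratio in rows:
            print(f"{server:9s} {label:20s} miss ratio {ratio:6.1%}")
```

With these inputs the first server misses on roughly 29%, 42% and 44% of lookups, while the second stays at roughly 2%, 5% and 3%.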
I have some questions about these results:
1. Is there something wrong with IOMMU in the first server - could it
be that it's missing some additional configuration?
2. Is it normal to see such big differences?
3. Is there any way to find some info about IOMMU, like the size of IOTLB, etc.?
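
For question 3, one possible starting point on Linux with an Intel VT-d IOMMU is the capability register that the kernel exposes under /sys/class/iommu/dmar*/intel-iommu/cap. The sketch below decodes a few fields from it. It assumes an Intel IOMMU and the bit positions from the VT-d specification (SAGAW at bits 12:8, MGAW at bits 21:16, SLLPS at bits 37:34); the function and key names are made up for illustration.

```python
import glob


def decode_cap(cap):
    """Decode a few fields of the VT-d capability register (assumed layout)."""
    sllps = (cap >> 34) & 0xF  # second-level large page support bits
    return {
        "mgaw_bits": ((cap >> 16) & 0x3F) + 1,  # max guest address width
        "sagaw_mask": (cap >> 8) & 0x1F,        # supported page-table depths
        "large_2m": bool(sllps & 0x1),          # 2 MiB IOMMU pages supported
        "large_1g": bool(sllps & 0x2),          # 1 GiB IOMMU pages supported
    }


def read_caps(pattern="/sys/class/iommu/dmar*/intel-iommu/cap"):
    """Return {sysfs path: decoded fields} for every IOMMU unit found."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                # The kernel prints the register as hex without a 0x prefix.
                result[path] = decode_cap(int(f.read().strip(), 16))
        except (OSError, ValueError):
            continue  # unreadable or unexpected content: skip this unit
    return result


if __name__ == "__main__":
    for path, fields in read_caps().items():
        print(path, fields)
```

As far as I can tell, the IOTLB size itself is not reported in this register, so this only helps with comparing things like large-page support between the two servers.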
Thanks,
Michał Niciejewski
On Wed, Dec 22, 2021 at 5:30 PM Michał Niciejewski
<michal.niciejewski at codilime.com> wrote:
>
> Thank you for the reply,
>
> On Wed, Dec 22, 2021 at 11:24 AM Van Haaren, Harry
> <harry.van.haaren at intel.com> wrote:
> > I'll "top post" on this reply as the content is in HTML format below. In future, please try to send plain-text emails to DPDK mailing lists.
>
> I hope it's better now.
>
> > Estimating and talking is never conclusive; let's measure using the Linux "perf" tool. Run this command 3x, just like you posted the drop stats below.
> >
> > I expect to see lower dTLB-load-misses on the first run (no drops, 10 mpps), and that the dTLB misses are higher for 15 mpps *and* for 10 mpps again afterwards.
> >
> > perf stat -e cycles,dTLB-load-misses -C <datapath_lcore_here> -- sleep 1
>
> extbuf, aligned_alloc, 10mpps, first run
> Performance counter stats for 'CPU(s) 0':
> 2404553948 cycles
> 461 dTLB-load-misses
> 1.001938861 seconds time elapsed
>
> extbuf, aligned_alloc, 15mpps
> Performance counter stats for 'CPU(s) 0':
> 2404518710 cycles
> 466 dTLB-load-misses
> 1.001920171 seconds time elapsed
>
> extbuf, aligned_alloc, 10mpps, second run
> Performance counter stats for 'CPU(s) 0':
> 2402586106 cycles
> 449 dTLB-load-misses
> 1.001114692 seconds time elapsed
>
> I also checked what happens when there is no traffic at all and the
> results are similar:
>
> Performance counter stats for 'CPU(s) 0':
> 2949935339 cycles
> 465 dTLB-load-misses
> 1.002236168 seconds time elapsed
>
> Also, I checked how the application behaves when adding the --no-huge
> option and using a normal mbuf pool. The results are very different
> from those with the aligned_alloc + extbuf mbuf pool:
>
> 10mpps, --no-huge
> Performance counter stats for 'CPU(s) 0':
> 2402616160 cycles
> 17980033 dTLB-load-misses
> 1.001125954 seconds time elapsed
>
> Application logs:
> Queue: 0
> Number of all rx burst calls: 5757205
> Number of non-zero rx burst calls: 1073081
> Avg pkt nb received per rx burst: 1.7364
> All received pkts: 9996804
> All sent pkts: 8074460
> All dropped pkts: 1922344
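
The counters in the application log above can be cross-checked with a few lines of arithmetic. This is a rough sketch that assumes the log belongs to the --no-huge run directly above it and that the perf window (about 1 s) covers roughly the same packets as the log; the variable names are made up for illustration.

```python
# Values copied from the application log and the --no-huge perf output above.
rx_calls = 5757205          # all rx burst calls
nonzero_rx_calls = 1073081  # rx burst calls that returned packets
rx_pkts = 9996804           # all received packets
tx_pkts = 8074460           # all sent packets
dropped_pkts = 1922344      # all dropped packets
dtlb_load_misses = 17980033 # from perf stat, --no-huge run


def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


drop_fraction = ratio(dropped_pkts, rx_pkts)
avg_per_call = ratio(rx_pkts, rx_calls)
avg_per_nonzero_call = ratio(rx_pkts, nonzero_rx_calls)
# Assumption: perf and the application log cover about the same packets.
dtlb_misses_per_pkt = ratio(dtlb_load_misses, rx_pkts)

if __name__ == "__main__":
    print(f"sent + dropped == received: {tx_pkts + dropped_pkts == rx_pkts}")
    print(f"drop fraction:              {drop_fraction:.2%}")
    print(f"avg pkts per rx burst call: {avg_per_call:.4f}")
    print(f"avg pkts per non-zero call: {avg_per_nonzero_call:.2f}")
    print(f"dTLB load misses per pkt:   {dtlb_misses_per_pkt:.2f}")
```

The logged average of 1.7364 is taken over all rx burst calls, including the empty ones. Under the assumptions above, about 19% of the received packets were dropped and each packet cost a bit under two dTLB load misses.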
--
Michał Niciejewski
Junior Software Engineer
michal.niciejewski at codilime.com