dpdk Tx falling short
Stephen Hemminger
stephen at networkplumber.org
Sat Jul 5 21:08:34 CEST 2025
On Sat, 5 Jul 2025 17:36:08 +0000
"Lombardo, Ed" <Ed.Lombardo at netscout.com> wrote:
> Hi Stephen,
> I saw your response to more mempools and cache behavior.
>
> I have a goal to support 2x100G next, and if I can't get 10G with DPDK then something is seriously wrong.
>
> Should I build the dpdk static libraries with LTO?
>
> Thanks,
> Ed
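Are you doing anything in the fast path that is an obvious cache miss?
At 10 Gbit/sec, a minimum-size frame takes 84 bytes on the wire (64-byte frame plus 20 bytes of preamble and inter-frame gap), so a packet arrives every 67.2 ns.
CPUs haven't gotten that much faster; on a 3 GHz CPU that is about 201 cycles.
A single cache miss is about 32 ns, so two cache misses mean the per-packet budget is gone.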
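For anyone who wants to redo the arithmetic for another link speed (for example the 100G target mentioned above), here is a minimal sketch. The helper names frame_time_ps and budget_cycles are made up for illustration and are not DPDK APIs. It assumes 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, inter-frame gap) and sticks to integer math in picoseconds.

```c
#include <stdint.h>

/* Assumed per-frame overhead on the wire:
 * 7 B preamble + 1 B start-of-frame delimiter + 12 B inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20u
#define PS_PER_SEC 1000000000000ull

/* Hypothetical helper: time one frame occupies the wire, in picoseconds.
 * frame_bytes includes the Ethernet header and FCS (64 for a minimum frame). */
static uint64_t frame_time_ps(uint32_t frame_bytes, uint64_t link_bps)
{
    uint64_t wire_bits = (uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8u;

    return wire_bits * PS_PER_SEC / link_bps;
}

/* Hypothetical helper: whole CPU cycles available in time_ps at cpu_hz. */
static uint64_t budget_cycles(uint64_t time_ps, uint64_t cpu_hz)
{
    return time_ps * cpu_hz / PS_PER_SEC;
}
```

frame_time_ps(64, 10000000000ull) gives 67200 ps, and budget_cycles(67200, 3000000000ull) gives 201 cycles, matching the numbers above. The same frame at 100 Gbit/sec gets 6720 ps, or about 20 cycles at 3 GHz.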
Obvious cache misses:
- passing packets to worker with ring
- using spinlocks (cost 16ns)
- fetching TSC
- syscalls?
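To see how quickly those items eat the budget, here is a rough tally using only the ballpark numbers from this mail (67.2 ns per packet, 32 ns per cache miss, 16 ns per spinlock). The function name remaining_ps is hypothetical, and the costs are estimates, not measurements from any particular machine.

```c
#include <stdint.h>

/* Ballpark figures from this thread, in picoseconds. */
#define BUDGET_PS     67200 /* 64-byte frames at 10 Gbit/sec */
#define CACHE_MISS_PS 32000 /* one cache miss */
#define SPINLOCK_PS   16000 /* one spinlock acquisition */

/* Hypothetical helper: per-packet budget left after the given number of
 * cache misses and spinlock acquisitions. A negative result means the
 * core cannot keep up with line rate. */
static int64_t remaining_ps(unsigned cache_misses, unsigned spinlocks)
{
    return (int64_t)BUDGET_PS
         - (int64_t)cache_misses * CACHE_MISS_PS
         - (int64_t)spinlocks * SPINLOCK_PS;
}
```

remaining_ps(2, 0) leaves 3200 ps (3.2 ns) for all the real work, and remaining_ps(2, 1) is -12800, already over budget before the packet has been touched.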
Also, never ever use floating point.
Kernel related and older but worth looking at:
https://people.netfilter.org/hawk/presentations/LCA2015/net_stack_challenges_100G_LCA2015.pdf