[RFC] Define well known packet burst sizes
Morten Brørup
mb at smartsharesystems.com
Sat Oct 12 17:07:03 CEST 2024
> From: Pavan Nikhilesh Bhagavatula [mailto:pbhagavatula at marvell.com]
> Sent: Saturday, 12 October 2024 15.55
>
> > We should define some "well known" packet burst sizes in rte_config.h.
> >
>
> A while back we had the same idea, except that it should be platform specific.
> E.g., for CNXK the optimal burst size across workloads is 64.
>
> Instead of rte_config.h maybe we should have it as a meson option in
> meson_options.txt with default as 32.
>
> That way platforms can also modify it in config/<arch>/meson.build
Agreed. The default burst size should be platform specific for optimal performance. The "generic" platform value can be 32.
Applications with different needs can use a meson option to override the platform specific default.
The other two suggested "well known" (i.e. intended to be commonly used) burst sizes can reside in rte_config.h.
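
A minimal sketch of how a header could consume such a value, assuming the build system hands it over as a preprocessor define (e.g. -DRTE_PKT_BURST_DEFAULT=64) and the header only supplies the generic fallback. The helper burst_count() is hypothetical and only there to show the constant in use:

```c
#include <stdint.h>

/*
 * Assumption: a platform or application specific value, if any, has
 * already been defined by the build system before this point.
 * Otherwise fall back to the "generic" platform value of 32.
 */
#ifndef RTE_PKT_BURST_DEFAULT
#define RTE_PKT_BURST_DEFAULT 32
#endif

/* Hypothetical helper: number of default sized bursts needed for n packets. */
static inline uint32_t
burst_count(uint32_t n)
{
	/* Round up, so a partial last burst is counted as well. */
	return (n + RTE_PKT_BURST_DEFAULT - 1) / RTE_PKT_BURST_DEFAULT;
}
```

With this arrangement, applications and examples only ever reference RTE_PKT_BURST_DEFAULT and pick up whatever the platform or the user selected at build time.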
>
> > Especially the default packet burst size is interesting;
> > if known at compile time, various drivers and libraries can optimize for it (i.e.
> > special handling for nb_pkts == RTE_PKT_BURST_DEFAULT).
> >
> > It should also be used in DPDK examples and apps, instead of defining
> > MAX_PKT_BURST in each and every one.
> >
> >
> > Specifically:
> >
> > /**
> > * Default packet burst size.
> > *
> > * Also intended for optimizing packet processing (e.g. by loop unrolling).
> > */
> > #define RTE_PKT_BURST_DEFAULT 32
> >
> > /**
> > * Largest packet burst size guaranteed to be supported throughout DPDK.
> > *
> > * Also intended for sizing large temporary arrays of mbufs, e.g. in
> > rte_pktmbuf_free_bulk().
> > */
> > #define RTE_PKT_BURST_MAX 512
> > #define RTE_MEMPOOL_CACHE_MAX_SIZE RTE_PKT_BURST_MAX
> >
> > /**
> > * Smallest packet burst size recommended for latency sensitive applications,
> > * when throughput still matters.
> > *
> > * Also intended for sizing small staging arrays of mbufs, e.g. in drivers.
> > *
> > * Note: Corresponds to one CPU cache line of object pointers.
> > * - 8 on most 64 bit architectures, 16 on POWER architecture (ppc_64).
> > * - 16 on all 32 bit architectures.
> > */
> > #define RTE_PKT_BURST_SMALL (RTE_CACHE_LINE_SIZE / sizeof(void *))
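
To illustrate the "special handling for nb_pkts == RTE_PKT_BURST_DEFAULT" idea, here is a rough sketch of what a driver or library could do. It is not taken from any existing DPDK code: sum_lengths() is a hypothetical stand-in for per-packet work, and the two RTE_ values are defined locally (assuming a 64 byte cache line and the generic default of 32) so the example compiles without DPDK headers:

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the DPDK definitions (assumptions, see above). */
#define RTE_CACHE_LINE_SIZE 64
#define RTE_PKT_BURST_DEFAULT 32
#define RTE_PKT_BURST_SMALL (RTE_CACHE_LINE_SIZE / sizeof(void *))

/* Hypothetical per-packet work: sum the packet lengths of a burst. */
static inline uint32_t
sum_lengths(const uint16_t *len, uint16_t nb_pkts)
{
	uint32_t total = 0;
	uint16_t i;

	if (nb_pkts == RTE_PKT_BURST_DEFAULT) {
		/*
		 * Fast path: the trip count is a compile time constant,
		 * so the compiler is free to unroll or vectorize the loop.
		 */
		for (i = 0; i < RTE_PKT_BURST_DEFAULT; i++)
			total += len[i];
		return total;
	}

	/* Generic path for any other burst size. */
	for (i = 0; i < nb_pkts; i++)
		total += len[i];
	return total;
}
```

Both paths compute the same result. The only difference is that the first loop has a bound known at compile time, which is what makes the default burst size worth exposing as a constant.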