[dpdk-dev] [PATCH] net/af_xdp: enable support for unaligned umem chunks
Loftus, Ciara
ciara.loftus at intel.com
Mon Sep 2 10:48:11 CEST 2019
> Hi Ciara,
>
> I haven't tried this patch but have a question.
>
> On Thu, Aug 29, 2019 at 8:04 AM Ciara Loftus <ciara.loftus at intel.com> wrote:
> >
> > This patch enables the unaligned chunks feature for AF_XDP which
> > allows chunks to be placed at arbitrary places in the umem, as opposed
> > to them being required to be aligned to 2k. This allows DPDK
> > application mempools to be mapped directly into the umem, in turn
> > enabling zero copy transfer between the umem and the PMD.
> >
> > This patch replaces the zero copy via external mbuf mechanism
> > introduced in commit e9ff8bb71943 ("net/af_xdp: enable zero copy by
> external mbuf").
> > The pmd_zero_copy vdev argument is also removed: the PMD now
> > auto-detects the presence of the unaligned chunks feature, enables it
> > when available and otherwise falls back to copy mode.
> >
> > When enabled, this feature significantly improves single-core
> > performance of the PMD.
>
> Why does using the unaligned chunk feature improve performance?
> The existing external mbuf path already gives zero copy between the umem
> and the PMD, and your patch does the same thing. So does the improvement
> come from somewhere else?
Hi William,
Good question.
The external mbuf approach is indeed zero copy, but that path carries extra complexity in the management of the buf_ring.
For example, on the fill/rx path the ext mbuf solution must dequeue an addr from the buf_ring and add it to the fill queue, allocate an mbuf to serve as the external mbuf, get a pointer to the data at that addr and attach it to the external mbuf. With the new solution we allocate an mbuf from the mempool, derive the addr from the mbuf itself and add it to the fill queue; on rx we can then simply convert the pointer to the data at that addr back into an mbuf and return it to the user.
On tx/complete, instead of dequeuing from the buf_ring to get a valid addr, we can again just derive it from the mbuf itself. A rough sketch of this addr/mbuf conversion is below.
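To make the derivation concrete, here is a minimal sketch of the addr <-> mbuf conversion idea. It is illustrative only, not the PMD's actual code: it assumes the mempool backing the mbufs is registered as the umem starting at umem_base, that the mbufs carry no private area, and that data_off equals the standard headroom. The helper names are made up for the example.

/* Minimal sketch (illustrative only, not the PMD's exact code).
 * Assumes: mempool memory registered as the umem starting at umem_base,
 * mbuf priv_size == 0, data_off == RTE_PKTMBUF_HEADROOM. */
#include <stdint.h>
#include <rte_mbuf.h>

static inline uint64_t
mbuf_to_umem_addr(const struct rte_mbuf *mbuf, const void *umem_base)
{
	/* fill/tx: the umem addr is just the offset of the packet data
	 * within the umem area. */
	return (uint64_t)((const char *)mbuf->buf_addr + mbuf->data_off -
			  (const char *)umem_base);
}

static inline struct rte_mbuf *
umem_addr_to_mbuf(uint64_t addr, void *umem_base)
{
	/* rx/complete: the mbuf header sits just before its data buffer
	 * inside the mempool object, so step back from the data pointer. */
	char *pkt = (char *)umem_base + addr;

	return (struct rte_mbuf *)(pkt - RTE_PKTMBUF_HEADROOM -
				   sizeof(struct rte_mbuf));
}

Because both directions are plain pointer arithmetic, there is no per-packet buf_ring enqueue/dequeue and no external mbuf attach.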
I've performed some testing to compare the old vs new zc. For the case where the PMD and IRQs are pinned to separate cores the difference is ~-5%, but for the single-core case where the PMD and IRQs are pinned to the same core (with the need_wakeup feature enabled), or when multiple PMDs forward to one another, the improvement is significant. Please see below:
ports  queues/port  pinning  Δ vs old zc
1      1            0        -4.74%
1      1            1        17.99%
2      1            0        -5.62%
2      1            1        71.77%
1      2            0        114.24%
1      2            1        134.88%
FYI the series has now been merged into the bpf-next tree:
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=bdb15a29cc28f8155e20f7fb58b60ffc452f2d1b
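For reference, using the feature from userspace boils down to setting the unaligned chunk flag when registering the umem. Below is a rough sketch, not the PMD's actual probe/create code, assuming a libbpf/kernel pair that exposes the flags field in struct xsk_umem_config and the XDP_UMEM_UNALIGNED_CHUNK_FLAG bit; the function name is made up for the example.

/* Rough sketch, not the PMD's actual code. Returns 0 on success,
 * negative errno on failure; on failure a real implementation would
 * fall back to aligned (copy) mode. */
#include <stdint.h>
#include <bpf/xsk.h>
#include <linux/if_xdp.h>

static int
create_unaligned_umem(void *umem_area, uint64_t size, uint32_t frame_size,
                      struct xsk_umem **umem, struct xsk_ring_prod *fq,
                      struct xsk_ring_cons *cq)
{
        struct xsk_umem_config cfg = {
                .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
                .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
                .frame_size = frame_size,
                .frame_headroom = 0,
                .flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG,
        };

        /* Register umem_area (e.g. the mempool's memory) as the umem,
         * with chunks allowed at arbitrary offsets. */
        return xsk_umem__create(umem, umem_area, size, fq, cq, &cfg);
}

If the kernel does not support the flag, the create call fails and the driver can drop back to the copy path, which is the auto-detect/fall-back behaviour described in the commit message.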
Thanks,
Ciara
>
> Thank you
> William
>
> >
> > Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
> > Signed-off-by: Kevin Laatz <kevin.laatz at intel.com>
> > ---
> >  doc/guides/nics/af_xdp.rst             |   1 -
> >  doc/guides/rel_notes/release_19_11.rst |   9 +
> >  drivers/net/af_xdp/rte_eth_af_xdp.c    | 304 ++++++++++++++++++-------
> >  3 files changed, 231 insertions(+), 83 deletions(-)
> >
> <snip>