[PATCH 9/9] net/dpaa2: drop the fake software VLAN strip offload
Maxime Leroy
maxime at leroys.fr
Fri Jun 12 07:42:59 CEST 2026
On Thu, 11 Jun 2026 at 19:31, Stephen Hemminger <stephen at networkplumber.org>
wrote:
> On Thu, 11 Jun 2026 17:49:24 +0200
> Maxime Leroy <maxime at leroys.fr> wrote:
>
> > It saves a forwarding application nothing: the datapath reads the L2
> > header anyway to classify or strip. The offload does not remove that
> > read, it relocates it into the driver Rx burst, where it is far more
> > expensive.
> >
> > The cost is a matter of timing. rte_vlan_strip() reaches the L2 header
> > through rte_pktmbuf_mtod(), which dereferences mbuf->buf_addr. On a
> > freshly recycled buffer that mbuf cacheline is cold. eth_fd_to_mbuf()
> > has just written other fields of it (data_off, ol_flags), but buf_addr
> > is a persistent field it does not rewrite. A write does not stall: it
> > posts to the store buffer while the line fills in the background, and
> > the rewritten fields are forwarded straight from there. buf_addr has
> > nothing to forward, so it must be read from the line, whose fill is
> > still in flight, and the read stalls. The ethertype read that follows,
> > on the cold payload line, stalls again. Read later by the application,
> > when the fill has completed, the same read hits. The offload just
> > performs it at the worst possible moment.
> >
> > Measured on a single-core port-to-port forwarding test over two 10G
> > ports (one core at 2 GHz, 64-byte untagged frames):
> >
> > - throughput 4.22 -> 5.00 Mpps (+18 percent)
> > - IPC 0.93 -> 1.25: the cost was memory stall, not compute
> > - L3/DRAM-bound L2 refills 319M -> 200M over 10s (-37 percent)
> >
> > perf confirms it: with the offload, the buf_addr load (the cold mbuf
> > field) and the payload load account for about 84 percent of the Rx
> > burst's L2 refills; removing it, those vanish and only the inherent DQRR
> > dequeue misses remain.
> >
> > Stop advertising VLAN_STRIP and remove the rte_vlan_strip() calls from
> > every Rx path. This is a behavioural change: the tag is left in the
> > frame, so an application must strip it itself, on the L2 header it
> > already reads.
> >
> > Signed-off-by: Maxime Leroy <maxime at leroys.fr>
> > ---
>
> In general I agree, but you overstate the impact. Any real application
> is going to look at the mbuf anyway. Relying on testpmd numbers is BS.
>
> The NBL driver does the same thing.
> So does PCAP but it has no choice, and is slow anyway.
> Virtio/vhost does as well.

This was not measured with testpmd, but with Grout in I/O forwarding mode.
The comparison is exactly between Grout's software fallback and the
advertised offload path. Without VLAN_STRIP, Grout's rx_process() reads the
Ethernet header and strips the VLAN tag itself if needed. With VLAN_STRIP
enabled, Grout uses rx_offload_process(), which only consumes
RTE_MBUF_F_RX_VLAN_STRIPPED/vlan_tci and does not inspect the Ethernet
header for VLAN stripping.

For dpaa2, however, VLAN_STRIP is not done by the device. The PMD
implements the advertised offload by calling rte_vlan_strip() in the Rx
burst path. So enabling the "offload" just moves the same software work
from Grout into the driver.

The cost is timing. rte_vlan_strip() calls rte_pktmbuf_mtod(), which needs
mbuf->buf_addr. That value is persistent mbuf metadata; it is not produced
by the FD-to-mbuf conversion. eth_fd_to_mbuf() has just written other mbuf
fields such as data_off and ol_flags; those writes can be posted or
forwarded, but they do not provide buf_addr. If the mbuf cacheline is cold,
the buf_addr load has to wait for that line to be fetched before the driver
can reach the Ethernet header.
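
To make the dependency concrete, here is a minimal sketch of the load
chain. It is not the DPDK code: struct sketch_mbuf and the sketch_*
helpers are hypothetical, reduced stand-ins, and only the
buf_addr + data_off arithmetic is meant to match what rte_pktmbuf_mtod()
does.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct rte_mbuf. */
struct sketch_mbuf {
    void     *buf_addr;  /* persistent: not rewritten per received frame */
    uint16_t  data_off;  /* rewritten by the Rx path for every frame */
    uint64_t  ol_flags;  /* rewritten by the Rx path for every frame */
};

/* Same arithmetic as rte_pktmbuf_mtod(): frame start = buf_addr + data_off. */
static inline uint8_t *sketch_mtod(const struct sketch_mbuf *m)
{
    return (uint8_t *)m->buf_addr + m->data_off;
}

/* Read the outer ethertype (bytes 12..13 of the frame) in host order. */
static inline uint16_t sketch_outer_ethertype(const struct sketch_mbuf *m)
{
    /* Load 1: buf_addr must come from the mbuf cacheline itself. */
    const uint8_t *l2 = sketch_mtod(m);

    /* Load 2: the payload line; its address depends on load 1. */
    return (uint16_t)((l2[12] << 8) | l2[13]);
}
```

The second load cannot even be issued until the first one returns, so
two cold lines cost two serialized misses rather than two overlapped
ones.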

Grout does the same L2 read later in rx_process(), where it is already
processing L2. So the fake PMD offload performs the same software fallback,
but injects an extra mbuf-metadata dependency at a worse point in the Rx
burst path.
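
For reference, the software strip itself is tiny once the L2 header is
already in hand. The sketch below is an illustration on a raw byte
buffer, not Grout's rx_process() and not rte_vlan_strip(): the
sketch_vlan_strip() name, its signature and the SKETCH_* constants are
hypothetical. It uses the usual approach of sliding the two MAC
addresses forward over the 4-byte tag.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ETHER_TYPE_VLAN 0x8100 /* 802.1Q TPID */
#define SKETCH_ETH_ADDRS_LEN   12     /* destination MAC + source MAC */
#define SKETCH_VLAN_TAG_LEN    4      /* TPID + TCI */

/*
 * Strip one 802.1Q tag in place (hypothetical helper).
 * *frame points at the destination MAC, *len is the frame length.
 * Returns 1 and updates *frame, *len and *tci (host order) if a tag was
 * removed; returns 0 and changes nothing if the frame is untagged or
 * too short to hold a tagged Ethernet header.
 */
static int sketch_vlan_strip(uint8_t **frame, size_t *len, uint16_t *tci)
{
    uint8_t *p = *frame;

    if (*len < SKETCH_ETH_ADDRS_LEN + SKETCH_VLAN_TAG_LEN + 2)
        return 0;

    /* The ethertype read the application performs anyway. */
    uint16_t tpid = (uint16_t)((p[12] << 8) | p[13]);
    if (tpid != SKETCH_ETHER_TYPE_VLAN)
        return 0;

    *tci = (uint16_t)((p[14] << 8) | p[15]);

    /* Slide both MAC addresses forward over the tag. */
    memmove(p + SKETCH_VLAN_TAG_LEN, p, SKETCH_ETH_ADDRS_LEN);
    *frame = p + SKETCH_VLAN_TAG_LEN;
    *len -= SKETCH_VLAN_TAG_LEN;
    return 1;
}
```

The point is that the only memory touched is the L2 header the
application is classifying at that moment, so doing it there adds no
new cache miss.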