[PATCH v7 0/4] net/zxdh: optimize Rx/Tx path performance
Stephen Hemminger
stephen at networkplumber.org
Tue Jun 23 17:54:23 CEST 2026
On Tue, 23 Jun 2026 14:09:04 +0800
Junlong Wang <wang.junlong1 at zte.com.cn> wrote:
> v7:
> - Add a new xmit prepare func for xmit_pkts_simple, which checks the size of
> ZXDH_DL_NET_HDR_SIZE and RTE_PKTMBUF_HEADROOM.
>
> v6:
> - Remove unnecessary error checking code in submit_to_backend_simple() and
> pkt_padding(). Since the max dl_net_hdr_len is always less than
> RTE_PKTMBUF_HEADROOM, rte_pktmbuf_prepend() cannot fail in the
> simple path (single-segment mbufs).
> v5:
> - Reorganize patch series, placing interrupt fix as the first patch
> and fix condition check to properly enable interrupts.
> - Fix zxdh_recv_single_pkts() not compacting rcv_pkts[] on failure,
> which could cause use-after-free and mbuf leak.
> - Fix tx_bunch() and tx1() missing store barrier before setting AVAIL flag,
> preventing data race on weakly-ordered architectures.
> - Fix submit_to_backend_simple() writing descriptors for packets that
> failed pkt_padding(), causing mbuf leak.
> v4:
> - fix some AI review issues.
> - fix queue enable intr bug.
> v3:
> - remove unnecessary NULL check in zxdh_init_queue.
> - Split Ring: Bit[31] is unused and reserved, zxdh_queue_notify(): removing the
> zxdh_pci_with_feature(hw, ZXDH_F_RING_PACKED) check;
> - remove unnecessary double-free in zxdh_recv_single_pkts();
> - used rte_pktmbuf_mtod();
> - remove rxq_get_vq(q) macro, use q->vq and apply it consistently;
> - Refactoring scatter and mtu check logic in zxdh_dev_mtu_set();
> - set txdp->id = avail_idx + i in tx_bunch/tx1.
> - add comment documenting zxdh_xmit_enqueue_append() now sets dxp->cookie = NULL for
> the head slot and stores cookies per descriptor via dep[idx].cookie.
> - add one-line comment noting that tx_bunch() is the simple path and handles single-segment mbufs.
> - remove unnecessary Extra initialization and the uint32_t cast.
> v2:
> - zxdh_rxtx.c, pkt_padding(): modified the return value of pkt_padding();
> - zxdh_rxtx.c, zxdh_recv_single_pkts(): modified so that when zxdh_init_mbuf() fails
> the loop does "continue" and frees the mbufs;
> - zxdh_rxtx.c, refill_desc_unwrap(): Add rte_io_wmb() before writing flags
> in the refill_que_descs();
> - zxdh_queue.h, zxdh_queue_enable_intr(): Remove unnecessary function of zxdh_queue_enable_intr;
> - zxdh_ethdev.c, zxdh_init_queue(): changed the hdr_mz NULL check logic;
> - zxdh_rxtx.c, zxdh_xmit_pkts_simple(), zxdh_recv_single_pkts(): add stats.bytes count;
> - zxdh_rxtx.c, zxdh_init_mbuf():remove rte_pktmbuf_dump(stdout, rxm, 40);
> - zxdh_ethdev.c, zxdh_dev_free_mbufs(): using rte_pktmbuf_free() to free mbufs;
> - Splitting into separate patches, structure reorganization and sw_ring removal,
> RX recv optimize, Tx xmit optimize, Tx;
> v1:
> This patch optimizes the ZXDH PMD's receive and transmit path for better
> performance through several improvements:
> - Add simple TX/RX burst functions (zxdh_xmit_pkts_simple and
> zxdh_recv_single_pkts) for single-segment packet scenarios.
> - Remove RX software ring (sw_ring) to reduce memory allocation and
> copy.
> - Optimize descriptor management with prefetching and simplified
> cleanup.
> - Reorganize structure fields for better cache locality.
> These changes reduce CPU cycles and memory bandwidth consumption,
> resulting in improved packet processing throughput.
>
> Junlong Wang (4):
> net/zxdh: fix queue enable intr issues
> net/zxdh: optimize queue structure to improve performance
> net/zxdh: optimize Rx recv pkts performance
> net/zxdh: optimize Tx xmit pkts performance
>
> drivers/net/zxdh/zxdh_ethdev.c | 83 +++--
> drivers/net/zxdh/zxdh_ethdev_ops.c | 23 +-
> drivers/net/zxdh/zxdh_ethdev_ops.h | 4 +
> drivers/net/zxdh/zxdh_pci.c | 2 +-
> drivers/net/zxdh/zxdh_queue.c | 11 +-
> drivers/net/zxdh/zxdh_queue.h | 122 +++---
> drivers/net/zxdh/zxdh_rxtx.c | 571 ++++++++++++++++++++++-------
> drivers/net/zxdh/zxdh_rxtx.h | 29 +-
> 8 files changed, 584 insertions(+), 261 deletions(-)
>

Better, but AI review still found some issues.

Series review: net/zxdh Rx/Tx optimization (v7)

Patches 1-3 are unchanged from v6 except for the Tx prepare split
below; patch 4 still carries the unguarded in-place prepend. The v6
out-of-bounds write is narrowed but not closed.

The improvement: tx_pkt_prepare is now split, and the simple-path
variant zxdh_xmit_pkts_simple_prepare() rejects a packet whose
headroom is too small (data_off < ZXDH_DL_NET_HDR_SIZE) with a clean
error and an invalid_hdr_len_err counter. For applications that call
rte_eth_tx_prepare() this turns the corruption into a reported error.

[PATCH v7 4/4] net/zxdh: optimize Tx xmit pkts performance

Error: the headroom check lives only in tx_pkt_prepare, which is
optional, so the simple Tx burst can still reach the unchecked prepend
in pkt_padding() and write out of bounds.

rte_eth_tx_burst() does not call rte_eth_tx_prepare(); the application
invokes prepare itself, and is allowed to skip it. When MULTI_SEGS is
disabled the burst is zxdh_xmit_pkts_simple() -> submit_to_backend_simple()
-> pkt_padding(), and pkt_padding() still does:

    hdr = rte_pktmbuf_mtod_offset(cookie, struct zxdh_net_hdr_dl *, -hdr_len);
    rte_memcpy(hdr, net_hdr_dl, hdr_len);
    cookie->data_off -= hdr_len;

with no data_off >= hdr_len guard. An application that disables
MULTI_SEGS, consumes most of the mbuf headroom before Tx (tunnel/MPLS
encap, etc.), and calls tx_burst without tx_prepare will underflow
data_off and scribble in front of buf_addr. That is a supported calling
sequence, so the memory-safety precondition cannot rest on the optional
prepare step.
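
To make the underflow concrete, here is a minimal standalone sketch. It
does not use DPDK: struct mock_mbuf and unguarded_prepend() are
hypothetical stand-ins, and the header length of 24 in the usage note is
an arbitrary illustrative value, not the real ZXDH_DL_NET_HDR_SIZE. The
only property carried over from rte_mbuf is that data_off is a 16-bit
unsigned field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one rte_mbuf field that matters here. */
struct mock_mbuf {
    uint16_t data_off; /* bytes of headroom in front of the packet data */
};

/* Mirrors the unguarded "data_off -= hdr_len" step quoted above. */
static uint16_t unguarded_prepend(struct mock_mbuf *m, uint16_t hdr_len)
{
    assert(m != NULL);
    /* No data_off >= hdr_len check: wraps modulo 65536 when short. */
    m->data_off -= hdr_len;
    return m->data_off;
}
```

With data_off = 8 and hdr_len = 24 the function returns 65520, so the
mbuf now claims its data starts far outside the buffer; in the real
driver the header copy has by then already landed 16 bytes in front of
buf_addr.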

The driver's own packed burst does not depend on prepare for this: in
zxdh_xmit_pkts_packed() the can_push test gates the in-place prepend on

    txm->data_off >= ZXDH_DL_NET_HDR_SIZE

inline, and falls back to zxdh_xmit_enqueue_append() (header copied into
the reserved txr region) otherwise. The simple burst should be equally
self-contained.

Make the simple burst safe on its own: check data_off in the datapath
and stop at the first packet that does not fit, returning the count
already enqueued (the same break-and-return the prepare function uses),
so the caller retains ownership of the rejected packet. The
zxdh_xmit_pkts_simple_prepare() check can stay as an early, friendlier
diagnostic, but it cannot be the only guard.
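
A rough sketch of the shape I have in mind, again standalone and not the
driver's code: mock_mbuf, mock_xmit_simple() and MOCK_DL_HDR_SIZE are
hypothetical names, and the value 24 is only a placeholder for
ZXDH_DL_NET_HDR_SIZE. The header copy and descriptor write are left as a
comment; only the guard and the break-and-return are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value; stands in for ZXDH_DL_NET_HDR_SIZE. */
#define MOCK_DL_HDR_SIZE 24u

/* Hypothetical stand-in for the one rte_mbuf field that matters here. */
struct mock_mbuf {
    uint16_t data_off; /* bytes of headroom in front of the packet data */
};

/*
 * Consume packets until one lacks headroom for the downlink header,
 * then stop. Returns the number of packets accepted; the caller still
 * owns pkts[ret] and everything after it.
 */
static uint16_t mock_xmit_simple(struct mock_mbuf **pkts, uint16_t nb_pkts)
{
    uint16_t nb_tx;

    assert(pkts != NULL);
    for (nb_tx = 0; nb_tx < nb_pkts; nb_tx++) {
        struct mock_mbuf *m = pkts[nb_tx];

        /* Datapath guard: never prepend into headroom that is not there. */
        if (m->data_off < MOCK_DL_HDR_SIZE)
            break;

        /* Cannot wrap: data_off >= MOCK_DL_HDR_SIZE was just checked. */
        m->data_off -= MOCK_DL_HDR_SIZE;
        /* A real driver would copy the header and fill the descriptor here. */
    }
    return nb_tx;
}
```

For a burst of three packets with headroom 128, 8 and 128 this returns 1:
the first packet is accepted, the second is left untouched, and the third
is never looked at, which matches normal tx_burst partial-send semantics.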

Also still missing: the build-time backstop discussed earlier,

    static_assert(RTE_PKTMBUF_HEADROOM >= ZXDH_DL_NET_HDR_SIZE,
                  "RTE_PKTMBUF_HEADROOM too small for zxdh Tx downlink header");

It does not replace the runtime check (per-packet headroom can be short
on a correctly configured build) but it cheaply rejects a build whose
default headroom cannot hold the header.