[PATCH 8/9] ethdev: keep fast-path ops valid after port stop

Maxime Leroy maxime at leroys.fr
Thu Jun 11 20:39:12 CEST 2026


On Thu, 11 Jun 2026 at 18:01, Morten Brørup <mb at smartsharesystems.com>
wrote:

> > From: Maxime Leroy [mailto:maxime.leroys at gmail.com] On Behalf Of Maxime
> > Leroy
> > Sent: Thursday, 11 June 2026 17.49
> >
> > eth_dev_fp_ops_reset() restores a port's fast-path ops on stop/release
> > via a compound literal, so every field it omits is zeroed to NULL. It
> > sets only rx_pkt_burst/tx_pkt_burst (and the rxq/txq data), leaving
> > rx_queue_count, tx_queue_count, rx/tx_descriptor_status, tx_pkt_prepare
> > and the recycle callbacks NULL.
> >
> > In non-debug builds these ops are reached through an unguarded indirect
> > call (the NULL check exists only under RTE_ETHDEV_DEBUG_RX/TX). So a
> > thread calling e.g. rte_eth_rx_queue_count() on a port being stopped
> > dereferences NULL and crashes, while the same race on
> > rte_eth_rx_burst() is harmless because the burst ops are reset to
> > dummies. A poll-mode worker re-checking rx_queue_count before arming
> > the Rx interrupt and sleeping hits exactly this.
> >
> > Reset these ops to the same dummies eth_dev_set_dummy_fops() installs,
> > so a stopped port behaves like a freshly allocated one: every fast-path
> > op is a safe no-op, none is NULL.
> >
> > Fixes: 066f3d9cc21c ("ethdev: remove callback checks from fast path")
> > Cc: stable at dpdk.org
> > Signed-off-by: Maxime Leroy <maxime at leroys.fr>
> > ---
>
> Good catch.
> Acked-by: Morten Brørup <mb at smartsharesystems.com>
>
> Not related to the series, consider sending as separate patch.
>
Thanks for the review and Ack.

Agreed, this is a generic ethdev fix. I kept it in this series because the
NAPI user depends on it.

The current Grout NAPI loop arms RX queue interrupts and then re-checks
rte_eth_rx_queue_count() before blocking, to avoid sleeping when a packet
arrived between the last empty poll and epoll_wait.
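
To make the ordering concrete, here is a minimal sketch of that
arm-then-recheck step. It is not the Grout code: struct sim_rxq and
napi_may_sleep() are hypothetical names, and the "pending" field only
stands in for what rte_eth_rx_queue_count() would report on a real port.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one Rx queue; not a DPDK structure. */
struct sim_rxq {
	uint32_t pending; /* descriptors ready (models rte_eth_rx_queue_count()) */
	bool irq_armed;   /* true once the Rx interrupt is enabled */
};

/*
 * Decide whether the worker may block after an empty poll.
 * Returns true if it is safe to sleep, false if it must poll again.
 */
static bool napi_may_sleep(struct sim_rxq *q)
{
	/* 1. Arm the interrupt first, so later arrivals wake us up. */
	q->irq_armed = true;

	/* 2. Recheck: a packet may have arrived since the last empty poll. */
	if (q->pending > 0) {
		/* 3. Disarm and return to polling instead of sleeping. */
		q->irq_armed = false;
		return false;
	}

	/* 4. Queue is still empty: the caller may block in epoll_wait(). */
	return true;
}
```

The recheck in step 2 is the call that goes through the rx_queue_count
fast-path op, which is why it must never be NULL.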

With the current ethdev reset path, rx_burst is replaced by a dummy
callback on stop/release, but rx_queue_count becomes NULL. So if the port
is stopped concurrently, the NAPI worker dereferences a NULL function
pointer and segfaults on that recheck.
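
The underlying C behaviour can be shown with a reduced model. This is a
sketch only: struct sim_fp_ops and the reset helpers are hypothetical and
much smaller than the real ethdev ops table, and the -ENOTSUP return of
the dummy is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down model of a per-port fast-path ops table. */
typedef uint16_t (*burst_fn)(void *rxq, void **pkts, uint16_t n);
typedef int (*count_fn)(void *rxq);

struct sim_fp_ops {
	burst_fn rx_pkt_burst;
	count_fn rx_queue_count;
};

/* Dummy burst: receives nothing. */
static uint16_t dummy_rx_burst(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;
}

/* Dummy count: reports "not supported" (assumed return value). */
static int dummy_rx_queue_count(void *rxq)
{
	(void)rxq;
	return -ENOTSUP;
}

/* Models the problem: fields omitted from the compound literal become NULL. */
static void reset_partial(struct sim_fp_ops *ops)
{
	*ops = (struct sim_fp_ops){
		.rx_pkt_burst = dummy_rx_burst,
	};
}

/* Models the fix: every op is reset to a safe dummy, none is NULL. */
static void reset_full(struct sim_fp_ops *ops)
{
	*ops = (struct sim_fp_ops){
		.rx_pkt_burst = dummy_rx_burst,
		.rx_queue_count = dummy_rx_queue_count,
	};
}
```

After reset_partial(), an unguarded ops->rx_queue_count(rxq) is a call
through NULL; after reset_full(), the same call returns an error code.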

I can split it out if maintainers prefer, but then the dpaa2 NAPI series
has a real dependency on the standalone ethdev fix.
