[PATCH v3 0/7] fix multi-process VF hotplug

Stephen Hemminger stephen at networkplumber.org
Wed Feb 25 23:36:16 CET 2026


This patch is generally in good shape. Using the AGENTS.md and AI review
did see some things that should be addressed. As always take look
at AI review with a skeptical mind; some of this it being overly
picky...

 A few findings from review:

Patch 1/7 (netvsc race conditions):

    netvsc_hotplug_retry: hn_vf_add() return value ignored. The newly
added call at line 675 doesn't check the return. A failed VF attach
during hotplug retry would be completely silent — at minimum log the
error. hn_vf_add_unlocked: goto detach after -EEXIST tears down the
original attachment. When hn_vf_attach() returns -EEXIST (already
attached) and a subsequent configure/start step fails, hn_vf_detach()
unregisters the callback and clears vf_attached/vf_port from the
original successful attachment. This leaves the VF port with an owner
but netvsc thinking it's detached. Consider skipping hn_vf_detach()
when the attach was an -EEXIST rather than a fresh attach. Version
notes (v2/v3 changelog) are above the --- line and will be included in
the permanent commit message. Move them below ---.

Patch 2/7 (multi-process VF removal):

    eth_hn_remove: MP infrastructure torn down before device cleanup.
The counter decrement and netvsc_uninit_once() (which frees the shared
memzone and unregisters MP handlers) happen before the device is
closed. If a pending hn_remove_delayed alarm fires after the memzone is
freed, netvsc_mp_req_vf() dereferences netvsc_shared_data (freed
memory) when checking secondary_cnt. Suggest moving the counter
decrement and uninit to after device close/release. Secondary init
failure leaves netvsc_shared_data dangling. If
netvsc_mp_init_secondary() fails, netvsc_shared_data still points to
the memzone but init_done is false. Consider clearing it on failure.
Same version-notes-above---- issue as patch 1.

Patch 3/7 (mana PD leak):

    PD deallocation order. ib_parent_pd is freed before ib_pd. If
there's a parent/child dependency (as the naming suggests),
deallocating the parent first will fail with EBUSY. Consider reversing
the order: dealloc ib_pd first, then ib_parent_pd. Same version-notes
issue.


Patches 5-7/7 (fp_ops in secondary):

These look correct and well-placed (before the existing rte_mb()).

Minor note: the plain (non-atomic) pointer stores to rte_eth_fp_ops
fields could theoretically tear on ARM if a polling thread reads
simultaneously. This is consistent with the existing burst function
pointer pattern so not a regression, but worth keeping in mind.

Overall the series is in good shape. Items 2 and 4 are the most
important to address.


More information about the stable mailing list