[EXTERNAL] Re: [PATCH v3 0/7] fix multi-process VF hotplug
Long Li
longli at microsoft.com
Thu Feb 26 02:18:46 CET 2026
> This patch is generally in good shape. Using the AGENTS.md and AI review did see
> some things that should be addressed. As always take look at AI review with a
> skeptical mind; some of this it being overly picky...
>
> A few findings from review:
>
> Patch 1/7 (netvsc race conditions):
>
> netvsc_hotplug_retry: hn_vf_add() return value ignored. The newly added call
> at line 675 doesn't check the return. A failed VF attach during hotplug retry would
> be completely silent — at minimum log the error. hn_vf_add_unlocked: goto
> detach after -EEXIST tears down the original attachment. When hn_vf_attach()
> returns -EEXIST (already
> attached) and a subsequent configure/start step fails, hn_vf_detach() unregisters
> the callback and clears vf_attached/vf_port from the original successful
> attachment. This leaves the VF port with an owner but netvsc thinking it's
> detached. Consider skipping hn_vf_detach() when the attach was an -EEXIST
> rather than a fresh attach. Version notes (v2/v3 changelog) are above the --- line
> and will be included in the permanent commit message. Move them below ---.
>
> Patch 2/7 (multi-process VF removal):
>
> eth_hn_remove: MP infrastructure torn down before device cleanup.
> The counter decrement and netvsc_uninit_once() (which frees the shared
> memzone and unregisters MP handlers) happen before the device is closed. If a
> pending hn_remove_delayed alarm fires after the memzone is freed,
> netvsc_mp_req_vf() dereferences netvsc_shared_data (freed
> memory) when checking secondary_cnt. Suggest moving the counter decrement
> and uninit to after device close/release. Secondary init failure leaves
> netvsc_shared_data dangling. If
> netvsc_mp_init_secondary() fails, netvsc_shared_data still points to the
> memzone but init_done is false. Consider clearing it on failure.
> Same version-notes-above---- issue as patch 1.
>
> Patch 3/7 (mana PD leak):
>
> PD deallocation order. ib_parent_pd is freed before ib_pd. If there's a
> parent/child dependency (as the naming suggests), deallocating the parent first
> will fail with EBUSY. Consider reversing the order: dealloc ib_pd first, then
> ib_parent_pd. Same version-notes issue.
This is a false alarm. ib_parent_pd needs to be freed first.
The other comments make sense. I'm making fixes and sending v4.
Long
>
>
> Patches 5-7/7 (fp_ops in secondary):
>
> These look correct and well-placed (before the existing rte_mb()).
>
> Minor note: the plain (non-atomic) pointer stores to rte_eth_fp_ops fields could
> theoretically tear on ARM if a polling thread reads simultaneously. This is
> consistent with the existing burst function pointer pattern so not a regression, but
> worth keeping in mind.
>
> Overall the series is in good shape. Items 2 and 4 are the most important to
> address.
More information about the stable
mailing list