[EXTERNAL] Re: [PATCH] net/netvsc: switch data path to synthetic on device stop

Long Li longli at microsoft.com
Tue Mar 24 00:24:20 CET 2026


> Long Li <longli at microsoft.com> wrote:
> 
> > When DPDK stops a netvsc device (e.g. on testpmd quit), the data path
> > was left pointing to the VF/MANA device. If the kernel netvsc driver
> > subsequently reloads the MANA device and opens it, incoming traffic
> > arrives on the MANA device immediately, before the queues are fully
> > initialized. This causes bogus RX completion events to appear on the
> > TX completion queue, triggering a kernel WARNING in mana_poll_tx_cq().
> >
> > Fix this by switching the data path back to synthetic (via
> > NVS_DATAPATH_SYNTHETIC) in hn_vf_stop() before stopping the VF device.
> > This tells the host to route traffic through the synthetic path, so
> > that when the MANA driver recreates its queues, no unexpected traffic
> > arrives until netvsc explicitly switches back to VF.
> >
> > Also update hn_vf_start() to switch the data path back to VF after the
> > VF device is started, enabling correct stop/start cycling.
> >
> > Both functions now use write locks instead of read locks since they
> > modify vf_vsc_switched state.
> >
> > Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
> > Cc: stable at dpdk.org
> >
> > Signed-off-by: Long Li <longli at microsoft.com>
> 
> Looks good to me you might want to address the error condition spotted by AI
> review.
> 
> ---
> 
> **Patch: net/netvsc: switch data path to synthetic on device stop**
> 
> This patch addresses a real race condition where stopping a netvsc device leaves
> the data path pointing to the VF/MANA device, causing kernel warnings when the
> MANA driver later reinitializes. The fix is logically sound — switch to synthetic
> before stopping the VF, and re-switch to VF on start.
> 
> **Error: `hn_vf_stop()` — `vf_vsc_switched` cleared even when
> `hn_nvs_set_datapath()` fails**
> 
> In `hn_vf_stop()`, if `hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC)` fails,
> `vf_vsc_switched` is unconditionally set to `false`. This means the driver believes
> it switched to synthetic when the host may still be routing traffic through the VF.
> On a subsequent `hn_vf_start()`, the `!hv->vf_ctx.vf_vsc_switched` check will
> pass and the driver will try to re-switch to VF — but since the host never left VF
> mode, this is a no-op at best or confusing at worst. More importantly, if stop is
> being called on the path to teardown, the flag is now wrong.
> 
> I note that `hn_vf_remove_unlocked()` has the same pattern (clears the flag
> regardless, with the comment "Clear switched flag regardless — VF is being
> removed"), so this may be intentional for netvsc since on the remove path you
> want to forget the state. But on the stop path the device is still present and will
> be restarted — propagating the error and leaving `vf_vsc_switched = true` might
> be more correct so that `hn_vf_start()` retries the switch. Worth confirming this is
> intentional.
> 
> **Warning: `hn_vf_start()` — error from `hn_nvs_set_datapath()` returned but VF
> device left started**
> 
> In `hn_vf_start()`, if `rte_eth_dev_start()` succeeds but the subsequent
> `hn_nvs_set_datapath(hv, NVS_DATAPATH_VF)` fails, the function logs the error
> and returns the failure code. However, the VF device is left in the started state.
> The caller sees a failure from `hn_vf_start()`, but the VF is running with no traffic
> routed to it. This is a resource consistency issue — if the datapath switch fails,
> should the VF be stopped again to maintain consistent state?

I'm sending v2 to fix this.

> 
> **Warning: `hn_vf_add_unlocked()` — change defers datapath switch but still
> returns 0 on the deferred path**
> 
> The patch modifies `hn_vf_add_unlocked()` to skip the datapath switch when
> `!dev->data->dev_started`. This is correct, but note that in the original code the
> function would return the result of `hn_nvs_set_datapath()` — if that failed, it
> returned an error. Now on the deferred path, `ret` retains whatever value it had
> from the attach/configure path (could be 0 from a successful attach), so the caller
> gets success even though the datapath switch was not attempted. This is fine for
> the hot-add-before-start case, just noting the behavior change.
> 

This warning can be ignored.

Thanks,
Long

> **Info: Lock upgrade from read to write is correct**
> 
> Both `hn_vf_start()` and `hn_vf_stop()` correctly switch from
> `rte_rwlock_read_lock` to `rte_rwlock_write_lock` since they now modify
> `vf_vsc_switched`. This matches the locking pattern used by `hn_vf_close()` and
> `hn_nvs_handle_vfassoc()`.


More information about the dev mailing list