[dpdk-dev] [PATCH] net/vdev_netvsc: handle removal of associated pci device

Matan Azrad matan at nvidia.com
Tue Sep 15 09:00:03 CEST 2020


Hi Li

From: Long Li <longli at microsoft.com>
> >Subject: Re: [dpdk-dev] [PATCH] net/vdev_netvsc: handle removal of
> >associated pci device
> >
> >Hi Stephen
> >
> >From: Stephen Hemminger:
> >> On Sun, 6 Sep 2020 12:38:18 +0000
> >> Matan Azrad <matan at nvidia.com> wrote:
> >>
> >> > Hi Stephen
> >> >
> >> > From: Stephen Hemminger:
> >> > > The vdev_netvsc was not detecting when the associated PCI device
> >> > > (SRIOV) was removed. Because of that it would keep feeding the
> >> > > same
> >> > > (removed) device to failsafe PMD which would then unsuccessfully
> >> > > try and probe for it.
> >> > >
> >> > > Change to use a mark/sweep method to detect that PCI device was
> >> > > removed, and also only tell failsafe about new PCI devices.
> >> > > Vdev_netvsc does not have to keep stuffing the pipe with the same
> >> > > already existing PCI device.
> >> >
> >> > As I know, the vdev_netvsc driver doesn't call to failsafe if the
> >> > PCI device is
> >> not detected by the readlink command(considered as removed)...
> >> > Am I missing something?
> >>
> >> The original code is broken because ctx_yield is not cleared, it
> >> keeps sending the same value.
> >
> >Looking on the code again, It looks like ctx->yield has no effect on
> >the next pipe write, It is just used for log.
> >
> >After the PCI interface matching to the netvsc interface, the pipe
> >write is triggered only if the readlink commands success to see the
> >plugged-in PCI
> >device:
> >readlink /sys/class/net/[iface]/device/subsystem shows "pci"
> >readlink /sys/class/net/[iface]/device shows the pci device ID.
> >
> >So, the assumption is when the above readlink failed on the interface
> >the device is removed(plugged-out) and the fd write will not happen.
> >
> >The code will continue to retry probe again and again until success
> >only for plugged-in pci device matched the netvsc device.
> 
> Hi Matan,
> 
> The original code keeps writing to pipe even it's the same PCI device.

Yes, the vdev_netvsc writes any plugged-in device to the associated netvsc device fd.

> The
> new code writes to pipe for a new device, only once. See the following code:
> 
> +	/* Skip if this is same device already sent to failsafe */
> +	if (strcmp(addr, ctx->yield) == 0)
> +		return 0;
> 

I understand you want to optimize the pipe writing to be written only after plugged-in hot event.

The current solution suffers from race: the PCI device may be plugged-out and plugged-in in short time shorter than the driver alarm delay, then the PCI device plugged-in detection will lost.

My suggestion:
Add validation to the plugged-in device probing state and that it is owned by failsafe(using ownership API) - don't write the pipe if so.

Matan



> This patch also saves lots of CPU since it no longer writes to pipe all the time.
> You are correct about the code will continue to probe on a new PCI device.
> But someone has to do it to handle hot-add.
> 
> Thanks,
> Long
> 
> 
> >
> >> It looks like device removal and add was never tested.
> >
> >This is basic test we have to test plug-in plug-out and it passed every
> >day in the last years.
> >
> >Maybe something new and special in your setup?
> >
> >> If you test removal you will see that vdev_netvsc:
> >>  1. Sends same PCI device repeatedly to failsafe (every alarm call)
> >>     This is harmless, but useless.
> >>  2. When device is removed, keeps doing #1


More information about the dev mailing list