[dpdk-dev] [PATCH v3] net/failsafe: fix calling device during RMV events

Gaëtan Rivet gaetan.rivet at 6wind.com
Mon Oct 23 10:36:12 CEST 2017


On Mon, Oct 23, 2017 at 07:17:41AM +0000, Ophir Munk wrote:
> Hi Gaetan,
> Thanks for your quick reply. Please see comments inline.
> 
> Regards,
> Ophir
> 
> > -----Original Message-----
> > From: Gaëtan Rivet [mailto:gaetan.rivet at 6wind.com]
> > Sent: Friday, October 20, 2017 1:35 PM
> > To: Ophir Munk <ophirmu at mellanox.com>
> > Cc: dev at dpdk.org; Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > <olgas at mellanox.com>; stable at dpdk.org
> > Subject: Re: [PATCH v3] net/failsafe: fix calling device during RMV events
> > 
> > Hi Ophir,
> > 
> > Sorry about the delay,
> > I have a few remarks, I think this patch could be simpler.
> > 
> > First, about the commit logline:
> > "calling device" is not descriptive enough. I'd suggest
> > 
> >     net/failsafe: fix device configuration during RMV events
> > 
> > But I'm not a native speaker either, so use it if you think it is better, or don't,
> > it's only a suggestion :).
> > 
> > On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote:
> > > This commit prevents control path operations from failing after a sub
> > > device removal.
> > >
> > > Following are the failure steps:
> > > 1. The physical device is removed due to change in one of PF
> > > parameters (e.g. MTU) 2. The interrupt thread flags the device 3.
> > > Within 2 seconds Interrupt thread initializes the actual device
> > > removal, then every 2 seconds it tries to re-sync (plug in) the
> > > device. The trials fail as long as VF parameter mismatches the PF
> > parameter.
> > > 4. A control thread initiates a control operation on failsafe which
> > > initiates this operation on the device.
> > > 5. A race condition occurs between the control thread and interrupt
> > > thread when accessing the device data structures.
> > >
> > > This commit prevents the race condition in step 5. Before this commit
> > > if a device was removed and then a control thread operation was
> > > initiated on failsafe - in some cases failsafe called the sub device
> > > operation instead of avoiding it. Such cases could lead to operations
> > failures.
> > >
> > 
> > This is a nitpick, but as said earlier, this is not preventing the race condition.
> > This race is still present and can still wreak havok on unsuspecting users.
> > 
> > If an application has a weak threading model, it will be subject to this race
> > condition still. It is possible to prevent it fully with proper care from the
> > application standpoint, but this is not specific to fail-safe and does not
> > concern us here.
> > 
> > Anyway, it's really a nitpick, I just wanted to point it out. This is not too
> > important for this patch.
> > 
> > > This commit fixes failsafe criteria to determine when the device is
> > > removed such that it will avoid calling the sub device operations
> > > during that time and will only call them otherwise.
> > >
> > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")
> > > Cc: stable at dpdk.org
> > >
> > > Signed-off-by: Ophir Munk <ophirmu at mellanox.com>
> > > ---
> > > v3:
> > > 1. Rebase v2
> > >
> > > 2. Please ignore checkpatch checks on arguments re-usage - they are
> > confirmed.
> > > 	CHECK:MACRO_ARG_REUSE: Macro argument reuse ... possible side-
> > effects?
> > > 	#217: FILE: drivers/net/failsafe/failsafe_private.h:241:
> > >
> > > 3. Add rationales (copy from an email which accompanied v2):
> > >
> > > On Monday, September 11, 2017 11:31 AM, Gaetan Rivet wrote:
> > > >
> > > > Hi Ophir,
> > > >
> > > > On Sat, Sep 09, 2017 at 07:27:11PM +0000, Ophir Munk wrote:
> > > > > This commit prevents control path operations from failing after a
> > > > > sub device has informed failsafe it has been removed.
> > > > >
> > > > > Before this commit if a device was removed and then a control path
> > > >
> > > > Here are the steps if I understood correctly:
> > > >
> > > > 0. The physical device is removed
> > > > 1. The interrupt thread flags the device 2. A control lcore
> > > > initiates a control operation 3. The alarm triggers, waking up the eal-intr-
> > thread,
> > > >    initiating the actual device removal.
> > > > 4. Race condition occurs between control lcore and interrupt thread.
> > > >
> > > > "if a device was removed" is ambiguous I think (are we speaking
> > > > about the physical port? Is it only flagged? Is it after the removal of the
> > device itself?).
> > > > From the context I gather that you mean the device is flagged to be
> > > > removed, but it won't be as clear in a few month when we revisit this bug
> > :) .
> > > >
> > > > Could you please rephrase this so that the whole context of the
> > > > issue is available?
> > > >
> > >
> > > Done. Commit message was rephrased based on your comments
> > >
> > > > > operations was initiated on failsafe - in some cases failsafe
> > > > > called the sub device operation instead of avoiding it. Such cases
> > > > > could lead to operations failures.
> > > > >
> > > > > This commit fixes failsafe criteria to determine when the device
> > > > > is removed such that it will avoid calling the sub device
> > > > > operations during that time and will only call them otherwise.
> > > > >
> > > >
> > > > This commit mitigates the race condition, reducing the probability
> > > > for it to have an effect. It does not, however, remove this race
> > > > condition, which is inherent to the DPDK architecture at the moment.
> > > >
> > > > A proper fix, a more detailed workaround and additional
> > > > documentation warning users writing applications to mind their threads
> > could be interesting.
> > > >
> > >
> > > The race condition occurs in the last step and may lead to
> > > segmentation faults (accessing data structures of the same device by 2
> > > threads) The previous steps ("the physical device is removed", etc) were not
> > recreated and tested but probably cannot lead to segmentation fault.
> > >
> > > > But let's focus on this patch for the time being.
> > > >
> > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")
> > > > > Cc: stable at dpdk.org
> > > > >
> > > > > Signed-off-by: Ophir Munk <ophirmu at mellanox.com>
> > > > > ---
> > > > >  drivers/net/failsafe/failsafe_ether.c |  1 +
> > > > >  drivers/net/failsafe/failsafe_ops.c   | 52
> > > > +++++++++++++++++++++++++++++------
> > > > >  2 files changed, 45 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/failsafe/failsafe_ether.c
> > > > > b/drivers/net/failsafe/failsafe_ether.c
> > > > > index a3a8cce..1def110 100644
> > > > > --- a/drivers/net/failsafe/failsafe_ether.c
> > > > > +++ b/drivers/net/failsafe/failsafe_ether.c
> > > > > @@ -378,6 +378,7 @@
> > > >
> > > > Could you please generate your patches with the function name in the
> > diff?
> > >
> > > Done
> > >
> > > >
> > > > >  				      i);
> > > > >  				goto err_remove;
> > > > >  			}
> > > > > +			sdev->remove = 0;
> > > >
> > > > You are adding this here, within failsafe_eth_dev_state_sync, and
> > > > removing it from the dev_configure ops.
> > > >
> > > > 10 lines above, the call to dev_configure is done, meaning that the
> > > > remove flag was resetted at this point.
> > > >
> > > > Can you explain why you prefer resetting the flag here?
> > > >
> > > > The position of this flag reset will be dependent upon my subsequent
> > > > remarks anyway, so hold that thought :) .
> > > >
> > >
> > > The motivation for resetting the "remove" flag within
> > failsafe_eth_dev_state_sync is as follows:
> > > Previously to this patch the "remove" flag was designed to signal the need
> > to remove the sub device.
> > > Once the sub device was removed and before being reconfigured the
> > "remove" flag was reset.
> > >
> > > After this patch the scope of the "remove" flag was *extended* to
> > > indicate the sub device status as being "plugged out" by resetting this flag
> > only after a successful call to failsafe_eth_dev_state_sync().
> > > The "plug out" status could last a very long time (seconds, minutes, days,
> > weeks, ...).
> > >
> > > Previously to this patch failsafe based the "plugged out" status on
> > > the sub device state as being below ACTIVE however every 2 seconds
> > > dev_configure() was called where the sub device was assigned sdev-
> > > >state = DEV_ACTIVE; therefore the sub device state became ACTIVE for
> > some time every 2 seconds.
> > > This is where the race condition occurred: failsafe considered the sub
> > > device as "Plugged in" for some time every 2 seconds (based on its ACTIVE
> > state) while it was actually plugged out.
> > >
> > > After this patch the "Plugged out" status is based on the "remove" flag.
> > >
> > 
> > Sorry, I do not agree with this semantical change on the "remove" flag.
> > You are essentially adding a new device state, which could be fine per se, but
> > should not be done here.
> > 
> > The enum dev_state is there for this purpose.
> > 
> > The flag dev->remove, calls for an operation to be done upon the concerned
> > device. It is not meant to become a new device state.
> > 
> > A point about the work methodoly here: if you wanted to change this
> > semantic, which could be legitimate and sometimes called for, you should
> > have proposed it either during a discussion in a response to my previous
> > email, or introducing the change as a separate patch. This point is important
> > enough for it to have its own patch, meaning we would have a whole thread
> > dedicated to it instead of having to interleave commentaries between
> > related-but-separate diffs on the code.
> > 
> > But anyway, if you think you need to express a PLUGOUT state, I'd suggest
> > adding a state between DEV_UNDEFINED and DEV_PARSED.
> > DEV_UNDEFINED means that the device is in limbo and has no existence per
> > se (its parsing failed for example, it is not clear whether the parameters are
> > correct, etc...). DEV_PLUGOUT could mean then that the device has been
> > successfully probed at least once, meaning that it could possibly have
> > residuals from this probing still there, or specific care to be taken when
> > manipulating it.
> > 
> > However, I'm not yet convinced that this new state is necessary. I think you
> > can mitigate this race condition without having to add it. If you insist in
> > introducing this state, please do so in a separate patch, with proper
> > definition about the meaning of this state:
> > 
> >   + When it should be valid for a device to be in this state.
> >   + Which operation corresponds to getting into and out of this state.
> >   + Why this state is interesting and what could not be expressed before
> >     that is thus being fixed by introducing this state.
> > 
> > But please verify twice whether you absolutely need to complexify the
> > current fail-safe internals before going all in and basing your work upon it :)
> > 
> 
> Indeed what I am currently missing in failsafe is knowing the device is in a PLUGOUT state.
> Your suggestion to add a new state DEV_PLUGOUT cannot be used with the current implementation
> as the device states are modified during an alarm hotplug handling every 2 seconds.
> In fs_hotplug_alarm() we call failsafe_eth_dev_state_sync(dev) which eventually calls ->dev_configure(dev) 
> where we assign: sdev->state = DEV_ACTIVE; 
> then when sync fails fs_hotplug_alarm() calls failsafe_dev_remove(dev) which will call fs_dev_remove(sdev); where the sub devices
> states are changed from ACTIVE down to DEV_UNDEFINED.
> Having said that it means that during a PLUGOUT event - the states are modified with each invocation of the fs_hotplug_alarm
> every 2 seconds. So even if we added DEV_PLUGOUT state - it will not remain fixed during the hotplug alarm handling.
> I have also verified all of this with printouts.
> When seeing a sub device in state "DEV_ACTIVE" - we cannot tell whether the device is currently in "PLUGOUT situation" or "PLUGIN situation"
> This allows operations such as fs_mtu_set() on sub-devices which are in "PLUGOUT situation" while their state is 
> DEV_ACTIVE to be manipulated, which I think should have been avoided.
> 
> Please note fs_mtu_set() implementation:
> tatic int fs_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
> {
>   ..
>         FOREACH_SUBDEV_ACTIVE(sdev, i, dev) {
> 	// ***** We are here while the device can be in a "PLUGOUT situation" ***
> 
> To summarize:
> I am missing a way to know in failsafe that a sub-device is currently plugged out

(sdev->state < DEV_ACTIVE && !sdev->remove) means that the device is
plugged out.

> 1. I suggested extending the "remove" flag scope for this purpose. It has minimal changes with current failsafe implementation. You prefer not using "remove".

I prefer using it, but as a flag, not as a device state.

> 2. You suggested adding a new state DEV_PLUGOUT. I don't think it will work with current implementation (as explained above) or may require a redesign of current implementation.
> 

I do not suggest adding a new state DEV_PLUGOUT.
I suggest using sdev->remove properly.

> Can you suggest another way?
> 

0. In a separate commit, move the
      sdev->remove = 0;
   from fs_dev_configure, into the case DEV_UNDEFINED of the switch
   within fs_dev_remove. This is cleaner and more logical anyway.

1. Check that sdev->remove is not set in fs_find_next.
   If sdev->remove is set, then the device should be skipped.

2. In failsafe_dev_remove, do not use the FOREACH_SUBDEV_STATE iterator,
   but manually iterate over all sub-devices using the subs_tail and
   subs_head values.
   As the generic iterator would skip over devices which have
   sdev->remove set, this function would not work anymore.

3. Find the places I have missed that needs this manual iterator to be
   used instead of the FOREACH_SUBDEV{,_STATE} ones. I think there is at
   least other place that I cannot recall, there might be more.

If you think this does not work, please tell me why, because then it
means that I have misunderstood something about the race condition you
are trying to fix.

-- 
Gaëtan Rivet
6WIND


More information about the dev mailing list