[dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link update

Zhao1, Wei wei.zhao1 at intel.com
Fri Oct 12 11:19:25 CEST 2018


Hi, 

> -----Original Message-----
> From: Ilya Maximets [mailto:i.maximets at samsung.com]
> Sent: Thursday, October 11, 2018 6:27 PM
> To: Zhao1, Wei <wei.zhao1 at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>;
> Laurent Hardy <laurent.hardy at 6wind.com>
> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; stable at dpdk.org; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link
> update
> 
> On 11.10.2018 12:56, Zhao1, Wei wrote:
> > Hi,  Ilya Maximets AND laurent.hardy
> 
> Hi, thanks for sharing your thoughts.
> Comments inline.
> 
> >
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ilya Maximets
> >> Sent: Wednesday, September 12, 2018 4:05 PM
> >> To: Zhang, Qi Z <qi.z.zhang at intel.com>; dev at dpdk.org
> >> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> >> <konstantin.ananyev at intel.com>; Laurent Hardy
> >> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>;
> >> stable at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
> >> fiber link update
> >>
> >> On 12.09.2018 09:49, Zhang, Qi Z wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Ilya Maximets [mailto:i.maximets at samsung.com]
> >>>> Sent: Monday, September 10, 2018 11:09 PM
> >>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; dev at dpdk.org
> >>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> >>>> <konstantin.ananyev at intel.com>; Laurent Hardy
> >>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>;
> >>>> stable at dpdk.org
> >>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
> >>>> fiber link update
> >>>>
> >>>> On 04.09.2018 09:08, Zhang, Qi Z wrote:
> >>>>> Hi Ilya:
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ilya
> >>>>>> Maximets
> >>>>>> Sent: Friday, August 31, 2018 8:40 PM
> >>>>>> To: dev at dpdk.org
> >>>>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> >>>>>> <konstantin.ananyev at intel.com>; Laurent Hardy
> >>>>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>; Ilya
> >>>>>> Maximets <i.maximets at samsung.com>; stable at dpdk.org
> >>>>>> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
> >>>>>> fiber link update
> >>>>>>
> >>>>>> If the multispeed fiber link is in DOWN state, ixgbe_setup_link
> >>>>>> could take around a second of busy polling. This is highly
> >>>>>> inconvenient for the case where single thread periodically checks
> >>>>>> the
> >> link statuses.
> >>>>>> For example, OVS main thread periodically updates the link
> >>>>>> statuses and hangs for a really long time busy waiting on
> >>>>>> ixgbe_setup_link() for a DOWN fiber ports. For case with 3 down
> >>>>>> ports it hangs for a 3 seconds and unable to do anything including
> packet processing.
> >>>>>> Fix that by shifting that workaround to a separate thread by
> >>>>>> alarm handler that will try to set up link if it is DOWN.
> >>>>>
> >>>>> Does that mean we will block the interrupt thread for 3 seconds?
> >>>>
> >>>> Three times for one second. Other work could be scheduled between.
> >>>> IMHO, it's much better than blocking usual caller for 3 seconds.
> >>>>
> >>>>> Also, can we guarantee there will not be any race condition if we
> >>>>> call
> >>>> ixgbe_setup_link at another thread, the base code API is not
> >>>> assumed to be thread-safe as I know.
> >>>>
> >>>> The only user of 'ixgbe_setup_link' is 'ixgbe_dev_start', but it
> >>>> could be called only if device stopped. 'ixgbe_dev_stop' cancels the
> alarm.
> >>>> Race with 'link_update' avoided by 'IXGBE_FLAG_NEED_LINK_CONFIG'
> >> flag.
> >>>
> >>> I guess, it' not only about when ixgb_setup_link race with itself,
> >>> but also
> >> when it race with other APIs.
> >>> Also the concern is, even in current version, we can prove there is
> >>> no issue,
> >> how can we guarantee we are safe for future base code update? It's
> >> not designed as thread-safe.
> >>> For my option, the change is risky.
> >>
> >> In current implementation interrupt handler already calls the
> >> 'ixgbe_dev_link_update' which subsequently calls 'ixgbe_setup_link'
> >> in our case if LSC interrupts enabled. So, my change makes the driver
> >> even safer by moving 'ixgbe_setup_link' to the same interrupt thread.
> >> Otherwise two threads (interrupts handler and the link status
> >> checking
> >> thread) could call 'ixgbe_setup_link' simultaneously.
> >>
> >>>
> >>> Btw, since ixgbe support LSC, it is not necessary for "single thread
> >> periodically checks the link statuses", right?
> >>
> >> In current implementation it will take at least 5 seconds (4 + 1) for
> >> the interrupt handler to detect DOWN link state for ixgbe multispeed
> >> fiber. This is too much for many real world cases.
> >
> > I have reviewed  this patch, now I agree with you of the point that
> > when port is down, " main thread periodically updates the link statuses and
> hangs for a really long time busy waiting on ixgbe_setup_link() for a DOWN
> fiber ports ".
> > This is introduced by a patch in the following:
> > SHA-1: c12d22f65b132c56db7b4fdbfd5ddce27d1e9572
> > * net/ixgbe: ensure link status is updated
> >
> > Because in this patch, ixgbe_setup_link() is called with input parameter
> autoneg_wait_to_complete=1, this will cause loop check and sleep delay.
> > At least 82599 seems has this delay.(BTW, whivh type of NIC are you
> > use? X550 or 82599)
> 
> I have 82599.
> 
> > Your solution is add a eal_alarm_set for ixgbe_setup_link in the thread of
> PMD driver, and do the set up work in that thread, is that right?
> > And main thread avoid hang by the flag of
> IXGBE_FLAG_NEED_LINK_CONFIG.
> > I think this is a good idea for this problem, but it may cause problem
> > for other legacy user of ixgbe pmd, because their legacy code, which use
> main thread  to check link state and setup_link when port is down, and they
> are not aware of it is done by other thread if add your patch.
> 
> What are these applications? I see no public API for setup_link function.
> It's internal to driver and should not be used externally.
> Am I missing something?

rte_eth_link_get() ,  it will also call the function of ixgbe_setup_link().


> 
> >
> > And is that ok if we change code in ixgbe_dev_link_update_share() to
> >
> > ixgbe_dev_link_update_share()
> > {
> >
> > 	/* check if it needs to wait to complete, if lsc interrupt is enabled */
> > 	if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc != 0)
> > 		wait = 0;
> >
> > 	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
> > 		ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> > 		speed = hw->phy.autoneg_advertised;
> > 		if (!speed)
> > 			ixgbe_get_link_capabilities(hw, &speed, &autoneg);
> > 		ixgbe_setup_link(hw, speed, wait);
> > 	}
> > }
> >
> > Then, your application can call rte_eth_link_get_nowait () to make
> > wait_to_complete=0 when doing periodic link state check, Which will not
> cause  loop check and sleep delay. Legacy code of other user call
> rte_eth_link_get()  will not be affected also.
> > But, I am NOT confident ,whether this will introduce new problem when
> set up link without wait!
> > So, this is just a discussion topic.
> 
> Unfortunately this will not help. Take a look to the function
> 'ixgbe_setup_mac_link_multispeed_fiber()', which is the main problematic
> function here. 'wait_to_complete' here used only as argument for
> ixgbe_setup_mac_link(), and the busy waiting loops are outside of it.
> Regardless of the 'wait_to_complete' value, this function will busy poll the
> link for 1040 ms trying to setup 10GB speed and 140ms more trying to setup
> 1GB speed. After that, it will call itself recursively and wait again... Looks like I
> miscalculated last time. Right now it'll be more than 2 seconds for each down
> port since following patch merged:
> 8fc1f32fa615 ("net/ixgbe: wait longer for link after fiber MAC setup").

Yes, you are right, link state check loop in function ixgbe_setup_mac_link_multispeed_fiber() are not blocked by bool autoneg_wait_to_complete,
It will cause about 1s wait when port is down.
And also, can we update code in function ixgbe_setup_mac_link_multispeed_fiber() to  block link state check loop using autoneg_wait_to_complete?
I am not sure. Because there is a comment for this loop check:
		/*
		 * Wait for the controller to acquire link.  Per IEEE 802.3ap,
		 * Section 73.10.2, we may have to wait up to 500ms if KR is
		 * attempted.  82599 uses the same timing for 10g SFI.
		 */
It seems we have to wait for at least 500ms for spec requirement before we check link after configuration.
If that is true, we can not do any change to these loop check.
But why not main thread take some action to stop periodic link sate check when it find it has been hang or link is down?
 


> 
> >
> > Hi, laurent.hardy
> >  You are the author for the patch (* net/ixgbe: ensure link status is
> updated), why do you implement code that way?
> > Is that must that  set up link with wait?
> >
> > ixgbe_setup_link(hw, speed, true);
> >
> >
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>> Regards
> >>>>> Qi
> >>>>>
> >>>>>>
> >>>>>> Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is updated")
> >>>>>> CC: stable at dpdk.org
> >>>>>>
> >>>>>> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
> >>>>>> ---
> >>>>>>  drivers/net/ixgbe/ixgbe_ethdev.c | 43
> >>>>>> ++++++++++++++++++++++++--------
> >>>>>>  1 file changed, 32 insertions(+), 11 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> >>>>>> b/drivers/net/ixgbe/ixgbe_ethdev.c
> >>>>>> index 26b192737..a33b9a6e8 100644
> >>>>>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> >>>>>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> >>>>>> @@ -221,6 +221,8 @@ static int ixgbe_dev_interrupt_action(struct
> >>>>>> rte_eth_dev *dev,
> >>>>>>  				      struct rte_intr_handle *handle);
> static
> >> void
> >>>>>> ixgbe_dev_interrupt_handler(void *param);  static void
> >>>>>> ixgbe_dev_interrupt_delayed_handler(void *param);
> >>>>>> +static void ixgbe_dev_setup_link_alarm_handler(void *param);
> >>>>>> +
> >>>>>>  static int ixgbe_add_rar(struct rte_eth_dev *dev, struct
> >>>>>> ether_addr *mac_addr,
> >>>>>>  			 uint32_t index, uint32_t pool);  static void
> >>>>>> ixgbe_remove_rar(struct rte_eth_dev *dev, uint32_t index); @@
> >>>>>> -2791,6 +2793,8 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
> >>>>>>
> >>>>>>  	PMD_INIT_FUNC_TRACE();
> >>>>>>
> >>>>>> +	rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler,
> dev);
> >>>>>> +
> >>>>>>  	/* disable interrupts */
> >>>>>>  	ixgbe_disable_intr(hw);
> >>>>>>
> >>>>>> @@ -3969,6 +3973,25 @@ ixgbevf_check_link(struct ixgbe_hw *hw,
> >>>>>> ixgbe_link_speed *speed,
> >>>>>>  	return ret_val;
> >>>>>>  }
> >>>>>>
> >>>>>> +static void
> >>>>>> +ixgbe_dev_setup_link_alarm_handler(void *param) {
> >>>>>> +	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
> >>>>>> +	struct ixgbe_hw *hw =
> >>>>>> IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>>>>> +	struct ixgbe_interrupt *intr =
> >>>>>> +		IXGBE_DEV_PRIVATE_TO_INTR(dev->data-
> >dev_private);
> >>>>>> +	u32 speed;
> >>>>>> +	bool autoneg = false;
> >>>>>> +
> >>>>>> +	speed = hw->phy.autoneg_advertised;
> >>>>>> +	if (!speed)
> >>>>>> +		ixgbe_get_link_capabilities(hw, &speed, &autoneg);
> >>>>>> +
> >>>>>> +	ixgbe_setup_link(hw, speed, true);
> >>>>>> +
> >>>>>> +	intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG; }
> >>>>>> +
> >>>>>>  /* return 0 means link status changed, -1 means not changed */
> >>>>>> int ixgbe_dev_link_update_share(struct rte_eth_dev *dev, @@ -
> >> 3981,9
> >>>>>> +4004,7 @@ ixgbe_dev_link_update_share(struct rte_eth_dev
> *dev,
> >>>>>>  		IXGBE_DEV_PRIVATE_TO_INTR(dev->data-
> >dev_private);
> >>>>>>  	int link_up;
> >>>>>>  	int diag;
> >>>>>> -	u32 speed = 0;
> >>>>>>  	int wait = 1;
> >>>>>> -	bool autoneg = false;
> >>>>>>
> >>>>>>  	memset(&link, 0, sizeof(link));
> >>>>>>  	link.link_status = ETH_LINK_DOWN; @@ -3993,13 +4014,8
> @@
> >>>>>> ixgbe_dev_link_update_share(struct
> >>>> rte_eth_dev
> >>>>>> *dev,
> >>>>>>
> >>>>>>  	hw->mac.get_link_status = true;
> >>>>>>
> >>>>>> -	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
> >>>>>> -		ixgbe_get_media_type(hw) ==
> ixgbe_media_type_fiber) {
> >>>>>> -		speed = hw->phy.autoneg_advertised;
> >>>>>> -		if (!speed)
> >>>>>> -			ixgbe_get_link_capabilities(hw, &speed,
> &autoneg);
> >>>>>> -		ixgbe_setup_link(hw, speed, true);
> >>>>>> -	}
> >>>>>> +	if (intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG)
> >>>>>> +		return rte_eth_linkstatus_set(dev, &link);
> >>>>>>
> >>>>>>  	/* check if it needs to wait to complete, if lsc interrupt is
> enabled */
> >>>>>>  	if (wait_to_complete == 0 || dev->data-
> >dev_conf.intr_conf.lsc
> >>>>>> !=
> >>>>>> 0) @@
> >>>>>> -4017,11 +4033,14 @@ ixgbe_dev_link_update_share(struct
> >> rte_eth_dev
> >>>> *dev,
> >>>>>>  	}
> >>>>>>
> >>>>>>  	if (link_up == 0) {
> >>>>>> -		intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> >>>>>> +		if (ixgbe_get_media_type(hw) ==
> ixgbe_media_type_fiber)
> >> {
> >>>>>> +			intr->flags |=
> IXGBE_FLAG_NEED_LINK_CONFIG;
> >>>>>> +			rte_eal_alarm_set(10,
> >>>>>> +
> 	ixgbe_dev_setup_link_alarm_handler, dev);
> >>>>>> +		}
> >>>>>>  		return rte_eth_linkstatus_set(dev, &link);
> >>>>>>  	}
> >>>>>>
> >>>>>> -	intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
> >>>>>>  	link.link_status = ETH_LINK_UP;
> >>>>>>  	link.link_duplex = ETH_LINK_FULL_DUPLEX;
> >>>>>>
> >>>>>> @@ -5128,6 +5147,8 @@ ixgbevf_dev_stop(struct rte_eth_dev
> *dev)
> >>>>>>
> >>>>>>  	PMD_INIT_FUNC_TRACE();
> >>>>>>
> >>>>>> +	rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler,
> dev);
> >>>>>> +
> >>>>>>  	ixgbevf_intr_disable(dev);
> >>>>>>
> >>>>>>  	hw->adapter_stopped = 1;
> >>>>>> --
> >>>>>> 2.17.1
> >>>>>


More information about the dev mailing list