[dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link update

Zhao1, Wei wei.zhao1 at intel.com
Mon Oct 15 05:03:01 CEST 2018


Hi, Ilya Maximets

> -----Original Message-----
> From: Ilya Maximets [mailto:i.maximets at samsung.com]
> Sent: Friday, October 12, 2018 6:15 PM
> To: Zhao1, Wei <wei.zhao1 at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>;
> Laurent Hardy <laurent.hardy at 6wind.com>
> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; stable at dpdk.org; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link
> update
> 
> On 12.10.2018 12:19, Zhao1, Wei wrote:
> > Hi,
> >
> >> -----Original Message-----
> >> From: Ilya Maximets [mailto:i.maximets at samsung.com]
> >> Sent: Thursday, October 11, 2018 6:27 PM
> >> To: Zhao1, Wei <wei.zhao1 at intel.com>; Zhang, Qi Z
> >> <qi.z.zhang at intel.com>; Laurent Hardy <laurent.hardy at 6wind.com>
> >> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> >> <konstantin.ananyev at intel.com>; stable at dpdk.org; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
> >> fiber link update
> >>
> >> On 11.10.2018 12:56, Zhao1, Wei wrote:
> >>> Hi,  Ilya Maximets AND laurent.hardy
> >>
> >> Hi, thanks for sharing your thoughts.
> >> Comments inline.
> >>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ilya Maximets
> >>>> Sent: Wednesday, September 12, 2018 4:05 PM
> >>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; dev at dpdk.org
> >>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> >>>> <konstantin.ananyev at intel.com>; Laurent Hardy
> >>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>;
> >>>> stable at dpdk.org
> >>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
> >>>> fiber link update
> >>>>
> >>>> On 12.09.2018 09:49, Zhang, Qi Z wrote:
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Ilya Maximets [mailto:i.maximets at samsung.com]
> >>>>>> Sent: Monday, September 10, 2018 11:09 PM
> >>>>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; dev at dpdk.org
> >>>>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> >>>>>> <konstantin.ananyev at intel.com>; Laurent Hardy
> >>>>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>;
> >>>>>> stable at dpdk.org
> >>>>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
> >>>>>> fiber link update
> >>>>>>
> >>>>>> On 04.09.2018 09:08, Zhang, Qi Z wrote:
> >>>>>>> Hi Ilya:
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ilya
> >>>>>>>> Maximets
> >>>>>>>> Sent: Friday, August 31, 2018 8:40 PM
> >>>>>>>> To: dev at dpdk.org
> >>>>>>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
> >>>>>>>> <konstantin.ananyev at intel.com>; Laurent Hardy
> >>>>>>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>; Ilya
> >>>>>>>> Maximets <i.maximets at samsung.com>; stable at dpdk.org
> >>>>>>>> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
> >>>>>>>> fiber link update
> >>>>>>>>
> >>>>>>>> If the multispeed fiber link is in DOWN state, ixgbe_setup_link
> >>>>>>>> could take around a second of busy polling. This is highly
> >>>>>>>> inconvenient for the case where a single thread periodically
> >>>>>>>> checks the link statuses.
> >>>>>>>> For example, the OVS main thread periodically updates the link
> >>>>>>>> statuses and hangs for a really long time busy waiting on
> >>>>>>>> ixgbe_setup_link() for DOWN fiber ports. In a case with 3 down
> >>>>>>>> ports it hangs for 3 seconds and is unable to do anything,
> >>>>>>>> including packet processing.
> >>>>>>>> Fix that by shifting that workaround to a separate thread by
> >>>>>>>> alarm handler that will try to set up link if it is DOWN.
> >>>>>>>
> >>>>>>> Does that mean we will block the interrupt thread for 3 seconds?
> >>>>>>
> >>>>>> Three times for one second. Other work could be scheduled
> >>>>>> between.
> >>>>>> IMHO, it's much better than blocking usual caller for 3 seconds.
> >>>>>>
> >>>>>>> Also, can we guarantee there will not be any race condition if
> >>>>>>> we call ixgbe_setup_link in another thread? The base code API is
> >>>>>>> not assumed to be thread-safe, as far as I know.
> >>>>>>
> >>>>>> The only user of 'ixgbe_setup_link' is 'ixgbe_dev_start', but it
> >>>>>> could be called only if device stopped. 'ixgbe_dev_stop' cancels
> >>>>>> the alarm.
> >>>>>> Race with 'link_update' avoided by 'IXGBE_FLAG_NEED_LINK_CONFIG'
> >>>>>> flag.
> >>>>>
> >>>>> I guess it's not only about when ixgbe_setup_link races with itself,
> >>>>> but also when it races with other APIs.
> >>>>> Also the concern is, even if we can prove there is no issue in the
> >>>>> current version, how can we guarantee we are safe for future base
> >>>>> code updates? It's not designed as thread-safe.
> >>>>> In my opinion, the change is risky.
> >>>>
> >>>> In current implementation interrupt handler already calls the
> >>>> 'ixgbe_dev_link_update' which subsequently calls 'ixgbe_setup_link'
> >>>> in our case if LSC interrupts enabled. So, my change makes the
> >>>> driver even safer by moving 'ixgbe_setup_link' to the same interrupt
> >>>> thread.
> >>>> Otherwise two threads (interrupts handler and the link status
> >>>> checking thread) could call 'ixgbe_setup_link' simultaneously.
> >>>>
> >>>>>
> >>>>> Btw, since ixgbe supports LSC, it is not necessary for a "single
> >>>>> thread periodically checks the link statuses", right?
> >>>>
> >>>> In current implementation it will take at least 5 seconds (4 + 1)
> >>>> for the interrupt handler to detect DOWN link state for ixgbe
> >>>> multispeed fiber. This is too much for many real world cases.
> >>>
> >>> I have reviewed this patch, and now I agree with your point that
> >>> when a port is down, the "main thread periodically updates the link
> >>> statuses and hangs for a really long time busy waiting on
> >>> ixgbe_setup_link() for a DOWN fiber ports".
> >>> This was introduced by the following patch:
> >>> SHA-1: c12d22f65b132c56db7b4fdbfd5ddce27d1e9572
> >>> * net/ixgbe: ensure link status is updated
> >>>
> >>> Because in this patch ixgbe_setup_link() is called with the input
> >>> parameter autoneg_wait_to_complete=1, which causes a loop check and
> >>> sleep delay.
> >>> At least the 82599 seems to have this delay. (BTW, which type of NIC
> >>> do you use? X550 or 82599?)
> >>
> >> I have 82599.
> >>
> >>> Your solution is to add a rte_eal_alarm_set() call for ixgbe_setup_link
> >>> in the thread of the PMD driver, and do the setup work in that thread,
> >>> is that right?
> >>> And the main thread avoids hanging by means of the flag
> >>> IXGBE_FLAG_NEED_LINK_CONFIG.
> >>> I think this is a good idea for this problem, but it may cause
> >>> problems for other legacy users of the ixgbe PMD, because their legacy
> >>> code uses the main thread to check link state and setup_link when the
> >>> port is down, and they are not aware that it is done by another thread
> >>> if your patch is added.
> >>
> >> What are these applications? I see no public API for setup_link function.
> >> It's internal to driver and should not be used externally.
> >> Am I missing something?
> >
> > rte_eth_link_get(): it will also call ixgbe_setup_link().
> >
> 
> rte_eth_link_get() does not call ixgbe_setup_link().
> It only calls dev_ops->link_update().

No, dev_ops->link_update calls ixgbe_dev_link_update()
-> ixgbe_dev_link_update_share() -> ixgbe_setup_link(),
so rte_eth_link_get() does end up in ixgbe_setup_link().
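
To illustrate this call chain, here is a minimal standalone sketch in C. It is
not driver code: every mock_* name, the struct layout and the flag value are
hypothetical stand-ins, and it only models the pre-patch logic quoted in this
thread (flag set when the link is down, setup attempted on the next update for
fiber media). It also assumes the path where rte_eth_link_get() actually
invokes dev_ops->link_update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IXGBE_FLAG_NEED_LINK_CONFIG; the value is arbitrary. */
#define MOCK_FLAG_NEED_LINK_CONFIG (1u << 0)

struct mock_dev {
	unsigned int flags;   /* models intr->flags */
	bool is_fiber;        /* models media type == ixgbe_media_type_fiber */
	bool link_up;         /* models the outcome of the hardware link check */
	int setup_link_calls; /* how many times the mock setup_link was entered */
	void (*link_update)(struct mock_dev *dev, int wait_to_complete);
};

/* Models ixgbe_setup_link(); the real function is where the busy polling happens. */
static void mock_setup_link(struct mock_dev *dev)
{
	dev->setup_link_calls++;
}

/* Models the pre-patch ixgbe_dev_link_update_share(). */
static void mock_link_update_share(struct mock_dev *dev, int wait_to_complete)
{
	/* As in the quoted code, wait_to_complete does not gate the setup call. */
	(void)wait_to_complete;

	if ((dev->flags & MOCK_FLAG_NEED_LINK_CONFIG) && dev->is_fiber)
		mock_setup_link(dev);

	if (!dev->link_up) {
		dev->flags |= MOCK_FLAG_NEED_LINK_CONFIG;
		return;
	}
	dev->flags &= ~MOCK_FLAG_NEED_LINK_CONFIG;
}

/* Models ixgbe_dev_link_update(), the driver's dev_ops->link_update callback. */
static void mock_link_update(struct mock_dev *dev, int wait_to_complete)
{
	mock_link_update_share(dev, wait_to_complete);
}

/* Models rte_eth_link_get() on the path where it invokes dev_ops->link_update. */
static bool mock_eth_link_get(struct mock_dev *dev)
{
	assert(dev != NULL && dev->link_update != NULL);
	dev->link_update(dev, 1);
	return dev->link_up;
}
```

In this model the first mock_eth_link_get() on a down fiber port only sets the
flag, and every later call enters mock_setup_link() until the link comes up,
which is the point where a polling application would stall in the real driver.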



> 
> >
> >>
> >>>
> >>> And is that ok if we change code in ixgbe_dev_link_update_share() to
> >>>
> >>> ixgbe_dev_link_update_share()
> >>> {
> >>>
> >>> 	/* check if it needs to wait to complete, if lsc interrupt is enabled */
> >>> 	if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc != 0)
> >>> 		wait = 0;
> >>>
> >>> 	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
> >>> 		ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> >>> 		speed = hw->phy.autoneg_advertised;
> >>> 		if (!speed)
> >>> 			ixgbe_get_link_capabilities(hw, &speed, &autoneg);
> >>> 		ixgbe_setup_link(hw, speed, wait);
> >>> 	}
> >>> }
> >>>
> >>> Then your application can call rte_eth_link_get_nowait() to make
> >>> wait_to_complete=0 when doing the periodic link state check, which
> >>> will not cause the loop check and sleep delay. Legacy code of other
> >>> users calling rte_eth_link_get() will not be affected either.
> >>> But I am NOT confident whether this will introduce a new problem
> >>> when setting up the link without wait!
> >>> So, this is just a discussion topic.
> >>
> >> Unfortunately this will not help. Take a look at the function
> >> 'ixgbe_setup_mac_link_multispeed_fiber()', which is the main
> >> problematic function here. 'wait_to_complete' here is used only as
> >> an argument for ixgbe_setup_mac_link(), and the busy waiting loops are
> >> outside of it.
> >> Regardless of the 'wait_to_complete' value, this function will busy
> >> poll the link for 1040 ms trying to setup 10GB speed and 140ms more
> >> trying to setup 1GB speed. After that, it will call itself
> >> recursively and wait again... Looks like I miscalculated last time.
> >> Right now it'll be more than 2 seconds for each down port since the
> >> following patch was merged:
> >> 8fc1f32fa615 ("net/ixgbe: wait longer for link after fiber MAC setup").
> >
> > Yes, you are right, the link state check loops in
> > ixgbe_setup_mac_link_multispeed_fiber() are not controlled by bool
> > autoneg_wait_to_complete. It will cause about a 1s wait when the port
> > is down.
> > Also, can we update the code in ixgbe_setup_mac_link_multispeed_fiber()
> > so that autoneg_wait_to_complete controls the link state check loop?
> > I am not sure, because there is a comment for this loop check:
> > 		/*
> > 		 * Wait for the controller to acquire link.  Per IEEE 802.3ap,
> > 		 * Section 73.10.2, we may have to wait up to 500ms if KR is
> > 		 * attempted.  82599 uses the same timing for 10g SFI.
> > 		 */
> > It seems we have to wait at least 500ms, per the spec requirement,
> > before we check link after configuration.
> > If that is true, we cannot make any change to these loop checks.
> > But why doesn't the main thread take some action to stop the periodic
> > link state check when it finds it has hung or the link is down?
> 
> To find that device is DOWN, thread will have to call this function at least
> once for each port and wait a few seconds.
> And how in that case we'll know that device is UP again?
> As I already wrote in discussion for this patch, LSC is not an option, because it
> takes at least 5 seconds to detect link state change, which is way too much
> for many real world apps.
> 
> >
> >
> >
> >>
> >>>
> >>> Hi, laurent.hardy
> >>> You are the author of the patch (* net/ixgbe: ensure link status
> >>> is updated); why did you implement the code that way?
> >>> Is it a must to set up the link with wait?
> >>>
> >>> ixgbe_setup_link(hw, speed, true);
> >>>
> >>>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Qi
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is updated")
> >>>>>>>> CC: stable at dpdk.org
> >>>>>>>>
> >>>>>>>> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
> >>>>>>>> ---
> >>>>>>>>  drivers/net/ixgbe/ixgbe_ethdev.c | 43 ++++++++++++++++++++++++--------
> >>>>>>>>  1 file changed, 32 insertions(+), 11 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> >>>>>>>> index 26b192737..a33b9a6e8 100644
> >>>>>>>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> >>>>>>>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> >>>>>>>> @@ -221,6 +221,8 @@ static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev,
> >>>>>>>>  				      struct rte_intr_handle *handle);
> >>>>>>>>  static void ixgbe_dev_interrupt_handler(void *param);
> >>>>>>>>  static void ixgbe_dev_interrupt_delayed_handler(void *param);
> >>>>>>>> +static void ixgbe_dev_setup_link_alarm_handler(void *param);
> >>>>>>>> +
> >>>>>>>>  static int ixgbe_add_rar(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
> >>>>>>>>  			 uint32_t index, uint32_t pool);
> >>>>>>>>  static void ixgbe_remove_rar(struct rte_eth_dev *dev, uint32_t index);
> >>>>>>>> @@ -2791,6 +2793,8 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
> >>>>>>>>
> >>>>>>>>  	PMD_INIT_FUNC_TRACE();
> >>>>>>>>
> >>>>>>>> +	rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, dev);
> >>>>>>>> +
> >>>>>>>>  	/* disable interrupts */
> >>>>>>>>  	ixgbe_disable_intr(hw);
> >>>>>>>>
> >>>>>>>> @@ -3969,6 +3973,25 @@ ixgbevf_check_link(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
> >>>>>>>>  	return ret_val;
> >>>>>>>>  }
> >>>>>>>>
> >>>>>>>> +static void
> >>>>>>>> +ixgbe_dev_setup_link_alarm_handler(void *param)
> >>>>>>>> +{
> >>>>>>>> +	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
> >>>>>>>> +	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>>>>>>> +	struct ixgbe_interrupt *intr =
> >>>>>>>> +		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
> >>>>>>>> +	u32 speed;
> >>>>>>>> +	bool autoneg = false;
> >>>>>>>> +
> >>>>>>>> +	speed = hw->phy.autoneg_advertised;
> >>>>>>>> +	if (!speed)
> >>>>>>>> +		ixgbe_get_link_capabilities(hw, &speed, &autoneg);
> >>>>>>>> +
> >>>>>>>> +	ixgbe_setup_link(hw, speed, true);
> >>>>>>>> +
> >>>>>>>> +	intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>>  /* return 0 means link status changed, -1 means not changed */
> >>>>>>>>  int
> >>>>>>>>  ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
> >>>>>>>> @@ -3981,9 +4004,7 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
> >>>>>>>>  		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
> >>>>>>>>  	int link_up;
> >>>>>>>>  	int diag;
> >>>>>>>> -	u32 speed = 0;
> >>>>>>>>  	int wait = 1;
> >>>>>>>> -	bool autoneg = false;
> >>>>>>>>
> >>>>>>>>  	memset(&link, 0, sizeof(link));
> >>>>>>>>  	link.link_status = ETH_LINK_DOWN;
> >>>>>>>> @@ -3993,13 +4014,8 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
> >>>>>>>>
> >>>>>>>>  	hw->mac.get_link_status = true;
> >>>>>>>>
> >>>>>>>> -	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
> >>>>>>>> -		ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> >>>>>>>> -		speed = hw->phy.autoneg_advertised;
> >>>>>>>> -		if (!speed)
> >>>>>>>> -			ixgbe_get_link_capabilities(hw, &speed, &autoneg);
> >>>>>>>> -		ixgbe_setup_link(hw, speed, true);
> >>>>>>>> -	}
> >>>>>>>> +	if (intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG)
> >>>>>>>> +		return rte_eth_linkstatus_set(dev, &link);
> >>>>>>>>
> >>>>>>>>  	/* check if it needs to wait to complete, if lsc interrupt is enabled */
> >>>>>>>>  	if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc != 0)
> >>>>>>>> @@ -4017,11 +4033,14 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
> >>>>>>>>  	}
> >>>>>>>>
> >>>>>>>>  	if (link_up == 0) {
> >>>>>>>> -		intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> >>>>>>>> +		if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> >>>>>>>> +			intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> >>>>>>>> +			rte_eal_alarm_set(10,
> >>>>>>>> +				ixgbe_dev_setup_link_alarm_handler, dev);
> >>>>>>>> +		}
> >>>>>>>>  		return rte_eth_linkstatus_set(dev, &link);
> >>>>>>>>  	}
> >>>>>>>>
> >>>>>>>> -	intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
> >>>>>>>>  	link.link_status = ETH_LINK_UP;
> >>>>>>>>  	link.link_duplex = ETH_LINK_FULL_DUPLEX;
> >>>>>>>>
> >>>>>>>> @@ -5128,6 +5147,8 @@ ixgbevf_dev_stop(struct rte_eth_dev *dev)
> >>>>>>>>
> >>>>>>>>  	PMD_INIT_FUNC_TRACE();
> >>>>>>>>
> >>>>>>>> +	rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, dev);
> >>>>>>>> +
> >>>>>>>>  	ixgbevf_intr_disable(dev);
> >>>>>>>>
> >>>>>>>>  	hw->adapter_stopped = 1;
> >>>>>>>> --
> >>>>>>>> 2.17.1
> >>>>>>>

