[dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link update

Laurent Hardy laurent.hardy at 6wind.com
Thu Oct 11 14:21:44 CEST 2018


Hi Wei,

Please see comments below.


On 10/11/2018 12:26 PM, Ilya Maximets wrote:
> On 11.10.2018 12:56, Zhao1, Wei wrote:
>> Hi,  Ilya Maximets AND laurent.hardy
> Hi, thanks for sharing your thoughts.
> Comments inline.
>
>>
>>> -----Original Message-----
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ilya Maximets
>>> Sent: Wednesday, September 12, 2018 4:05 PM
>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; dev at dpdk.org
>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
>>> <konstantin.ananyev at intel.com>; Laurent Hardy
>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>;
>>> stable at dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link
>>> update
>>>
>>> On 12.09.2018 09:49, Zhang, Qi Z wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: Ilya Maximets [mailto:i.maximets at samsung.com]
>>>>> Sent: Monday, September 10, 2018 11:09 PM
>>>>> To: Zhang, Qi Z <qi.z.zhang at intel.com>; dev at dpdk.org
>>>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
>>>>> <konstantin.ananyev at intel.com>; Laurent Hardy
>>>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>;
>>>>> stable at dpdk.org
>>>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while
>>>>> fiber link update
>>>>>
>>>>> On 04.09.2018 09:08, Zhang, Qi Z wrote:
>>>>>> Hi Ilya:
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ilya Maximets
>>>>>>> Sent: Friday, August 31, 2018 8:40 PM
>>>>>>> To: dev at dpdk.org
>>>>>>> Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Ananyev, Konstantin
>>>>>>> <konstantin.ananyev at intel.com>; Laurent Hardy
>>>>>>> <laurent.hardy at 6wind.com>; Dai, Wei <wei.dai at intel.com>; Ilya
>>>>>>> Maximets <i.maximets at samsung.com>; stable at dpdk.org
>>>>>>> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber
>>>>>>> link update
>>>>>>>
>>>>>>> If the multispeed fiber link is in DOWN state, ixgbe_setup_link
>>>>>>> could take around a second of busy polling. This is highly
>>>>>>> inconvenient for the case where single thread periodically checks the
>>> link statuses.
>>>>>>> For example, OVS main thread periodically updates the link statuses
>>>>>>> and hangs for a really long time busy waiting on ixgbe_setup_link()
>>>>>>> for a DOWN fiber ports. For case with 3 down ports it hangs for a 3
>>>>>>> seconds and unable to do anything including packet processing.
>>>>>>> Fix that by shifting that workaround to a separate thread by alarm
>>>>>>> handler that will try to set up link if it is DOWN.
>>>>>> Does that mean we will block the interrupt thread for 3 seconds?
>>>>> Three times for one second. Other work could be scheduled between.
>>>>> IMHO, it's much better than blocking usual caller for 3 seconds.
>>>>>
>>>>>> Also, can we guarantee there will not be any race condition if we
>>>>>> call
>>>>> ixgbe_setup_link at another thread, the base code API is not assumed
>>>>> to be thread-safe as I know.
>>>>>
>>>>> The only user of 'ixgbe_setup_link' is 'ixgbe_dev_start', but it
>>>>> could be called only if device stopped. 'ixgbe_dev_stop' cancels the alarm.
>>>>> Race with 'link_update' avoided by 'IXGBE_FLAG_NEED_LINK_CONFIG'
>>> flag.
>>>> I guess, it' not only about when ixgb_setup_link race with itself, but also
>>> when it race with other APIs.
>>>> Also the concern is, even in current version, we can prove there is no issue,
>>> how can we guarantee we are safe for future base code update? It's not
>>> designed as thread-safe.
>>>> For my option, the change is risky.
>>> In current implementation interrupt handler already calls the
>>> 'ixgbe_dev_link_update' which subsequently calls 'ixgbe_setup_link'
>>> in our case if LSC interrupts enabled. So, my change makes the driver even
>>> safer by moving 'ixgbe_setup_link' to the same interrupt thread.
>>> Otherwise two threads (interrupts handler and the link status checking
>>> thread) could call 'ixgbe_setup_link' simultaneously.
>>>
>>>> Btw, since ixgbe support LSC, it is not necessary for "single thread
>>> periodically checks the link statuses", right?
>>>
>>> In current implementation it will take at least 5 seconds (4 + 1) for the
>>> interrupt handler to detect DOWN link state for ixgbe multispeed fiber. This
>>> is too much for many real world cases.
>> I have reviewed  this patch, now I agree with you of the point that when port is down, " main thread
>> periodically updates the link statuses and hangs for a really long time busy waiting on ixgbe_setup_link() for a DOWN fiber ports ".
>> This is introduced by a patch in the following:
>> SHA-1: c12d22f65b132c56db7b4fdbfd5ddce27d1e9572
>> * net/ixgbe: ensure link status is updated
>>
>> Because in this patch, ixgbe_setup_link() is called with input parameter autoneg_wait_to_complete=1, this will cause loop check and sleep delay.
>> At least 82599 seems has this delay.(BTW, whivh type of NIC are you use? X550 or 82599)
> I have 82599.
>
>> Your solution is add a eal_alarm_set for ixgbe_setup_link in the thread of PMD driver, and do the set up work in that thread, is that right?
>> And main thread avoid hang by the flag of IXGBE_FLAG_NEED_LINK_CONFIG.
>> I think this is a good idea for this problem, but it may cause problem for other legacy user of ixgbe pmd, because their legacy code,
>> which use main thread  to check link state and setup_link when port is down, and they are not aware of it is done by other thread if add your patch.
> What are these applications? I see no public API for setup_link function.
> It's internal to driver and should not be used externally.
> Am I missing something?
>
>> And is that ok if we change code in ixgbe_dev_link_update_share() to
>>
>> ixgbe_dev_link_update_share()
>> {
>>
>> 	/* check if it needs to wait to complete, if lsc interrupt is enabled */
>> 	if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc != 0)
>> 		wait = 0;
>>
>> 	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
>> 		ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
>> 		speed = hw->phy.autoneg_advertised;
>> 		if (!speed)
>> 			ixgbe_get_link_capabilities(hw, &speed, &autoneg);
>> 		ixgbe_setup_link(hw, speed, wait);
>> 	}
>> }
>>
>> Then, your application can call rte_eth_link_get_nowait () to make wait_to_complete=0 when doing periodic link state check,
>> Which will not cause  loop check and sleep delay. Legacy code of other user call rte_eth_link_get()  will not be affected also.
>> But, I am NOT confident ,whether this will introduce new problem when set up link without wait!
>> So, this is just a discussion topic.
> Unfortunately this will not help. Take a look to the function
> 'ixgbe_setup_mac_link_multispeed_fiber()', which is the main problematic
> function here. 'wait_to_complete' here used only as argument for
> ixgbe_setup_mac_link(), and the busy waiting loops are outside of it.
> Regardless of the 'wait_to_complete' value, this function will busy
> poll the link for 1040 ms trying to setup 10GB speed and 140ms more
> trying to setup 1GB speed. After that, it will call itself recursively
> and wait again... Looks like I miscalculated last time. Right now it'll
> be more than 2 seconds for each down port since following patch merged:
> 8fc1f32fa615 ("net/ixgbe: wait longer for link after fiber MAC setup").
>
>> Hi, laurent.hardy
>>   You are the author for the patch (* net/ixgbe: ensure link status is updated), why do you implement code that way?
>> Is that must that  set up link with wait?
>>
>> ixgbe_setup_link(hw, speed, true);
>>
The main issue which has lead to this patch has been reported through a 
test case with the autoneg enabled (which has been also reported in the 
pmd test provided along with the patch: 
http://patches.dpdk.org/comment/46253/).
In this context, without the flag set the patch wasn't effective.
>>
>>>>>> Regards
>>>>>> Qi
>>>>>>
>>>>>>> Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is updated")
>>>>>>> CC: stable at dpdk.org
>>>>>>>
>>>>>>> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
>>>>>>> ---
>>>>>>>   drivers/net/ixgbe/ixgbe_ethdev.c | 43
>>>>>>> ++++++++++++++++++++++++--------
>>>>>>>   1 file changed, 32 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
>>>>>>> b/drivers/net/ixgbe/ixgbe_ethdev.c
>>>>>>> index 26b192737..a33b9a6e8 100644
>>>>>>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
>>>>>>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>>>>>>> @@ -221,6 +221,8 @@ static int ixgbe_dev_interrupt_action(struct
>>>>>>> rte_eth_dev *dev,
>>>>>>>   				      struct rte_intr_handle *handle);  static
>>> void
>>>>>>> ixgbe_dev_interrupt_handler(void *param);  static void
>>>>>>> ixgbe_dev_interrupt_delayed_handler(void *param);
>>>>>>> +static void ixgbe_dev_setup_link_alarm_handler(void *param);
>>>>>>> +
>>>>>>>   static int ixgbe_add_rar(struct rte_eth_dev *dev, struct
>>>>>>> ether_addr *mac_addr,
>>>>>>>   			 uint32_t index, uint32_t pool);  static void
>>>>>>> ixgbe_remove_rar(struct rte_eth_dev *dev, uint32_t index); @@
>>>>>>> -2791,6 +2793,8 @@ ixgbe_dev_stop(struct rte_eth_dev *dev)
>>>>>>>
>>>>>>>   	PMD_INIT_FUNC_TRACE();
>>>>>>>
>>>>>>> +	rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, dev);
>>>>>>> +
>>>>>>>   	/* disable interrupts */
>>>>>>>   	ixgbe_disable_intr(hw);
>>>>>>>
>>>>>>> @@ -3969,6 +3973,25 @@ ixgbevf_check_link(struct ixgbe_hw *hw,
>>>>>>> ixgbe_link_speed *speed,
>>>>>>>   	return ret_val;
>>>>>>>   }
>>>>>>>
>>>>>>> +static void
>>>>>>> +ixgbe_dev_setup_link_alarm_handler(void *param) {
>>>>>>> +	struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
>>>>>>> +	struct ixgbe_hw *hw =
>>>>>>> IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>>>>>>> +	struct ixgbe_interrupt *intr =
>>>>>>> +		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
>>>>>>> +	u32 speed;
>>>>>>> +	bool autoneg = false;
>>>>>>> +
>>>>>>> +	speed = hw->phy.autoneg_advertised;
>>>>>>> +	if (!speed)
>>>>>>> +		ixgbe_get_link_capabilities(hw, &speed, &autoneg);
>>>>>>> +
>>>>>>> +	ixgbe_setup_link(hw, speed, true);
>>>>>>> +
>>>>>>> +	intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG; }
>>>>>>> +
>>>>>>>   /* return 0 means link status changed, -1 means not changed */
>>>>>>> int ixgbe_dev_link_update_share(struct rte_eth_dev *dev, @@ -
>>> 3981,9
>>>>>>> +4004,7 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
>>>>>>>   		IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
>>>>>>>   	int link_up;
>>>>>>>   	int diag;
>>>>>>> -	u32 speed = 0;
>>>>>>>   	int wait = 1;
>>>>>>> -	bool autoneg = false;
>>>>>>>
>>>>>>>   	memset(&link, 0, sizeof(link));
>>>>>>>   	link.link_status = ETH_LINK_DOWN; @@ -3993,13 +4014,8 @@
>>>>>>> ixgbe_dev_link_update_share(struct
>>>>> rte_eth_dev
>>>>>>> *dev,
>>>>>>>
>>>>>>>   	hw->mac.get_link_status = true;
>>>>>>>
>>>>>>> -	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
>>>>>>> -		ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
>>>>>>> -		speed = hw->phy.autoneg_advertised;
>>>>>>> -		if (!speed)
>>>>>>> -			ixgbe_get_link_capabilities(hw, &speed, &autoneg);
>>>>>>> -		ixgbe_setup_link(hw, speed, true);
>>>>>>> -	}
>>>>>>> +	if (intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG)
>>>>>>> +		return rte_eth_linkstatus_set(dev, &link);
>>>>>>>
>>>>>>>   	/* check if it needs to wait to complete, if lsc interrupt is enabled */
>>>>>>>   	if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc !=
>>>>>>> 0) @@
>>>>>>> -4017,11 +4033,14 @@ ixgbe_dev_link_update_share(struct
>>> rte_eth_dev
>>>>> *dev,
>>>>>>>   	}
>>>>>>>
>>>>>>>   	if (link_up == 0) {
>>>>>>> -		intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
>>>>>>> +		if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber)
>>> {
>>>>>>> +			intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
>>>>>>> +			rte_eal_alarm_set(10,
>>>>>>> +				ixgbe_dev_setup_link_alarm_handler, dev);
>>>>>>> +		}
>>>>>>>   		return rte_eth_linkstatus_set(dev, &link);
>>>>>>>   	}
>>>>>>>
>>>>>>> -	intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
>>>>>>>   	link.link_status = ETH_LINK_UP;
>>>>>>>   	link.link_duplex = ETH_LINK_FULL_DUPLEX;
>>>>>>>
>>>>>>> @@ -5128,6 +5147,8 @@ ixgbevf_dev_stop(struct rte_eth_dev *dev)
>>>>>>>
>>>>>>>   	PMD_INIT_FUNC_TRACE();
>>>>>>>
>>>>>>> +	rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, dev);
>>>>>>> +
>>>>>>>   	ixgbevf_intr_disable(dev);
>>>>>>>
>>>>>>>   	hw->adapter_stopped = 1;
>>>>>>> --
>>>>>>> 2.17.1



More information about the dev mailing list