[dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF

Olivier Matz olivier.matz at 6wind.com
Mon May 16 14:01:28 CEST 2016


Hi Wenzhuo,

On 05/04/2016 11:10 PM, Wenzhuo Lu wrote:
> When the physical link is down and recover later,
> the VF link cannot recover until the user stop and
> start it manually.
> This patch implements the automatic recovery of VF
> port.
> The automatic recovery bases on the link up/down
> message received from PF. When VF receives the link
> up/down message, it will replace the RX/TX and
> operation functions with fake ones to stop RX/TX
> and any future operation. Then reset the VF port.
> After successfully resetting the port, recover the
> RX/TX and operation functions.
> 
> Signed-off-by: Wenzhuo Lu <wenzhuo.lu at intel.com>
> 
> [...]
> 
> +void
> +ixgbevf_dev_link_up_down_handler(struct rte_eth_dev *dev)
> +{
> +	struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> +	struct ixgbe_adapter *adapter =
> +		(struct ixgbe_adapter *)dev->data->dev_private;
> +	int diag;
> +	uint32_t vteiam;
> +
> +	/* Only one working core need to performance VF reset */
> +	if (rte_spinlock_trylock(&adapter->vf_reset_lock)) {
> +		/**
> +		 * When fake rec/xmit is replaced, working thread may is running
> +		 * into real RX/TX func, so wait long enough to assume all
> +		 * working thread exit. The assumption is it will spend less
> +		 * than 100us for each execution of RX and TX func.
> +		 */
> +		rte_delay_us(100);
> +
> +		do {
> +			dev->data->dev_started = 0;
> +			ixgbevf_dev_stop(dev);
> +			rte_delay_us(1000000);

If I understand correctly, ixgbevf_dev_link_up_down_handler() is called
by ixgbevf_recv_pkts_fake() on a dataplane core. This means that the
core that acquires the lock will busy-wait for at least 100us + 1 second.
If this core is also in charge of polling other queues of other
ports, or timers, many packets will be dropped (even with only the
100us delay). I don't think it is acceptable to actively wait inside an
rx function.
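
To put rough numbers on the drop concern, here is a back-of-the-envelope
sketch. The link speed (10 Gb/s), frame size (64 bytes) and ring size
(512 descriptors) are assumptions chosen for illustration, not values
taken from the patch, and the helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Packets per second at line rate. Each Ethernet frame occupies
 * frame_bytes on the wire plus 20 bytes of preamble, start-of-frame
 * delimiter and inter-frame gap.
 */
static uint64_t line_rate_pps(uint64_t link_bps, uint64_t frame_bytes)
{
	return link_bps / ((frame_bytes + 20) * 8);
}

/* Packets that arrive while a core is stalled for stall_us microseconds. */
static uint64_t pkts_during_stall(uint64_t pps, uint64_t stall_us)
{
	return pps * stall_us / 1000000;
}

/*
 * Packets lost on a queue that is not polled during the stall,
 * assuming its RX ring (ring_size descriptors) was empty beforehand.
 */
static uint64_t pkts_dropped(uint64_t pps, uint64_t stall_us,
			     uint64_t ring_size)
{
	uint64_t arrived = pkts_during_stall(pps, stall_us);

	return arrived > ring_size ? arrived - ring_size : 0;
}
```

Under these assumptions, about 1488 frames arrive during a 100us stall,
which already exceeds a 512-entry ring on any other queue served by the
same core. The 1 second delay is four orders of magnitude worse.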

I think delegating this work to the application would avoid many
issues, for instance by notifying it that the port is in a bad state
and must be restarted. The application could then properly stop
polling the queues, and stop and restart the port in a separate thread,
without disturbing the dataplane cores.
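
A minimal sketch of that split, assuming a per-port state flag shared
between the driver notification, the dataplane loop and a control
thread. All names here (port_ctx, port_notify_link_change, and so on)
are hypothetical and are not part of the ixgbe driver or the DPDK API;
the actual stop/start calls are replaced by a counter.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port state, shared by dataplane and control thread. */
enum port_state { PORT_RUNNING, PORT_NEEDS_RESET, PORT_RESETTING };

struct port_ctx {
	atomic_int state;              /* one of enum port_state */
	atomic_uint_fast64_t rx_polls; /* times the dataplane polled the port */
	atomic_uint_fast64_t resets;   /* resets done by the control thread */
};

static void port_init(struct port_ctx *p)
{
	atomic_init(&p->state, PORT_RUNNING);
	atomic_init(&p->rx_polls, 0);
	atomic_init(&p->resets, 0);
}

/* Notification path (link up/down message from PF): flag only, no waiting. */
static void port_notify_link_change(struct port_ctx *p)
{
	int expected = PORT_RUNNING;

	atomic_compare_exchange_strong(&p->state, &expected, PORT_NEEDS_RESET);
}

/*
 * One dataplane iteration for this port. The port is skipped while it
 * is not running, so the core keeps serving its other queues and timers.
 */
static bool dataplane_poll_once(struct port_ctx *p)
{
	if (atomic_load(&p->state) != PORT_RUNNING)
		return false;
	atomic_fetch_add(&p->rx_polls, 1); /* stand-in for the RX burst */
	return true;
}

/*
 * Control thread: the slow stop / wait / start sequence belongs here.
 * Returns true if a reset was performed.
 */
static bool control_handle_reset(struct port_ctx *p)
{
	int expected = PORT_NEEDS_RESET;

	if (!atomic_compare_exchange_strong(&p->state, &expected,
					    PORT_RESETTING))
		return false;
	atomic_fetch_add(&p->resets, 1); /* stand-in for port stop + start */
	atomic_store(&p->state, PORT_RUNNING);
	return true;
}
```

A real application would also need to make sure that no dataplane core
is still inside the RX/TX functions of the port before stopping it;
this sketch leaves that synchronization out.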


Regards,
Olivier

