[dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF
Olivier Matz
olivier.matz at 6wind.com
Mon May 16 14:01:28 CEST 2016
Hi Wenzhuo,
On 05/04/2016 11:10 PM, Wenzhuo Lu wrote:
> When the physical link is down and recover later,
> the VF link cannot recover until the user stop and
> start it manually.
> This patch implements the automatic recovery of VF
> port.
> The automatic recovery bases on the link up/down
> message received from PF. When VF receives the link
> up/down message, it will replace the RX/TX and
> operation functions with fake ones to stop RX/TX
> and any future operation. Then reset the VF port.
> After successfully resetting the port, recover the
> RX/TX and operation functions.
>
> Signed-off-by: Wenzhuo Lu <wenzhuo.lu at intel.com>
>
> [...]
>
> +void
> +ixgbevf_dev_link_up_down_handler(struct rte_eth_dev *dev)
> +{
> + struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + struct ixgbe_adapter *adapter =
> + (struct ixgbe_adapter *)dev->data->dev_private;
> + int diag;
> + uint32_t vteiam;
> +
> + /* Only one working core need to performance VF reset */
> + if (rte_spinlock_trylock(&adapter->vf_reset_lock)) {
> + /**
> + * When fake rec/xmit is replaced, working thread may is running
> + * into real RX/TX func, so wait long enough to assume all
> + * working thread exit. The assumption is it will spend less
> + * than 100us for each execution of RX and TX func.
> + */
> + rte_delay_us(100);
> +
> + do {
> + dev->data->dev_started = 0;
> + ixgbevf_dev_stop(dev);
> + rte_delay_us(1000000);
If I understand correctly, ixgbevf_dev_link_up_down_handler() is called
by ixgbevf_recv_pkts_fake() on a dataplane core. This means that the
core that acquires the lock will busy-wait for at least 100us + 1s.
If this core is also in charge of polling other queues on other
ports, or timers, many packets will be dropped (even with only the
100us delay). I don't think it is acceptable to actively wait inside
an rx function.
I think delegating this work to the application would avoid many
issues, for instance by notifying it that the port is in a bad state
and must be restarted. The application could then cleanly stop
polling the queues, then stop and restart the port in a separate thread,
without disturbing the dataplane cores.
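To make the suggestion concrete, here is a rough sketch of what the
application side could look like. It is not taken from the patch and
not tied to any existing ethdev event: struct port_ctx, on_link_event(),
dataplane_poll_once(), control_service_once(), port_stop() and
port_start() are all hypothetical names, the last two standing in for
rte_eth_dev_stop() and rte_eth_dev_start(). It assumes one dataplane
core and one control thread per port.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-port state shared by the notification path,
 * the dataplane loop and the control thread. */
enum port_state { PORT_RUNNING, PORT_NEEDS_RESET, PORT_RESETTING };

struct port_ctx {
	_Atomic int state;     /* one of enum port_state */
	atomic_bool quiesced;  /* set by the dataplane core once it stopped polling */
	unsigned int polls;    /* number of rx polls done (illustration only) */
	unsigned int restarts; /* number of completed restarts */
};

/* Stubs standing in for rte_eth_dev_stop() / rte_eth_dev_start(). */
static void port_stop(struct port_ctx *p) { (void)p; }
static int port_start(struct port_ctx *p) { p->restarts++; return 0; }

/* Notification from the driver (assumed mechanism): only flags the
 * port, never blocks or sleeps. */
static void on_link_event(struct port_ctx *p)
{
	int expected = PORT_RUNNING;

	atomic_compare_exchange_strong(&p->state, &expected, PORT_NEEDS_RESET);
}

/* One iteration of the dataplane loop. It skips a port that is not
 * running and acknowledges that it has left the rx path. */
static bool dataplane_poll_once(struct port_ctx *p)
{
	if (atomic_load(&p->state) != PORT_RUNNING) {
		atomic_store(&p->quiesced, true);
		return false;
	}
	p->polls++; /* rte_eth_rx_burst() would be called here */
	return true;
}

/* One iteration of the control thread. Any delay needed by the reset
 * would live here, off the dataplane cores. */
static bool control_service_once(struct port_ctx *p)
{
	if (atomic_load(&p->state) != PORT_NEEDS_RESET ||
	    !atomic_load(&p->quiesced))
		return false; /* nothing to do, or dataplane still polling */

	atomic_store(&p->state, PORT_RESETTING);
	port_stop(p);
	if (port_start(p) != 0) {
		/* retry on a later iteration */
		atomic_store(&p->state, PORT_NEEDS_RESET);
		return false;
	}
	atomic_store(&p->quiesced, false);
	atomic_store(&p->state, PORT_RUNNING);
	return true;
}
```

The dataplane core only pays for one atomic load per poll, and the
control thread does not touch the port until the dataplane core has
acknowledged that it stopped polling. With several dataplane cores per
port, the single quiesced flag would have to become a per-core
acknowledgement.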
Regards,
Olivier