[dpdk-dev] link state change not consistent (Flaps/stays DOWN/UP for ever)

Yeddula, Avinash ayeddula at ciena.com
Wed Aug 23 02:46:23 CEST 2017


We are seeing similar behavior with testpmd as well.


  1.  Keep ports 0/1/2/3 in Up/Up/Down/Down status to start the actual test.

  2.  The following commands change the link status twice on each port, so afterwards we should see the same status as in (1).

  3.  Testpmd: port stop 0;port start 0;port stop 1;port start 1;port start 2; port stop 2; port start 3;port stop 3

  4.  Repeat step 2, 3 -5 times. The app gets into a fault state and never recovers (one of the 10G ports goes into the "DOWN" state and never comes UP again).
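
The expectation in the steps above (two link changes per port, ending in the starting state) can be checked with a small model. This is only a sketch of the expected behavior, not DPDK code: it assumes that "port start" always brings the link UP and "port stop" always brings it DOWN, and the names INITIAL, COMMANDS and run_commands are made up for illustration.

```python
# Toy model of the expected link state, NOT real DPDK/testpmd behavior.
# Assumption: "port start N" brings the link UP, "port stop N" brings it DOWN.

INITIAL = {0: "UP", 1: "UP", 2: "DOWN", 3: "DOWN"}

# Command string copied from the test step above.
COMMANDS = ("port stop 0;port start 0;port stop 1;port start 1;"
            "port start 2; port stop 2; port start 3;port stop 3")


def run_commands(status, commands):
    """Apply ';'-separated testpmd port commands to a status map.

    Returns (final_status, changes), where changes counts how many
    times each port's modeled link state actually flipped.
    """
    final = dict(status)
    changes = {port: 0 for port in status}
    for cmd in commands.split(";"):
        parts = cmd.split()
        if not parts:
            continue  # tolerate stray separators
        _, action, port_str = parts
        port = int(port_str)
        new_state = "UP" if action == "start" else "DOWN"
        if final[port] != new_state:
            changes[port] += 1
        final[port] = new_state
    return final, changes


if __name__ == "__main__":
    state = dict(INITIAL)
    for iteration in range(1, 6):  # repeat the sequence 5 times
        state, changes = run_commands(state, COMMANDS)
        print(iteration, state, changes)
```

Under that assumption every port changes state exactly twice per pass and ends where it started, no matter how often the sequence is repeated. A port that stays DOWN after its "port start" therefore points at the driver or hardware rather than at the command sequence.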

Thanks
-Avinash

From: Yeddula, Avinash
Sent: Tuesday, August 22, 2017 3:08 PM
To: dev at dpdk.org
Cc: Gajarampalli, Prasanth <pgajaram at ciena.com>
Subject: link state change not consistent (Flaps/stays DOWN/UP for ever)

Hi All,

With the DPDK stable 17.05 branch, we are seeing an issue with the link state of 10G ports on Broadwell.
The link status of one of the ports keeps flapping at regular intervals, typically starting a few seconds after the link appears to have settled.
Other behaviors include not honoring a link state change set by the app. So far this has been observed with ixgbe only, but we have not yet checked whether the issue exists in other drivers.
This issue was not seen with 16.11.2. Is this a known issue? If yes, are there any patches that fix it?

We applied the following patches (from between 17.05 and 17.08) and found no difference in behavior.

net/ixgbe: improve link state check on VF
In current implementation, when checking VF link state, PF state
is checked too, although the function has a parameter to tell
if PF state checking is needed.
But in some scenario, user may not care about the PF state.
This patch enables the unused parameter to only check the VF
link state.

net/ixgbe: fix LSC interrupt
If LSC flag is changed to off at last device start, the
enable flag is not cleared in HW.
This patch fixes it.

net/e1000: fix LSC interrupt
If LSC flag is changed to off at last device start, the
enable flag is not cleared in HW.
This patch fixes it.

ethdev: add deferred intermediate device state
This device state means that the device is managed externally, by
whichever party has set this state (PMD or application).

Note: this new device state is only an information. The related device
structure and operators are still valid and can be used normally.

It is however made private by device management helpers within ethdev,
making the device invisible to applications.

ethdev: count devices consistently
Make the rte_eth_dev_count() return the number of available devices even
after some are detached by the hotplug API or put in a deferred state.

net/ixgbe: fix Rx/Tx queue interrupt for x550 devices
x550 devices don't map interrupt vector before enabling Rx/Tx queue
interrupt.
Because of this interrupt mode is not working for x550 devices.

igb_uio: issue FLR during open and release of device file
Set UIO info device file operations open and release. Call pci reset
function inside open and release to clear device state at start and end.
Copied this behaviour from vfio_pci kernel module code. With this patch,
it is not mandatory to issue FLR by PMD's during init and close.

Bus master enable and disable are added in open and release respectively
to take care of device DMA.

ethdev: fix device state on detach
The device state should be handled by the ethdev layer when possible.
Applications should not have to do it.

Not setting the state to UNUSED will make the port_id of the device
valid for all ethdev API functions, usually resulting in segfault.



