[dpdk-dev] [PATCH] igb_uio: revert open and release operations

Tan, Jianfeng jianfeng.tan at intel.com
Fri Oct 20 17:26:15 CEST 2017


Hi Ferruh & Harish,


On 10/20/2017 9:15 AM, Ferruh Yigit wrote:
> On 10/19/2017 3:43 PM, Patil, Harish wrote:
>> -----Original Message-----
>> From: Harish Patil <Harish.Patil at cavium.com>
>> Date: Tuesday, October 17, 2017 at 9:50 PM
>> To: Thomas Monjalon <thomas at monjalon.net>, Ferruh Yigit
>> <ferruh.yigit at intel.com>
>> Cc: "dev at dpdk.org" <dev at dpdk.org>, Jianfeng Tan <jianfeng.tan at intel.com>,
>> Jingjing Wu <jingjing.wu at intel.com>, "Thotton, Shijith"
>> <Shijith.Thotton at cavium.com>, Gregory Etelson <gregory at weka.io>, George
>> Prekas <george.prekas at epfl.ch>, "stable at dpdk.org" <stable at dpdk.org>
>> Subject: Re: [PATCH] igb_uio: revert open and release operations
>>>
>>> -----Original Message-----
>>> From: Thomas Monjalon <thomas at monjalon.net>
>>> Date: Tuesday, October 17, 2017 at 1:33 PM
>>> To: Ferruh Yigit <ferruh.yigit at intel.com>, Harish Patil
>>> <Harish.Patil at cavium.com>
>>> Cc: "dev at dpdk.org" <dev at dpdk.org>, Jianfeng Tan <jianfeng.tan at intel.com>,
>>> Jingjing Wu <jingjing.wu at intel.com>, "Thotton, Shijith"
>>> <Shijith.Thotton at cavium.com>, Gregory Etelson <gregory at weka.io>, George
>>> Prekas <george.prekas at epfl.ch>, "stable at dpdk.org" <stable at dpdk.org>
>>> Subject: Re: [PATCH] igb_uio: revert open and release operations
>>>
>>>> 17/10/2017 22:14, Ferruh Yigit:
>>>>> There were bug reports about terminated application may leave device in
>>>>> undesired state:
>>>>> http://dpdk.org/ml/archives/dev/2016-November/049745.html
>>>>> http://dpdk.org/ml/archives/dev/2016-November/050932.html
>>>>>
>>>>> And a proposal to fix:
>>>>> http://dpdk.org/ml/archives/dev/2016-December/051844.html
>>>>>
>>>>> Later another proposal triggered the discussion:
>>>>> http://dpdk.org/ml/archives/dev/2017-May/066317.html
>>>>>
>>>>> Finally a fix patch pushed into v17.08:
>>>>> Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
>>>>> device file")
>>>>>
>>>>> Later a regression report sent related to the pushed patch:
>>>>> http://dpdk.org/ml/archives/dev/2017-September/075236.html
>>>>>
>>>>> And a fix for regression integrated into v17.11-rc1:
>>>>> http://dpdk.org/ml/archives/dev/2017-October/079166.html
>>>>> Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in
>>>>> VM")
>>>>> Commit: 6b9ed026a870 ("igb_uio: fix build with kernel <= 3.17")
>>>>>
>>>>> Even after the fix qede PMD reported to be broken:
>>>>> http://dpdk.org/ml/archives/dev/2017-October/079359.html
>>>>>
>>>>> So this patch reverts original fix and related commits. The related
>>>>> igb_uio code part turns back to v17.05 base.
>>>> [...]
>>>>> ---
>>>>> It would be nice to solve this issue in LTS release, but being close to
>>>>> the release and the error report without details makes it hard to work
>>>>> more on this issue.
>>>> With this revert, we are back to the original issue.
>>>> We must really try the proposed solution.
>>>> Harish, please describe your issue and think how it could be fixed.
>>>> Jingjing made it work for i40e.
>>>> I know it is less effort to request a simple revert.
>>>> Please let's try to fix it once for all.
>>> Hi Ferruh/Thomas,
>>> I’m discussing it internally, so please hold on committing this patch till
>>> I revert back to you.
>>>
>>> Thanks.
>>> [Harish]
>>> 1) With the introduction of:
>>> Commit: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of
>>> device file”)
>>     We saw failures with qede PF & SR-IOV VF initialization in PCI passthru
>> mode.
>>
>>> PF PCI passthru mode initialization failure was resolved by:
>>> “Commit: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in
>>> VM")
> Thank you for the update.
>
>>> SR-IOV VF PCI passthru mode initialization issue is that PCI FLR and
>>> related device cleanup is not completed by the time VF driver starts
>>> loading. It results in the mbox command failure sent over the HW channel
>>> between VF and PF.
> This seems same reason why i40e added a check and wait loop.
>
>>> Even though pci_reset_function() waits for the stipulated amount of time
>>> per standards, VF FLR takes longer than that and pci_reset_function() &
>>> igb_uio_open() call returns before FLR completes and VF PMD driver tries
>>> to load before FLR completes leading to VF PMD initialization failure.
>>>
>>>
>>> We can work around this problem by adding driver delay/retry logic since
>>> there is no deterministic way of detecting FLR completion. But this is
>>> going to increase the driver load time.
>>>
>>> 2) With the above patch ("igb_uio: issue FLR during open and release of
>>> device file), FLR is going to be issued unconditionally on all devices
>>> during igb_uio_open. We think it’s an over kill. FLR is required only for
>>> previous abnormal app termination. We already handle the abnormal app
>>> termination by doing necessary cleanup in the driver during load. This
>>> cleanup is more efficient as it is done only when required. So we feel
>>> that the drivers/devices needing such cleanup (the two cases listed
>>> below)  should do it conditionally when required rather than
>>> igb_uio_open() unconditionally performing FLR all the time.
>>>
>>> e.g.,
>>     - cdb166963cae ("net/liquidio: add API for VF FLR”)
> Both 1) and 2) related to the pci_reset during open().
> But the main functionality we are looking for is the pci_reset in release().
> So we can remove reset during open() [1].
>
> Will disabling pci_reset in open() [2] solve your problems?
>
> Jingjing, Jianfeng,
> What do you think?

If [2] can work for all drivers, I agree with this option.

Thanks,
Jianfeng

>
>
> [1]
> Perhaps will need to revert or partially revert liquidio patch after this.
>
> [2]
> disable line "pci_reset_function(dev);" in the igbuio_pci_open()
> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/igb_uio/igb_uio.c#n339
>



More information about the dev mailing list