[dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory

Ferruh Yigit ferruh.yigit at intel.com
Wed Jan 15 19:35:18 CET 2020


On 1/15/2020 6:49 AM, 方统浩50450 wrote:
> Hi Ferruh, thanks for your message.
> 
> 
> We developed an ethtool-dpdk tool that runs as a secondary process, based on DPDK 17.08. Our device
> supports hotplug detach, but hotplug detach fails when we use ethtool-dpdk. We found that the
> secondary process changes the shared memory when initializing. The secondary process calls
> "rte_eth_dev_pci_allocate" and enters "rte_eth_copy_pci_info"
> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info).
> It then sets "rte_eth_dev_data.dev_flags" to zero. On our platform this value
> is 0x0003 (RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC), but after "dev_flags" is reset
> the value becomes 0x0002 (RTE_ETH_DEV_INTR_LSC only; RTE_ETH_DEV_DETACHABLE is lost), so our device
> hotplug detach fails. I found a similar problem in other DPDK versions, including DPDK 19.11. Even though
> the device hotplug detach has been dropped there, I think the change to shared memory is unexpected by the
> primary process.

I agree this is the problem.
In the driver code, 'rte_eth_copy_pci_info' is called only by the primary process,
but the generic code is faulty.

In 19.11, 'eth_dev_pci_specific_init' additionally seems to have the same problem.
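
To make the failure mode concrete, here is a minimal stand-alone sketch of it.
It does not use DPDK headers: every demo_* / DEMO_* name is hypothetical, and
the flag values are stand-ins chosen to match the 0x0003 -> 0x0002 change
reported above, assuming the bit that gets lost is the detachable one.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in values chosen to match the report (0x0003 before, 0x0002 after).
 * They are illustrative and not copied from any DPDK header. */
#define DEMO_DEV_DETACHABLE 0x0001u
#define DEMO_DEV_INTR_LSC   0x0002u
#define DEMO_DRV_INTR_LSC   0x0008u /* hypothetical driver capability bit */

/* Models the part of rte_eth_dev_data that lives in shared memory. */
struct demo_shared_data {
	uint32_t dev_flags;
};

/* Models the unguarded helper: resets the flags, then re-derives only
 * the flags it knows about from the driver flags. */
static void
demo_copy_pci_info(struct demo_shared_data *data, uint32_t drv_flags)
{
	data->dev_flags = 0;
	if (drv_flags & DEMO_DRV_INTR_LSC)
		data->dev_flags |= DEMO_DEV_INTR_LSC;
}

/* Primary sets up the port, then a secondary probes the same port. */
static uint32_t
demo_flags_after_secondary_probe(void)
{
	struct demo_shared_data shared = { 0 };

	/* Primary: helper runs first, then the driver adds DETACHABLE. */
	demo_copy_pci_info(&shared, DEMO_DRV_INTR_LSC);
	shared.dev_flags |= DEMO_DEV_DETACHABLE;
	assert(shared.dev_flags == 0x0003u);

	/* Secondary: same helper, same shared struct, flags are rebuilt. */
	demo_copy_pci_info(&shared, DEMO_DRV_INTR_LSC);
	return shared.dev_flags;
}
```

In this model the secondary's probe leaves only the LSC bit set, so anything
the primary's driver added after the helper ran is silently dropped.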

> 
> 
> Our driver is ixgbe, but I think this problem has little to do with the driver: the secondary process
> enters "rte_eth_copy_pci_info" through "rte_eth_dev_pci_allocate". I agree with your opinion that the helper
> function should stay simple in what it does. I see two ways to fix this problem. One is to add an if-statement
> in "rte_eth_dev_pci_allocate" to keep the secondary process from entering "rte_eth_copy_pci_info";
> the other is to add an if-statement in "rte_eth_copy_pci_info" to keep the secondary process from changing
> shared memory. The first way needs to ensure that "rte_eth_copy_pci_info" is not called anywhere else.
> I think the second way is simpler and lower risk.

Yes, these are the two options.

I agree that adding the check in 'rte_eth_copy_pci_info' covers all cases and is safer.
BUT my concern was that it adds decision making to a simple/leaf function and makes it
harder to debug/use, instead of deciding at a higher level what the primary/secondary
process should call.
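
For reference, the "decide at a higher level" option could look roughly like
the sketch below. This is only an illustration with hypothetical demo_* names;
the 'is_primary' argument stands in for a
'rte_eal_process_type() == RTE_PROC_PRIMARY' check in the real allocate path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models the data block shared by primary and secondary processes. */
struct demo_shared_data {
	uint32_t dev_flags;
};

struct demo_eth_dev {
	struct demo_shared_data *data; /* points into shared memory */
};

/* Leaf helper stays unconditional: it only copies what it is given. */
static void
demo_copy_info(struct demo_eth_dev *dev, uint32_t flags)
{
	assert(dev != NULL && dev->data != NULL);
	dev->data->dev_flags = flags;
}

/* Hypothetical allocate path: the primary/secondary decision lives here,
 * so the helper itself needs no process-type logic. */
static void
demo_allocate(struct demo_eth_dev *dev, uint32_t flags, bool is_primary)
{
	if (is_primary)
		demo_copy_info(dev, flags);
	/* Secondary: attach only, shared data stays as the primary wrote it. */
}
```

The helper stays trivial, but every other caller of the helper has to
repeat the same check, which is the weakness of this option.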

But I just realized that some PMDs, like mlx4 or szedata2, are calling
'rte_eth_copy_pci_info' in the secondary process, and most probably this is not
their intention.
Also, 'eth_dev->intr_handle' is set in 'rte_eth_copy_pci_info', so not calling this
function may have the side effect of 'eth_dev->intr_handle' not being set in the secondary.
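
A small sketch of why the check inside the helper avoids that side effect:
the per-process pointer is still set for every caller, and only the writes to
shared data are limited to the primary. All demo_* names are hypothetical and
'is_primary' again stands in for the real process-type check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_intr_handle {
	int fd;
};

/* Models the data block shared by primary and secondary processes. */
struct demo_shared_data {
	uint32_t dev_flags;
	int numa_node;
};

struct demo_eth_dev {
	struct demo_intr_handle *intr_handle; /* per-process pointer */
	struct demo_shared_data *data;        /* points into shared memory */
};

/* Guarded helper: local state is always set, shared state only by primary. */
static void
demo_copy_info_guarded(struct demo_eth_dev *dev, struct demo_intr_handle *ih,
		       uint32_t flags, int numa_node, bool is_primary)
{
	assert(dev != NULL && dev->data != NULL);

	/* Every process needs its own valid interrupt handle pointer. */
	dev->intr_handle = ih;

	/* Only the primary may (re)initialize the shared fields. */
	if (is_primary) {
		dev->data->dev_flags = flags;
		dev->data->numa_node = numa_node;
	}
}
```

A secondary calling this gets its 'intr_handle' set while the shared flags
and NUMA node stay exactly as the primary left them.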

With the above considerations I am OK with your proposal to cover all cases. Thomas,
Andrew, any concerns?

@Fang, can you please just make a new version that updates the
'rte_eth_copy_pci_info' function comment to document that shared data is not updated
for the secondary process?

Thanks,
ferruh

> 
> 
> Please forgive my poor English.
> 
> 
> 
> From: Ferruh Yigit <ferruh.yigit at intel.com>
> Sent: 2020-01-14 22:45:33
> To: Fang TongHao <fangtonghao at sangfor.com.cn>, thomas at monjalon.net, arybchenko at solarflare.com
> Cc: dev at dpdk.org, stable at dpdk.org, jia.guo at intel.com, cunming.liang at intel.com, qi.z.zhang at intel.com, jungle845943968 at outlook.com
> Subject: Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory
>
>> On 1/13/2020 5:03 AM, Fang TongHao wrote:
>>> The secondary process calls "rte_eth_dev_pci_allocate"
>>> and enters "rte_eth_copy_pci_info"
>>> when initializing. It then sets
>>> "rte_eth_dev_data.dev_flags" to zero and rebuilds it,
>>> but this struct is shared by the primary process and
>>> the secondary process. Fix this bug by adding an
>>> if-statement that forbids the secondary process from changing
>>> the above-mentioned value.
>>
>> Hi Fang,
>>
>> Thanks for the fix. I agree with the problem statement, but I am not sure whether this
>> should be handled in the helper function or in the place where the function is
>> called. The helper function is simple in what it does; do we need to put the primary
>> process logic in it?
>>
>> Can you please give more details of the bug you have encountered? Is it seen with a
>> specific PMD?
>>
>> Thanks,
>> ferruh
>>
>>>
>>> Signed-off-by: Fang TongHao <fangtonghao at sangfor.com.cn>
>>> ---
>>>  lib/librte_ethdev/rte_ethdev_pci.h | 18 ++++++++++--------
>>>  1 file changed, 10 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/lib/librte_ethdev/rte_ethdev_pci.h b/lib/librte_ethdev/rte_ethdev_pci.h
>>> index ccdbb46ec..e7dae0545 100644
>>> --- a/lib/librte_ethdev/rte_ethdev_pci.h
>>> +++ b/lib/librte_ethdev/rte_ethdev_pci.h
>>> @@ -60,14 +60,16 @@ rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev,
>>>  
>>>  	eth_dev->intr_handle = &pci_dev->intr_handle;
>>>  
>>> -	eth_dev->data->dev_flags = 0;
>>> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
>>> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
>>> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
>>> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
>>> -
>>> -	eth_dev->data->kdrv = pci_dev->kdrv;
>>> -	eth_dev->data->numa_node = pci_dev->device.numa_node;
>>> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>>> +		eth_dev->data->dev_flags = 0;
>>> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
>>> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
>>> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
>>> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
>>> +
>>> +		eth_dev->data->kdrv = pci_dev->kdrv;
>>> +		eth_dev->data->numa_node = pci_dev->device.numa_node;
>>> +	}
>>>  }
>>>  
>>>  static inline int
>>>
>>
> 
> 
> 
> 


