[dpdk-dev] [dpdk-stable] [PATCH 2/2] app/testpmd: fix invalid port detaching

Ferruh Yigit ferruh.yigit at intel.com
Thu Jan 23 15:48:10 CET 2020


On 1/23/2020 2:05 PM, Matan Azrad wrote:
> Hi
> 
> From: Yigit, Ferruh
>> On 11/12/2019 8:47 AM, Matan Azrad wrote:
>>> The port was not validated before detaching.
>>>
>>> Ignore port detach operation when the port is not valid.
>>>
>>> Fixes: f8e5baa2662d ("app/testpmd: check not detaching device twice")
>>> Cc: thomas at monjalon.net
>>> Cc: stable at dpdk.org
>>>
>>> Signed-off-by: Matan Azrad <matan at mellanox.com>
>>> ---
>>>  app/test-pmd/testpmd.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>>> index 4444346..370eefe 100644
>>> --- a/app/test-pmd/testpmd.c
>>> +++ b/app/test-pmd/testpmd.c
>>> @@ -2545,6 +2545,9 @@ struct extmem_param {
>>>
>>>  	printf("Removing a device...\n");
>>>
>>> +	if (port_id_is_invalid(port_id, ENABLED_WARN))
>>> +		return;
>>> +
>>>  	dev = rte_eth_devices[port_id].device;
>>>  	if (dev == NULL) {
>>>  		printf("Device already removed\n");
>>>
>>
>> The patch is already in 19.11 [1] but it is breaking the testpmd hotplug
>> support.
>> Before 'detach_port_device()' is called, the port has been stopped and closed
>> [2], which makes the port fail the 'port_id_is_invalid()' check, so the device
>> removal path is never fully executed.
>> The implication is that, since the device is not detached, the vfio request
>> interrupt keeps triggering and restarts the detach path, but because of the
>> half-cleaned device it fails and the app gets stuck with a continuous log [3].
>>
>> I wonder if actual hotplug has been tested with this patch. The commit log
>> is not clear about the motivation and implications of the patch, and I am
>> not clear why this check was added, but I am sending a patch soon to remove
>> it.
> 
> The motivation of this patch was to prevent a double detach of the same port, so that the user cannot detach an invalid port.

What is the definition of an 'invalid port'? If you mean the case where the
device is already detached, the "if (dev == NULL)" check should stop the
second call of the function from going any further.
But according to 'port_id_is_invalid()', a closed port is an invalid port,
which I think is wrong in this context.
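
To make the point concrete, here is a toy model of the two checks. It is only
a sketch: every name in it (model_port, model_detach, ...) is a hypothetical
stand-in, not the real ethdev or testpmd code, and it assumes that closing a
port marks it unused while leaving its device pointer set, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model only: these types and helpers are hypothetical stand-ins,
 * not the real ethdev or testpmd definitions. */
enum { MODEL_MAX_PORTS = 4 };

struct model_device {
	const char *name;
	bool attached;
};

struct model_port {
	bool in_use;                 /* false once the port is closed */
	struct model_device *device; /* NULL once the device is removed */
};

static struct model_port model_ports[MODEL_MAX_PORTS];

/* Assumption taken from the discussion: a closed port counts as invalid. */
static bool model_port_is_invalid(unsigned int port_id)
{
	return port_id >= MODEL_MAX_PORTS || !model_ports[port_id].in_use;
}

static void model_close_port(unsigned int port_id)
{
	assert(port_id < MODEL_MAX_PORTS);
	model_ports[port_id].in_use = false;
}

/* Returns true only if the device was really removed by this call. */
static bool model_detach(unsigned int port_id, bool with_validity_check)
{
	struct model_device *dev;

	if (port_id >= MODEL_MAX_PORTS)
		return false;
	if (with_validity_check && model_port_is_invalid(port_id))
		return false; /* closed port: removal is skipped entirely */

	dev = model_ports[port_id].device;
	if (dev == NULL) {
		printf("Device already removed\n");
		return false; /* a second detach stops here */
	}
	dev->attached = false;
	model_ports[port_id].device = NULL;
	return true;
}
```

In this model, once a port has been closed, a detach with the validity check
never reaches the removal, while a detach without it removes the device once
and rejects the second attempt through the NULL check alone.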

> 
> I agree this patch is not good and we need a fix but I think the bug is conceptual.
> 
> Testpmd tries to detach by port_id, which is derived from the ethdev port ID, while detach works on an rte_device.
> 
> For example:
> you can see in the line above, after the '+' lines: dev = rte_eth_devices[port_id].device,
> Testpmd may access an invalid or reallocated ethdev structure to get the device name and may even detach an unwanted rte_device.

I think whichever function calls 'detach_port_device()' should check the port
validity.
'detach_port_device()' doesn't know whether the port was reallocated or not; it
will free the given port_id, and once freeing is done
'rte_eth_devices[port_id].device' will be NULL, so this looks like a valid
check to me.
The caller of 'detach_port_device()' should ensure the correct port_id is passed
to the function.
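
A rough sketch of that split of responsibilities, again with hypothetical
names (sk_port, sk_remove_port, ...) rather than the actual testpmd functions:
the caller validates the port once, while it is still open, and the detach
helper only guards against a double removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not testpmd's real code. */
enum { SK_MAX_PORTS = 4 };

struct sk_port {
	bool valid;   /* what a validity check would look at */
	bool started;
	void *device; /* non-NULL while the backing device is attached */
};

static struct sk_port sk_ports[SK_MAX_PORTS];

/* Callee: trusts its caller and only guards against a double detach. */
static int sk_detach_port_device(unsigned int port_id)
{
	assert(port_id < SK_MAX_PORTS); /* range is the caller's job */
	if (sk_ports[port_id].device == NULL)
		return -1; /* already removed */
	sk_ports[port_id].device = NULL;
	return 0;
}

/* Caller: validates once, before stop/close, then runs the
 * stop -> close -> detach sequence. */
static int sk_remove_port(unsigned int port_id)
{
	if (port_id >= SK_MAX_PORTS || !sk_ports[port_id].valid)
		return -1;
	sk_ports[port_id].started = false; /* stop */
	sk_ports[port_id].valid = false;   /* close */
	return sk_detach_port_device(port_id);
}
```

With this ordering the closed state of the port no longer blocks the removal,
and a repeated request is still rejected, either by the caller's validity
check or by the NULL device check.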

> 
> So, detach is broken with and without this patch.

I can't see how it is broken without the check. How can the problem you
mentioned be reproduced? Or is it a theoretical issue?
With this check, however, the hotplug breakage is 100% reproducible.

> 
> 
> I think Testpmd should change the concept of rte_device mapping and pay attention to the following:
> 1. Don't detach by ethdev port ID.
> 2. Multiple ethdev port IDs may be related to the same rte_device.
> 
> The Testpmd user should be sure that all the port IDs of the rte_device are released before the detach call, and Testpmd may need to validate it.
> And like attach, detach should be triggered by PCI address / rte_device name.
> 

We need to know the port_id too, to be able to stop/close it.
And sure, no objection to improving the hotplug support, but it is broken now;
let's fix it first.
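
For the longer-term idea, a detach keyed on the device name could look
roughly like the sketch below. It is purely illustrative: the dm_* names are
made up, and it assumes a simple table in which several ports may point at
the same device. It still walks the port IDs, since each port has to be
closed before the device goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model; names do not correspond to real DPDK symbols. */
enum { DM_MAX_PORTS = 4 };

struct dm_device {
	const char *name;
	bool attached;
};

struct dm_port {
	bool open;
	struct dm_device *device;
};

static struct dm_port dm_ports[DM_MAX_PORTS];

/* Detach by device name: release every port that still maps to the
 * device, then remove the device once. Returns the number of ports
 * released, or -1 if no port maps to the given name. */
static int dm_detach_by_name(const char *name)
{
	struct dm_device *dev = NULL;
	int released = 0;
	unsigned int i;

	assert(name != NULL);
	for (i = 0; i < DM_MAX_PORTS; i++) {
		struct dm_port *p = &dm_ports[i];

		if (p->device == NULL || strcmp(p->device->name, name) != 0)
			continue;
		dev = p->device;
		p->open = false; /* stop/close still needs the port id */
		p->device = NULL;
		released++;
	}
	if (dev == NULL)
		return -1;
	dev->attached = false; /* single removal for all of its ports */
	return released;
}
```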

>> Regards,
>> ferruh
>>
>>
>> [1]
>> https://git.dpdk.org/dpdk/commit/?id=43d0e304980a1527bcac92dc679057b189e2545a
>>
>> [2]
>> rmv_port_callback
>>   stop_port(port_id);
>>   close_port(port_id);
>>   detach_port_device(port_id);
>>
>> [3]
>> EAL: can not get port by device 0000:00:05.0!
>> EAL: can not get port by device 0000:00:05.0!
>> EAL: can not get port by device 0000:00:05.0!
>> EAL: can not get port by device 0000:00:05.0!
>> EAL: can not get port by device 0000:00:05.0!
>> EAL: can not get port by device 0000:00:05.0!
>> ...


