[PATCH] app/testpmd: remove invalid ports when other process detach

Ferruh Yigit ferruh.yigit at amd.com
Fri May 20 17:05:04 CEST 2022


On 3/2/2022 8:36 AM, fengchengwen wrote:
> On 2022/3/2 16:26, Thomas Monjalon wrote:
>> 02/03/2022 03:33, Chengwen Feng:
>>> Start main and secondary process:
>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>> 	--num-procs=2 --proc-id=0
>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>> 	--num-procs=2 --proc-id=1
>>> Execute following command in main process:
>>> 	port stop 0
>>> 	port detach 0
>>> Execute following command in secondary process:
>>> 	set fwd mac
>>> 	start
>>> The secondary process will display:
>>> 	Invalid port_id=0
>>> 	telcore 19 called rx_pkt_burst for not ready port 0
>>> 	stpmd> 8: [/lib64/libc.so.6(+0xdf600) [0xffff9e1dc600]]
>>> 	7: [/lib64/libpthread.so.0(+0x7c48) [0xffff9e28ac48]]
>>> 	6: [/usr/app/testpmd(eal_thread_loop+0x2c4) [0xb23574]]
>>> 	5: [/usr/app/testpmd() [0x9c21d8]]
>>> 	4: [/usr/app/testpmd() [0x9c2108]]
>>> 	3: [/usr/app/testpmd() [0x9b6cf0]]
>>> 	2: [/usr/app/testpmd() [0xad8620]]
>>> 	1: [/usr/app/testpmd(rte_dump_stack+0x20) [0xb1a130]]
>>>
>>> The root cause it that the secondary process has not removed invalid
>>> ports when it processes RTE_ETH_EVENT_DESTROY event.
>>
>> Why the ports are not removed?
> 

It is referring to application (testpmd) level state, ethdev port is 
removed.

Above mentioned problem is valid since testpmd secondary process support 
is added.
When primary hot remove a device, it updates relevant application state 
too, but for secondary process ethdev removes device without secondary 
process updating its state.

I agree that using ethdev events is appropriate for this case, since 
action originated from ethdev for secondary process, need a way to 
notify application about it.
Another option can be checking and removing invalid ports in secondary 
before each forwarding start, but since we already have an event for 
destroy, using it looks good to me.


> Testpmd register function eth_event_callback to deal with DESTROY event,
> currently it only assign ports[port_id].port_status with RTE_PORT_CLOSED, it
> doesn't update other global variables like nb_ports.
> 
>>
>>> This patch adds a delay remove invalid ports invoke when process the
>>> RTE_ETH_EVENT_DESTROY event.
>>
>> Why do we need this delay?
> 
> The remove_invalid_ports will scan rte_eth_devices[], when process the DESTROY
> event, the rte_eth_devices[x] still valid, so here we should a delay logic.
> 

There is a dependency problem in the DESTROY event.

We need some ethdev information to be able to deliver the event, so 
ethdev can't be completely destroyed until DESTROY event is processed. 
Which means when application received DESTROY event, ethdev is not fully 
destroyed yet.

Except from above, 'remove_invalid_ports()' is called twice for primary 
process (as mentioned in commit log), this timer can be helping these 
two calls not conflict, but I am not sure if we can rely on a time for this.
Since the problem is mainly for secondary process, assuming control 
commands like detaching a device will be called by primary process, so 
@Feng what do you think about adding a secondary process check to in the 
'remove_invalid_ports_callback()' call? So 'remove_invalid_ports()' is 
called only once for primary process.

>>
>> [...]
>>> +static void
>>> +remove_invalid_ports_callback(void *arg)
>>> +{
>>> +	RTE_SET_USED(arg);
>>> +	remove_invalid_ports();
>>> +}
>> [...]
>>>   	case RTE_ETH_EVENT_DESTROY:
>>>   		ports[port_id].port_status = RTE_PORT_CLOSED;
>>>   		printf("Port %u is closed\n", port_id);
>>> +		if (rte_eal_alarm_set(100000, remove_invalid_ports_callback,
>>> +				(void *)(intptr_t)port_id))
>>> +			fprintf(stderr,
>>> +				"Could not set up deferred device released\n");
>>>   		break;
>>
>>
>>
>>
>> .
>>
> 



More information about the dev mailing list