[PATCH V2 1/4] net/bonding: fix non-active slaves aren't stopped

Min Hu (Connor) humin29 at huawei.com
Tue May 3 08:54:59 CEST 2022


Hi, Ferruh,

On 2022/4/29 21:31, Ferruh Yigit wrote:
> On 4/29/2022 7:45 AM, Min Hu (Connor) wrote:
>> Hi, Ferruh,
>>
>> On 2022/4/27 2:19, Ferruh Yigit wrote:
>>> On 3/24/2022 3:00 AM, Min Hu (Connor) wrote:
>>>> From: Huisong Li <lihuisong at huawei.com>
>>>>
>>>> When stopping a bonded port, all slaves should be deactivated. But only
>>>
>>> s/deactivated/stopped/ ?
>> Not agreed. 'deactivated' and 'stopped' are different states for a slave.
>>
> 
> My comment was only meant to clarify the sentences; otherwise I agree
> that the 'stopped' and 'deactivated' states are different.
> The next sentence complains that not all ports are stopped: "But only
> active slaves are stopped.", so I thought the intention of this sentence
> was to claim that all slaves should be stopped (but it says all slaves
> should be 'deactivated').
> As long as you address the disconnect between the two sentences, I don't
> mind the wording.
Actually, there is something wrong with the wording.
Yes, I should take your advice.

> 
>>>
>>>> active slaves are stopped. So fix it and do "deactivae_slave()" for 
>>>> active
>>>
>>> s/deactivae_slave()/deactivate_slave()/
>>>
>> agreed.
>>
>>>> slaves.
>>>
>>> Hi Connor,
>>>
>>> When a bonding port is closed, is it clear if all slave ports or 
>>> active slave ports should be stopped?
>> Yes, I think all the slave ports should be stopped (or at least we
>> should try to stop them).
>>>
>>>>
>>>> Fixes: 0911d4ec0183 ("net/bonding: fix crash when stopping mode 4 
>>>> port")
>>>> Cc: stable at dpdk.org
>>>>
>>>> Signed-off-by: Huisong Li <lihuisong at huawei.com>
>>>> Signed-off-by: Min Hu (Connor) <humin29 at huawei.com>
>>>> ---
>>>>   drivers/net/bonding/rte_eth_bond_pmd.c | 20 +++++++++++---------
>>>>   1 file changed, 11 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
>>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> index b305b6a35b..469dc71170 100644
>>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> @@ -2118,18 +2118,20 @@ bond_ethdev_stop(struct rte_eth_dev *eth_dev)
>>>>       internals->link_status_polling_enabled = 0;
>>>>       for (i = 0; i < internals->slave_count; i++) {
>>>>           uint16_t slave_id = internals->slaves[i].port_id;
>>>> +
>>>> +        internals->slaves[i].last_link_status = 0;
>>>> +        ret = rte_eth_dev_stop(slave_id);
>>>> +        if (ret != 0) {
>>>> +            RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
>>>> +                     slave_id);
>>>> +            return ret;
>>>
>>> Should it return here or try to stop all ports?
>>> What about recording the return status but continuing to stop all
>>> ports, and returning an error if any of the stops failed?
Well, I am glad you have found something unreasonable about 'stop'.
Let us look at the API 'rte_eth_dev_stop':

rte_eth_dev_stop(dev)
{
	....
	dev->data->dev_started = 0;
	ret = (*dev->dev_ops->dev_stop)(dev);
	return ret;
}
This is unreasonable. Whether 'dev_ops->dev_stop' succeeds or fails,
the state 'dev_started' is always set to '0'.

This affects not only the bonding device but also other devices, such as
eth dev or vdev.
It is a bug at the rte ethdev level. I will send a separate patch to fix
it.
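
To illustrate the ordering problem, here is a minimal, self-contained
sketch. The type and function names (mock_dev, mock_dev_stop_current,
mock_dev_stop_restoring, failing_stop) are hypothetical stand-ins, not the
real ethdev structures, and restoring the flag on failure is only one
possible approach, not necessarily what the follow-up patch will do.

```c
/* Hypothetical stand-in for an ethdev; NOT the real struct rte_eth_dev. */
struct mock_dev {
	int dev_started;                       /* 1 = started, 0 = stopped */
	int (*dev_stop)(struct mock_dev *dev); /* driver stop callback */
};

/* Mirrors the ordering described above: the flag is cleared first, so it
 * stays 0 even when the driver callback reports a failure. */
static int
mock_dev_stop_current(struct mock_dev *dev)
{
	dev->dev_started = 0;
	return dev->dev_stop(dev);
}

/* One possible alternative (an assumption): put the flag back when the
 * driver callback fails, so state and return value agree. */
static int
mock_dev_stop_restoring(struct mock_dev *dev)
{
	int ret;

	dev->dev_started = 0;
	ret = dev->dev_stop(dev);
	if (ret != 0)
		dev->dev_started = 1;
	return ret;
}

/* Driver callback that always fails, used to exercise the error path. */
static int
failing_stop(struct mock_dev *dev)
{
	(void)dev;
	return -5;
}
```

With failing_stop as the callback, the first variant returns an error yet
leaves dev_started at 0, while the second returns the same error and leaves
the device marked as started.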


>> I think there is no need to do this. The application only sees the
>> bonded device. If stopping the bonded device fails, the operation as a
>> whole has failed, and the number of slaves that were stopped
>> successfully is not meaningful.
>>
> 
> OK, if trying to stop as many 'slave' devices as possible doesn't make
> sense, we can keep it as it is.
> 
> Btw, when the function fails at this point, the bonding device itself is
> already marked as stopped, right? And some of the slave devices may
> already have been stopped before the failure.
> I don't know how confusing it is for the user that the stop() function
> failed but the bonding device state is 'stopped'. I don't know if the
> function should at least restore the bonding device status (back to
> started) on failure; what do you think?
> 
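
For reference, the "record the return status but keep stopping" alternative
raised earlier in this thread could look roughly like the sketch below. It
is not what this patch does (the patch returns on the first failure), and
mock_slave, mock_slave_stop and mock_bond_stop_all are hypothetical names
used only for illustration, not bonding driver code.

```c
#include <stddef.h>

/* Hypothetical slave record; NOT the real bonding driver structure. */
struct mock_slave {
	int started;   /* 1 = started, 0 = stopped */
	int fail_stop; /* when non-zero, stopping this slave fails */
};

/* Hypothetical stand-in for calling rte_eth_dev_stop() on one slave. */
static int
mock_slave_stop(struct mock_slave *slave)
{
	if (slave->fail_stop)
		return -1;
	slave->started = 0;
	return 0;
}

/* Try to stop every slave; remember the first error and return it only
 * after the whole list has been visited. */
static int
mock_bond_stop_all(struct mock_slave *slaves, size_t count)
{
	int first_err = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		int ret = mock_slave_stop(&slaves[i]);

		if (ret != 0 && first_err == 0)
			first_err = ret;
	}
	return first_err;
}
```

The trade-off is the one discussed above: the caller still gets an error,
but as many slaves as possible end up stopped.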
>>>
>>>> +        }
>>>> +
>>>> +        /* active slaves need to deactivate. */
>>>
>>> " active slaves need to be deactivated. " ?
>> agreed.
>>>
>>>>           if (find_slave_by_id(internals->active_slaves,
>>>>                   internals->active_slave_count, slave_id) !=
>>>> -                        internals->active_slave_count) {
>>>> -            internals->slaves[i].last_link_status = 0;
>>>> -            ret = rte_eth_dev_stop(slave_id);
>>>> -            if (ret != 0) {
>>>> -                RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
>>>> -                         slave_id);
>>>> -                return ret;
>>>> -            }
>>>> +                internals->active_slave_count)
>>>
>>> I think original indentation for this line is better.
>>>
>> agreed.
>>>>               deactivate_slave(eth_dev, slave_id);
>>>> -        }
>>>>       }
>>>>       return 0;
>>>
>>> .
> 
> .

