[PATCH V2 1/4] net/bonding: fix non-active slaves aren't stopped
Min Hu (Connor)
humin29 at huawei.com
Tue May 3 08:54:59 CEST 2022
Hi, Ferruh,
On 2022/4/29 21:31, Ferruh Yigit wrote:
> On 4/29/2022 7:45 AM, Min Hu (Connor) wrote:
>> Hi, Ferruh,
>>
>> On 2022/4/27 2:19, Ferruh Yigit wrote:
>>> On 3/24/2022 3:00 AM, Min Hu (Connor) wrote:
>>>> From: Huisong Li <lihuisong at huawei.com>
>>>>
>>>> When stopping a bonded port, all slaves should be deactivated. But only
>>>
>>> s/deactivated/stopped/ ?
>> Not agreed. Deactivated and stopped are different states for a slave.
>>
>
> Just to clarify the sentences, otherwise I see the 'stopped' and
> 'deactivated' states are different.
> Next sentences complains that not all ports are stopped: "But only
> active slaves are stopped.", so I thought intention in this sentences to
> claim that all slaves should be stopped (but it mentions all slaves
> should be 'deactivated').
> As long as you address the disconnection between two sentences, I don't
> mind the wording.
Actually, there is something wrong with the wording.
Yes, I will take your advice.
>
>>>
>>>> active slaves are stopped. So fix it and do "deactivae_slave()" for
>>>> active
>>>
>>> s/deactivae_slave()/deactivate_slave()/
>>>
>> agreed.
>>
>>>> slaves.
>>>
>>> Hi Connor,
>>>
>>> When a bonding port is closed, is it clear if all slave ports or
>>> active slave ports should be stopped?
>> Yes, I think all the slave ports should be stopped(or try to be stopped).
>>>
>>>>
>>>> Fixes: 0911d4ec0183 ("net/bonding: fix crash when stopping mode 4
>>>> port")
>>>> Cc: stable at dpdk.org
>>>>
>>>> Signed-off-by: Huisong Li <lihuisong at huawei.com>
>>>> Signed-off-by: Min Hu (Connor) <humin29 at huawei.com>
>>>> ---
>>>> drivers/net/bonding/rte_eth_bond_pmd.c | 20 +++++++++++---------
>>>> 1 file changed, 11 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> index b305b6a35b..469dc71170 100644
>>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> @@ -2118,18 +2118,20 @@ bond_ethdev_stop(struct rte_eth_dev *eth_dev)
>>>> internals->link_status_polling_enabled = 0;
>>>> for (i = 0; i < internals->slave_count; i++) {
>>>> uint16_t slave_id = internals->slaves[i].port_id;
>>>> +
>>>> + internals->slaves[i].last_link_status = 0;
>>>> + ret = rte_eth_dev_stop(slave_id);
>>>> + if (ret != 0) {
>>>> + RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
>>>> + slave_id);
>>>> + return ret;
>>>
>>> Should it return here or try to stop all ports?
>>> What about to record the return status, but keep continue to stop all
>>> ports. And return error if any of the stop failed?
Well, I am glad you have found something unreasonable about 'stop'.
Let us look at the API 'rte_eth_dev_stop':
rte_eth_dev_stop(dev)
{
....
dev->data->dev_started = 0;
ret = (*dev->dev_ops->dev_stop)(dev);
return ret;
}
This is unreasonable. No matter whether 'dev_ops->dev_stop' succeeds or
fails, the state 'dev_started' is always set to '0'.
This does not only affect the bonding device but also other devices such
as eth dev or vdev.
This is a bug at the rte ethdev level. I will send another patch to fix
it.
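To make the point concrete, here is a minimal standalone sketch of the
behaviour described above, next to one possible alternative. It is not the
real ethdev code: 'struct mock_dev', 'stop_ok', 'stop_fail',
'mock_stop_current' and 'mock_stop_checked' are hypothetical names used only
for illustration, and the actual fix may look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ethdev; NOT the real DPDK struct. */
struct mock_dev;
typedef int (*mock_dev_stop_t)(struct mock_dev *dev);

struct mock_dev {
	int dev_started;          /* 1 = started, 0 = stopped */
	mock_dev_stop_t dev_stop; /* driver callback, may fail */
};

/* Hypothetical driver callbacks: one always succeeds, one always fails. */
static int stop_ok(struct mock_dev *dev) { (void)dev; return 0; }
static int stop_fail(struct mock_dev *dev) { (void)dev; return -1; }

/* Behaviour described above: the flag is cleared before the driver
 * callback runs, so it ends up 0 even when the callback fails. */
static int mock_stop_current(struct mock_dev *dev)
{
	assert(dev != NULL && dev->dev_stop != NULL);
	dev->dev_started = 0;
	return dev->dev_stop(dev);
}

/* One possible alternative (an assumption, not the actual fix):
 * clear the flag only after the driver callback reports success. */
static int mock_stop_checked(struct mock_dev *dev)
{
	int ret;

	assert(dev != NULL && dev->dev_stop != NULL);
	ret = dev->dev_stop(dev);
	if (ret == 0)
		dev->dev_started = 0;
	return ret;
}
```

With 'dev_stop' set to 'stop_fail', 'mock_stop_current()' returns an error
but leaves 'dev_started' at 0, while 'mock_stop_checked()' returns the same
error and keeps 'dev_started' at 1.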
>> I think there is no need to do this. The APP only sees the bonded
>> device. If stopping the bonded device fails, the stop as a whole has
>> failed, and the number of slaves "stopped" successfully does not matter.
>>
>
> OK if trying to stop as much as possible 'slave' devices doesn't make
> sense, we can keep as it is.
>
> Btw, when functions fails at this point, bonding device itself already
> marked as stopped, right? And some of the slave devices may be stopped
> already before failure.
> I don't know how confusing this is for the user, that stop() function is
> failed but bonding device state is 'stopped'. I don't know if function
> should recover at least bonding device status (back to started) on
> failure, what do you think?
>
>>>
>>>> + }
>>>> +
>>>> + /* active slaves need to deactivate. */
>>>
>>> " active slaves need to be deactivated. " ?
>> agreed.
>>>
>>>> if (find_slave_by_id(internals->active_slaves,
>>>> internals->active_slave_count, slave_id) !=
>>>> - internals->active_slave_count) {
>>>> - internals->slaves[i].last_link_status = 0;
>>>> - ret = rte_eth_dev_stop(slave_id);
>>>> - if (ret != 0) {
>>>> - RTE_BOND_LOG(ERR, "Failed to stop device on port %u",
>>>> - slave_id);
>>>> - return ret;
>>>> - }
>>>> + internals->active_slave_count)
>>>
>>> I think original indentation for this line is better.
>>>
>> agreed.
>>>> deactivate_slave(eth_dev, slave_id);
>>>> - }
>>>> }
>>>> return 0;
>>>
>>> .
>
> .