[BUG] [bonding] bonding member delete bug
Simon Jones
batmanustc at gmail.com
Mon Dec 18 03:51:36 CET 2023
Hi all,
I'm using DPDK-21.11 in ovs-dpdk.
I found a bug when deleting a bonding member.
1. How to reproduce
```
NOTE: bondctl is a tool I developed to control DPDK.
### step 1, Add bonding device bond0.
bondctl add bond0 mode active-backup
### step 2, Add member m1 into bond0.
bondctl set 0000:00:0a.0 master bond0
### step 3, Add bond0 into ovs bridge.
ovs-vsctl add-port brp0 bond0 -- set interface bond0 type=dpdk options:dpdk-devargs=net_bonding-bond0
(this command eventually calls @bond_ethdev_start.)
### step 4, Delete bond0 from ovs bridge.
ovs-vsctl del-port br-phy bond0
(this command eventually calls @bond_ethdev_stop.)
### step 5, Delete m1 from bond0.
bondctl set 0000:00:0a.0 nomaster
### step 6, Delete bond0.
bondctl del bond0
### step 7, Add bond0.
bondctl add bond0 mode active-backup
### step 8, Add member m1 into bond0.
bondctl set 0000:00:0a.0 master bond0
(this command eventually calls @bond_ethdev_start.)
### Then got error message.
2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to allow configuration
2023-12-15T08:24:04.153Z|00018|dpdk|ERR|bond_cmd_set_master(581) - can not config slave 0000:00:0a.0!
```
2. Debug
The cause is as follows. When the member port is DOWN, the add operation
still sets "eth_dev->data->dev_started = 1;" and starts the member, but the
member is never added as an active member port. So when bond0 is deleted,
@rte_eth_dev_stop is NOT called for that member, and adding it to bond0
again fails. In detail:
```
### After steps 1-3: bond0 is added to ovs-dpdk.
bond_ethdev_start
    eth_dev->data->dev_started = 1;
    for (i = 0; i < internals->slave_count; i++) {
        if (slave_configure(eth_dev, slave_ethdev) != 0) {
        if (slave_start(eth_dev, slave_ethdev) != 0) {
            rte_eth_dev_start
### NOTE: the member port is DOWN, so @activate_slave is NOT called
### and @active_slave_count stays 0.
bond_ethdev_lsc_event_callback
    activate_slave(bonded_eth_dev, port_id);
### After step 4: bond0 is deleted from ovs-dpdk. NOTE: @active_slave_count
### is 0, so @rte_eth_dev_stop is NOT called.
bond_ethdev_stop
    for (i = 0; i < internals->slave_count; i++) {
        if (find_slave_by_id(internals->active_slaves,
                internals->active_slave_count, slave_id) !=
                internals->active_slave_count) {
            ret = rte_eth_dev_stop(slave_id);
### After steps 5-7: bond0 is deleted and then added again.
### After step 8: m1 is added to bond0 again. Because @rte_eth_dev_stop was
### never called, calling @rte_eth_dev_start again fails.
2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to allow configuration
```
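The sequence above can be pictured with a small standalone C model. This is
only a sketch under assumptions: the toy_* types and functions are
hypothetical and are NOT DPDK APIs. They only mimic the dev_started flag, the
active member list, and the "must be stopped" check described in the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_MEMBERS 4

/* Hypothetical stand-in for a member port (not a DPDK type). */
struct toy_port {
    bool dev_started; /* mimics eth_dev->data->dev_started */
    bool link_up;     /* link state of the member port */
};

/* Hypothetical stand-in for the bonding device (not a DPDK type). */
struct toy_bond {
    struct toy_port *members[TOY_MAX_MEMBERS];
    bool active[TOY_MAX_MEMBERS]; /* true once the member was activated */
    size_t member_count;
};

/* Mimics the "Port must be stopped to allow configuration" check. */
int toy_port_configure(const struct toy_port *p)
{
    return p->dev_started ? -1 : 0;
}

/* Adding a member configures it first, so a started port is rejected. */
int toy_bond_add_member(struct toy_bond *b, struct toy_port *p)
{
    assert(b->member_count < TOY_MAX_MEMBERS);
    if (toy_port_configure(p) != 0)
        return -1;
    b->members[b->member_count] = p;
    b->active[b->member_count] = false;
    b->member_count++;
    return 0;
}

/* Every member is started, but only link-up members become active. */
void toy_bond_start(struct toy_bond *b)
{
    for (size_t i = 0; i < b->member_count; i++) {
        b->members[i]->dev_started = true;
        b->active[i] = b->members[i]->link_up;
    }
}

/* Behaviour described in the report: only active members are stopped. */
void toy_bond_stop(struct toy_bond *b)
{
    for (size_t i = 0; i < b->member_count; i++) {
        if (b->active[i]) {
            b->members[i]->dev_started = false;
            b->active[i] = false;
        }
    }
}

/* Replays steps 1-8; returns the result of re-adding m1 in step 8. */
int toy_reproduce(bool link_up)
{
    struct toy_port m1 = { .dev_started = false, .link_up = link_up };
    struct toy_bond bond0 = { .member_count = 0 };

    if (toy_bond_add_member(&bond0, &m1) != 0)
        return -2; /* unexpected: the first add should succeed */
    toy_bond_start(&bond0);
    toy_bond_stop(&bond0);

    struct toy_bond bond0_again = { .member_count = 0 };
    return toy_bond_add_member(&bond0_again, &m1);
}
```

In this model, toy_reproduce(false) returns -1 (member link DOWN, so the
member is left started and the re-add is rejected), while toy_reproduce(true)
returns 0.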
3. My question
Has this bug already been fixed? If so, in which commit?
If not, how should it be fixed? I think it would be better to call
@rte_eth_dev_stop for every member, even if it is DOWN. What do you think?
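As a rough illustration of that idea, here is a standalone C sketch. It is
not a patch against DPDK: struct sketch_member and
sketch_stop_all_members() are hypothetical names, and the code only models
"stop every member regardless of whether it is active".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical member record (not a DPDK type). */
struct sketch_member {
    bool dev_started; /* mimics eth_dev->data->dev_started */
    bool active;      /* true if the member is in the active list */
};

/*
 * Proposed behaviour: stop every started member, active or not, and
 * clear the active flag. Returns how many members were actually stopped.
 */
size_t sketch_stop_all_members(struct sketch_member *members, size_t count)
{
    size_t stopped = 0;

    assert(members != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (members[i].dev_started) {
            members[i].dev_started = false;
            stopped++;
        }
        members[i].active = false;
    }
    return stopped;
}
```

With one started-but-inactive member and one started-and-active member, the
function stops both, so a later configure of either member would no longer
see a started port.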
Thanks~
----
Simon Jones