[dpdk-users] Possible race in bonding driver?

Miao Yan yanmiaobest at gmail.com
Mon Dec 17 10:12:01 CET 2018


Hello Experts,

I am currently working on a project to implement a new teaming policy
based on the bonding driver. Basically, I have many 'ports', each of
which has an ID. I'd like to hash on that ID to distribute traffic
across the slave ports, similar to round-robin, except that the sender
ID is fixed.
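
For concreteness, the selection I have in mind is roughly the sketch
below (the function name, the hash constant and the sender_id parameter
are just my own placeholders, not existing bonding driver code):

#include <stdint.h>

/* Pick a slave for a given sender by hashing its fixed ID onto the
 * current snapshot of active slaves, so that all traffic from one
 * sender sticks to one slave instead of rotating per packet.
 * Assumes active_slave_count > 0. */
static uint16_t
select_slave_by_sender_id(uint32_t sender_id,
		const uint16_t *active_slaves, uint16_t active_slave_count)
{
	/* Knuth-style multiplicative hash; any stable hash would do. */
	uint32_t hash = sender_id * 2654435761u;

	return active_slaves[hash % active_slave_count];
}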

While looking into the bonding driver code, I noticed the following,
which looks racy:

static uint16_t
bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs,
		uint16_t nb_pkts)
{
	...

	/* Copy slave list to protect against slave up/down changes during tx
	 * bursting */
	num_of_slaves = internals->active_slave_count;
	memcpy(slaves, internals->active_slaves,
			sizeof(internals->active_slaves[0]) * num_of_slaves);
	[...]
}


void
deactivate_slave(struct rte_eth_dev *eth_dev, uint16_t port_id)
{
	[...]
	/* If slave was not at the end of the list
	 * shift active slaves up active array list */
	if (slave_pos < active_count) {
		active_count--;
		memmove(internals->active_slaves + slave_pos,
				internals->active_slaves + slave_pos + 1,
				(active_count - slave_pos) *
					sizeof(internals->active_slaves[0]));
	}
	[...]
}

What if one core is bursting packets to the bonded device while, at the
same time, a slave goes down and another core handles the deactivation
(or simply another core calls rte_eth_bond_slave_remove())? It seems
possible for the TX core to see a partially updated active_slaves array
while the memmove is in progress.
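
To make the interleaving I am worried about concrete, here is the kind
of schedule I have in mind (my own illustration, not a captured trace;
assume three active slaves):

/*
 * TX core (bond_ethdev_tx_burst_round_robin)   Other core (deactivate_slave)
 * -------------------------------------------  -----------------------------
 * num_of_slaves = active_slave_count;  (== 3)
 *                                               memmove() starts shifting
 *                                               active_slaves[] left by one
 * memcpy() copies 3 entries while the shift
 * is still in flight
 *   -> slaves[] can end up with a duplicated
 *      entry, and may still contain the slave
 *      that is being deactivated
 */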

Is this a problem, or am I missing something? Thank you for the help.

Regards,
Miao
