[dpdk-dev] [Bug 382] rte_eth: rx/tx callbacks invoked without lock protection

bugzilla at dpdk.org bugzilla at dpdk.org
Thu Jan 9 08:52:34 CET 2020


https://bugs.dpdk.org/show_bug.cgi?id=382

            Bug ID: 382
           Summary: rte_eth: rx/tx callbacks invoked without lock
                    protection
           Product: DPDK
           Version: 18.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev at dpdk.org
          Reporter: zhongdahulinfan at 163.com
  Target Milestone: ---

Hi, all  
    I launch my DPDK app, and then use dpdk-pdump to capture wire packets. When
I stop dpdk-pdump with ctrl+c, my DPDK app crash. Here is the coredump
backtrace:  

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/dpvs -- -l 0,1,2,3,4,5,6,7,8 -w 0000:04:00.0
-w 0000:04:00.1 --legacy'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00005555558d24e3 in rte_eth_rx_burst (port_id=1, queue_id=2,
rx_pkts=0x55555721ed10 <lcore_conf+2386256>, nb_pkts=32)
    at
/builds/lbc/kigw/kigw_scripts/dpdk-stable-18.11.2/build/include/rte_ethdev.h:3888
3888   
/builds/lbc/kigw/kigw_scripts/dpdk-stable-18.11.2/build/include/rte_ethdev.h:
No such file or directory.
[Current thread is 1 (Thread 0x7ffff3d36700 (LWP 29228))]
(gdb) bt
#0  0x00005555558d24e3 in rte_eth_rx_burst (port_id=1, queue_id=2,
rx_pkts=0x55555721ed10 <lcore_conf+2386256>, nb_pkts=32)
    at
/builds/lbc/kigw/kigw_scripts/dpdk-stable-18.11.2/build/include/rte_ethdev.h:3888
#1  0x00005555558d8743 in netif_rx_burst (pid=1, qconf=0x55555721ed00
<lcore_conf+2386240>) at /builds/lbc/kigw/src/netif.c:1618
#2  0x00005555558dc331 in lcore_job_recv_fwd (arg=0x0) at
/builds/lbc/kigw/src/netif.c:2424
#3  0x00005555558e16de in do_lcore_job (job=0x555555f6f980 <netif_jobs>) at
/builds/lbc/kigw/src/netif.c:4265
#4  0x00005555558e184d in netif_loop (dummy=0x0) at
/builds/lbc/kigw/src/netif.c:4307
#5  0x00005555556cfb9d in eal_thread_loop ()
#6  0x00007ffff67744a4 in start_thread (arg=0x7ffff3d36700) at
pthread_create.c:456
#7  0x00007ffff62abd0f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:97
```

I followed the source code in rte_ethdev.h where the segmentation fault
occurred in line 3888:  

```
static inline uint16_t
rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
                 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
{
        struct rte_eth_dev *dev = &rte_eth_devices[port_id];
        uint16_t nb_rx;

#ifdef RTE_LIBRTE_ETHDEV_DEBUG
        RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
        RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);

        if (queue_id >= dev->data->nb_rx_queues) {
                RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
                return 0;
        }
#endif
        nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
                                     rx_pkts, nb_pkts);

#ifdef RTE_ETHDEV_RXTX_CALLBACKS
        if (unlikely(dev->post_rx_burst_cbs[queue_id] != NULL)) {
                struct rte_eth_rxtx_callback *cb =
                                dev->post_rx_burst_cbs[queue_id];

                do {
                        nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
                                                nb_pkts, cb->param);
                        cb = cb->next;
                } while (cb != NULL);
        }
#endif

        return nb_rx;
}
```

Found that callback list doesn't protected by any lock. These may have
concurrent issues when another process like dpdk-pdump unregister tx/rx
callbacks before it exits. The registration/unregistration of tx/rx callbacks
hold a spinlock. I wonder why there is no lock in invocation of callback list.  
```
int
rte_eth_remove_rx_callback(uint16_t port_id, uint16_t queue_id,
                const struct rte_eth_rxtx_callback *user_cb)
{
#ifndef RTE_ETHDEV_RXTX_CALLBACKS
        return -ENOTSUP;
#endif
        /* Check input parameters. */
        RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
        if (user_cb == NULL ||
                        queue_id >=
rte_eth_devices[port_id].data->nb_rx_queues)
                return -EINVAL;

        struct rte_eth_dev *dev = &rte_eth_devices[port_id];
        struct rte_eth_rxtx_callback *cb;
        struct rte_eth_rxtx_callback **prev_cb;
        int ret = -EINVAL;

        rte_spinlock_lock(&rte_eth_rx_cb_lock);
        prev_cb = &dev->post_rx_burst_cbs[queue_id];
        for (; *prev_cb != NULL; prev_cb = &cb->next) {
                cb = *prev_cb;
                if (cb == user_cb) {
                        /* Remove the user cb from the callback list. */
                        *prev_cb = cb->next;
                        ret = 0;
                        break;
                }
        }
        rte_spinlock_unlock(&rte_eth_rx_cb_lock);

        return ret;
} 
```

Please check and conform if this is a bug.

Thanks,
Linfan Hu

-- 
You are receiving this mail because:
You are the assignee for the bug.


More information about the dev mailing list