Initializing and starting port on primary but transmitting on secondary I get port not ready

Stephen Hemminger stephen at networkplumber.org
Thu Sep 1 17:21:59 CEST 2022


On Thu, 1 Sep 2022 09:33:54 +0200
Anna Tauzzi <admin at argonnetech.net> wrote:

> I'm using the Mellanox Connect X5:
> 
> pci@0000:3b:00.0  enp59s0f0np0   network        MT27800 Family [ConnectX-5]
> pci@0000:3b:00.1  enp59s0f1np1   network        MT27800 Family [ConnectX-5]
> pci@0000:3b:00.2  enp59s0f0v0    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:00.3  enp59s0f0v1    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:00.4  enp59s0f0v2    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:00.5  enp59s0f0v3    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.2  enp59s0f1v0    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.3  enp59s0f1v1    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.4  enp59s0f1v2    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.5  enp59s0f1v3    network        MT27800 Family [ConnectX-5 Virtual Function]
> 
> This is the message:
> lcore 6 called tx_pkt_burst for not ready port 0
> 8: [/lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7ffff7c77a00]]
> 7: [/lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7ffff7be5b43]]
> 6: [/usr/local/lib/librte_eal.so.22(+0x1559a) [0x7ffff7d8e59a]]
> 5: [build/simple_eth_tx_mp(+0x1a0c7) [0x55555556e0c7]]
> 4: [build/simple_eth_tx_mp(+0x19f89) [0x55555556df89]]
> 3: [build/simple_eth_tx_mp(+0x423c) [0x55555555823c]]
> 2: [/usr/local/lib/librte_ethdev.so.22(+0x7cbc) [0x7ffff7eb3cbc]]
> 1: [/usr/local/lib/librte_eal.so.22(rte_dump_stack+0x32) [0x7ffff7daf152]]
> 
> I'm having all sorts of problems with this Mellanox stuff; Intel cards are
> much more user-friendly.
> 
> Just to recap:
> * configure on primary and transmit on primary           ---> GOOD
> 
> * configure on secondary and transmit on secondary  ---> SIGSEGV
> Thread 4 "lcore-worker-6" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff4346640 (LWP 7208)]
> rte_eth_tx_burst (port_id=0, queue_id=0, tx_pkts=0x7ffff4344ac0, nb_pkts=1) at /usr/local/include/rte_ethdev.h:5650
> 5650            qd = p->txq.data[queue_id];
> (gdb) print p->txq
> $2 = {data = 0x0, clbk = 0x7ffff7f21528 <rte_eth_devices+8296>}
> (data is NULL)
> 
> 
> * configure on primary and transmit on secondary       ---> PORT NOT READY
> 
> Do you know who should be notified of this problem? Should I open a bug in
> the DPDK Bugzilla or report it to NVIDIA?
> 
> Thx.
> 
> 
> 
> On Thu, 1 Sep 2022 at 03:25, Stephen Hemminger <
> stephen at networkplumber.org> wrote:
> 
> > On Wed, 31 Aug 2022 22:59:56 +0200
> > Anna Tauzzi <admin at argonnetech.net> wrote:
> >  
> > > I initialize a port with the following methods on a primary process:
> > >
> > > rte_dev_probe(vf)
> > >
> > > rte_eth_dev_configure(port_id, ... );
> > >
> > > rte_eth_dev_adjust_nb_rx_tx_desc(port_id, ... );
> > >
> > > rte_eth_rx_queue_setup(port_id, .... );
> > >
> > > rte_eth_tx_queue_setup(port_id, ... );
> > >
> > > rte_eth_dev_start(port_id ... );
> > >
> > >
> > >
> > > Then I use the rte_eth_tx_burst(port_id) in the secondary process but I get
> > > this message:
> > >
> > > called tx_pkt_burst for not ready port 0
> > >
> > > Is this expected?  
> >
> > No, it looks like a device driver bug. Which PMD?

Which versions of rdma-core and the kernel are you using?
There were some bugs around secondary process support in earlier versions.
They have since been fixed; some users run failsafe and mlx5 on Azure with
secondary processes.
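
For background on the "not ready port" message: in DPDK releases from 21.11
on, rte_eth_tx_burst() goes through a per-process table of fast-path function
pointers (rte_eth_fp_ops), and the line shown in the gdb output above
(qd = p->txq.data[queue_id]) reads from that table. The sketch below is only a
simplified, standalone model of that idea, not DPDK code: every name in it
(model_fp_ops, model_reset, model_setup, model_tx_burst) is hypothetical, and
it assumes that a table entry stays on a placeholder function until the
current process has set the port up. It may help to picture why a port started
in one process can still look "not ready" in another.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MODEL_MAX_PORTS 4 /* hypothetical size, for illustration only */

typedef uint16_t (*model_tx_fn)(void *txq, void **pkts, uint16_t n);

/* One fast-path entry per port (hypothetical layout, not DPDK's). */
struct model_fp_ops {
    model_tx_fn tx_pkt_burst; /* function used to transmit */
    void **txq_data;          /* per-queue private data */
};

/* Ordinary global: every process running this code has its own copy,
 * so setting it up in one process does nothing for another process. */
static struct model_fp_ops model_table[MODEL_MAX_PORTS];

/* Placeholder queue array used while a port is not ready (queue 0 only). */
static void *model_no_queues[1] = { NULL };

/* Placeholder transmit function: warns and sends nothing. */
static uint16_t model_dummy_tx(void *txq, void **pkts, uint16_t n)
{
    (void)txq; (void)pkts; (void)n;
    fprintf(stderr, "model: tx_pkt_burst called for not ready port\n");
    return 0;
}

/* Stand-in for a driver transmit function: pretends all packets were queued. */
static uint16_t model_real_tx(void *txq, void **pkts, uint16_t n)
{
    (void)txq; (void)pkts;
    return n;
}

/* Mark a port as not ready in THIS process. */
void model_reset(uint16_t port)
{
    model_table[port].tx_pkt_burst = model_dummy_tx;
    model_table[port].txq_data = model_no_queues;
}

/* Install the real transmit path in THIS process. */
void model_setup(uint16_t port, void **queues)
{
    model_table[port].tx_pkt_burst = model_real_tx;
    model_table[port].txq_data = queues;
}

/* Transmit through the table; returns the number of packets accepted. */
uint16_t model_tx_burst(uint16_t port, uint16_t queue, void **pkts, uint16_t n)
{
    struct model_fp_ops *p;

    if (port >= MODEL_MAX_PORTS)
        return 0;
    p = &model_table[port];
    if (p->tx_pkt_burst == NULL) /* entry never initialised */
        return 0;
    return p->tx_pkt_burst(p->txq_data[queue], pkts, n);
}
```

In this model, calling model_reset(0) and then model_tx_burst(0, 0, pkts, 1)
returns 0 and prints the warning; after model_setup(0, queues) the same call
returns 1. Whether the secondary process here is missing the equivalent setup
step, or the PMD is failing to perform it, is exactly what the version
information should help narrow down.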

