Initializing and starting port on primary but transmitting on secondary I get port not ready
Stephen Hemminger
stephen at networkplumber.org
Thu Sep 1 17:21:59 CEST 2022
On Thu, 1 Sep 2022 09:33:54 +0200
Anna Tauzzi <admin at argonnetech.net> wrote:
> I'm using the Mellanox ConnectX-5:
>
> pci at 0000:3b:00.0 enp59s0f0np0 network MT27800 Family [ConnectX-5]
> pci at 0000:3b:00.1 enp59s0f1np1 network MT27800 Family [ConnectX-5]
> pci at 0000:3b:00.2 enp59s0f0v0 network MT27800 Family [ConnectX-5 Virtual Function]
> pci at 0000:3b:00.3 enp59s0f0v1 network MT27800 Family [ConnectX-5 Virtual Function]
> pci at 0000:3b:00.4 enp59s0f0v2 network MT27800 Family [ConnectX-5 Virtual Function]
> pci at 0000:3b:00.5 enp59s0f0v3 network MT27800 Family [ConnectX-5 Virtual Function]
> pci at 0000:3b:04.2 enp59s0f1v0 network MT27800 Family [ConnectX-5 Virtual Function]
> pci at 0000:3b:04.3 enp59s0f1v1 network MT27800 Family [ConnectX-5 Virtual Function]
> pci at 0000:3b:04.4 enp59s0f1v2 network MT27800 Family [ConnectX-5 Virtual Function]
> pci at 0000:3b:04.5 enp59s0f1v3 network MT27800 Family [ConnectX-5 Virtual Function]
>
> This is the message:
> lcore 6 called tx_pkt_burst for not ready port 0
> 8: [/lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7ffff7c77a00]]
> 7: [/lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7ffff7be5b43]]
> 6: [/usr/local/lib/librte_eal.so.22(+0x1559a) [0x7ffff7d8e59a]]
> 5: [build/simple_eth_tx_mp(+0x1a0c7) [0x55555556e0c7]]
> 4: [build/simple_eth_tx_mp(+0x19f89) [0x55555556df89]]
> 3: [build/simple_eth_tx_mp(+0x423c) [0x55555555823c]]
> 2: [/usr/local/lib/librte_ethdev.so.22(+0x7cbc) [0x7ffff7eb3cbc]]
> 1: [/usr/local/lib/librte_eal.so.22(rte_dump_stack+0x32) [0x7ffff7daf152]]
>
> I'm having all sorts of problems with this Mellanox stuff; Intel cards are
> much more user friendly.
>
> Just to recap:
> * configure on primary and transmit on primary ---> GOOD
>
> * configure on secondary and transmit on secondary ---> SIGSEGV
> Thread 4 "lcore-worker-6" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff4346640 (LWP 7208)]
> rte_eth_tx_burst (port_id=0, queue_id=0, tx_pkts=0x7ffff4344ac0, nb_pkts=1)
> at /usr/local/include/rte_ethdev.h:5650
> 5650 qd = p->txq.data[queue_id];
> (gdb) print p->txq
> $2 = {data = 0x0, clbk = 0x7ffff7f21528 <rte_eth_devices+8296>}
> (data is NULL)
>
>
> * configure on primary and transmit on secondary ---> PORT NOT READY
>
> Do you know who should be notified of this problem? Should I open a bug in
> the DPDK Bugzilla or report it to NVIDIA?
>
> Thx.
>
>
>
> On Thu, 1 Sep 2022 at 03:25, Stephen Hemminger <
> stephen at networkplumber.org> wrote:
>
> > On Wed, 31 Aug 2022 22:59:56 +0200
> > Anna Tauzzi <admin at argonnetech.net> wrote:
> >
> > > I initialize a port with the following methods on a primary process:
> > >
> > > rte_dev_probe(vf)
> > >
> > > rte_eth_dev_configure(port_id, ... );
> > >
> > > rte_eth_dev_adjust_nb_rx_tx_desc(port_id, ... );
> > >
> > > rte_eth_rx_queue_setup(port_id, .... );
> > >
> > > rte_eth_tx_queue_setup(port_id, ... );
> > >
> > > rte_eth_dev_start(port_id ... );
> > >
> > >
> > >
> > > Then I use the rte_eth_tx_burst(port_id) in the secondary process but I
> > > get this message:
> > >
> > > called tx_pkt_burst for not ready port 0
> > >
> > > Is this expected?
> >
> > No, this looks like a device driver bug. Which PMD?
Which versions of rdma-core and the kernel are you using?
There were some bugs in earlier versions around secondary process support.
They have since been fixed; some users are running failsafe and mlx5 on Azure
with secondary processes.
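
For the kernel half of that question, here is a minimal sketch in plain C,
assuming a POSIX system. kernel_release() is a hypothetical helper, not part
of DPDK or rdma-core; it only wraps the standard uname(2) call so the release
string can be logged at startup or pasted into a bug report. It does not cover
the rdma-core version, which would normally come from the distribution's
package manager.

```c
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper (not a DPDK or rdma-core API): copy the running
 * kernel's release string, i.e. what `uname -r` prints, into buf.
 * Returns 0 on success, -1 if uname() fails or buf cannot hold the string.
 */
int kernel_release(char *buf, size_t len)
{
    struct utsname u;

    if (buf == NULL || len == 0)
        return -1;
    if (uname(&u) != 0)
        return -1;
    if (strlen(u.release) >= len)   /* need room for the trailing NUL */
        return -1;

    strcpy(buf, u.release);
    return 0;
}
```

Calling kernel_release(buf, sizeof(buf)) with a char buf[256] from both the
primary and the secondary process, and printing the result next to the
rte_eth_tx_burst() error, keeps the version details attached to the report.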