[dpdk-users] DPDK on Mellanox BlueField Ref Platform
    Jim Vaigl 
    jimv at rockbridgesoftware.com
       
    Mon Oct  7 18:52:05 CEST 2019
    
    
  
Hi Kiran,
When I try this command line with testpmd (with the -w just changed to
my port 0's PCIe address), I get "Creation of mbuf pool for socket
0 failed:  Cannot allocate memory".  I've tried adding --total-num-mbufs
to restrict that, but that didn't help.  It runs if I try restricting it
to just two cores, but then I drop most of my packets.  Here's the
output running it as you suggested:
    [root at localhost bin]# ./testpmd --log-level="mlx5,8"
    -l 3,4,5,6,7,8,9,10,11,12,13,14,15 -n 4 -w 0f:00.0 --socket-mem=2048
    -- --socket-num=0 --burst=64 --txd=2048 --rxd=2048 --mbcache=512
    --rxq=12 --txq=12 --nb-cores=12 -i -a --forward-mode=mac
    --max-pkt-len=9000 --mbuf-size=16384
    EAL: Detected 16 lcore(s)
    EAL: Detected 1 NUMA nodes
    EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
    EAL: Selected IOVA mode 'PA'
    EAL: Probing VFIO support...
    EAL: VFIO support initialized
    EAL: PCI device 0000:0f:00.0 on NUMA socket -1
    EAL:   Invalid NUMA socket, default to 0
    EAL:   probe driver: 15b3:a2d2 net_mlx5
    net_mlx5: mlx5.c:2145: mlx5_pci_probe(): checking device "mlx5_1"
    net_mlx5: mlx5.c:2145: mlx5_pci_probe(): checking device "mlx5_0"
    net_mlx5: mlx5.c:2154: mlx5_pci_probe(): PCI information matches for device "mlx5_0"
    net_mlx5: mlx5.c:2342: mlx5_pci_probe(): no E-Switch support detected
    net_mlx5: mlx5.c:1557: mlx5_dev_spawn(): naming Ethernet device "0f:00.0"
    net_mlx5: mlx5.c:363: mlx5_alloc_shared_ibctx(): DevX is NOT supported
    net_mlx5: mlx5_mr.c:212: mlx5_mr_btree_init(): initialized B-tree 0x17fec8c68 with table 0x17fec60c0
    net_mlx5: mlx5.c:1610: mlx5_dev_spawn(): enhanced MPW is supported
    net_mlx5: mlx5.c:1623: mlx5_dev_spawn(): SWP support: 7
    net_mlx5: mlx5.c:1632: mlx5_dev_spawn(): min_single_stride_log_num_of_bytes: 6
    net_mlx5: mlx5.c:1634: mlx5_dev_spawn(): max_single_stride_log_num_of_bytes: 13
    net_mlx5: mlx5.c:1636: mlx5_dev_spawn(): min_single_wqe_log_num_of_strides: 3
    net_mlx5: mlx5.c:1638: mlx5_dev_spawn(): max_single_wqe_log_num_of_strides: 16
    net_mlx5: mlx5.c:1640: mlx5_dev_spawn():        supported_qpts: 256
    net_mlx5: mlx5.c:1641: mlx5_dev_spawn(): device supports Multi-Packet RQ
    net_mlx5: mlx5.c:1674: mlx5_dev_spawn(): tunnel offloading is supported
    net_mlx5: mlx5.c:1686: mlx5_dev_spawn(): MPLS over GRE/UDP tunnel offloading is not supported
    net_mlx5: mlx5.c:1783: mlx5_dev_spawn(): checksum offloading is supported
    net_mlx5: mlx5.c:1803: mlx5_dev_spawn(): maximum Rx indirection table size is 512
    net_mlx5: mlx5.c:1807: mlx5_dev_spawn(): VLAN stripping is supported
    net_mlx5: mlx5.c:1811: mlx5_dev_spawn(): FCS stripping configuration is supported
    net_mlx5: mlx5.c:1840: mlx5_dev_spawn(): enhanced MPS is enabled
    net_mlx5: mlx5.c:1938: mlx5_dev_spawn(): port 0 MAC address is 50:6b:4b:e0:9a:22
    net_mlx5: mlx5.c:1945: mlx5_dev_spawn(): port 0 ifname is "enp15s0f0"
    net_mlx5: mlx5.c:1958: mlx5_dev_spawn(): port 0 MTU is 9000
    net_mlx5: mlx5.c:1980: mlx5_dev_spawn(): port 0 forcing Ethernet interface up
    net_mlx5: mlx5.c:1356: mlx5_set_min_inline(): min tx inline configured: 0
    net_mlx5: mlx5_flow.c:377: mlx5_flow_discover_priorities(): port 0 flow maximum priority: 5
    Interactive-mode selected
    Auto-start selected
    Set mac packet forwarding mode
    testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=344064, size=16384, socket=0
    testpmd: preferred mempool ops selected: ring_mp_mc
    EAL: Error - exiting with code: 1
      Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
This is with 2048 2 MB hugepages configured, so I think I have plenty of
memory available.  I used dpdk-setup to set and verify the hugepage
configuration and availability.  I'm trying some experiments to see if I
can get to the bottom of this.
Any thoughts?
Regards,
--Jim
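
A quick sanity check on the numbers in the log above may help here. The
sketch below is not from the thread: it multiplies the pool parameters that
testpmd printed (n=344064, size=16384) and compares the result with the
hugepage memory described (2048 pages of 2 MB). It counts only the data room
and ignores per-mbuf metadata and mempool overhead, so the real requirement
would be somewhat higher. The variable and function names are illustrative.

```python
# Figures copied from the testpmd log and the message above.
N_MBUFS = 344064                  # n= in "create a new mbuf pool"
MBUF_SIZE = 16384                 # size= in the same line (--mbuf-size), bytes
HUGEPAGES = 2048                  # number of hugepages configured
HUGEPAGE_SIZE = 2 * 1024 * 1024   # 2 MB per hugepage

GIB = 1024 ** 3


def pool_data_bytes(n_mbufs, mbuf_size):
    """Lower bound on pool memory: data room only, no per-mbuf overhead."""
    return n_mbufs * mbuf_size


pool = pool_data_bytes(N_MBUFS, MBUF_SIZE)
hugepage_total = HUGEPAGES * HUGEPAGE_SIZE
max_mbufs_that_fit = hugepage_total // MBUF_SIZE

print(f"pool data room : {pool / GIB:.2f} GiB")            # 5.25 GiB
print(f"hugepage memory: {hugepage_total / GIB:.2f} GiB")  # 4.00 GiB
print(f"mbufs of this size that fit: {max_mbufs_that_fit}")  # 262144
```

Under these assumptions the pool needs at least 5.25 GiB against 4 GiB of
hugepage memory, which would be consistent with the "Cannot allocate memory"
error. A smaller mbuf count, a smaller --mbuf-size, or more hugepages would
each narrow the gap.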
-----Original Message-----
From: Kiran Vedere [mailto:kiranv at mellanox.com] 
Sent: Friday, October 04, 2019 2:28 PM
To: Jim Vaigl; Asaf Penso; 'Stephen Hemminger'
Cc: users at dpdk.org; Erez Ferber; Olga Shern; Danny Vogel
Subject: RE: [dpdk-users] DPDK on Mellanox BlueField Ref Platform
Hi Jim,
I tried your test with a 9000-byte MTU. On the BlueField Reference Platform I
set the MTU of the interface to 9000, and on TRex I am sending 8096-byte
packets. I am able to loop back packets fine without any issues. Below is the
command line I use for testpmd:
./testpmd --log-level="mlx5,8" -l 3,4,5,6,7,8,9,10,11,12,13,14,15 -n 4
-w 17:00.0 --socket-mem=2048 -- --socket-num=0 --burst=64 --txd=2048
--rxd=2048 --mbcache=512 --rxq=12 --txq=12 --nb-cores=12 -i -a
--forward-mode=mac --max-pkt-len=9000 --mbuf-size=16384
Two things to consider: The max Rx packet length is used by the PMD during its
Rx queue initialization. By default this is set to 1518 bytes for
testpmd/l3fwd. For jumbo frames you need to pass --max-pkt-len=9000 (for
testpmd) or --enable-jumbo --max-pkt-len=9000 (for l3fwd). Are you passing
these values to l3fwd/testpmd when you run your test? Also, since the
mbuf size is 2048 by default, you need to increase the mbuf size to more than
the jumbo frame size unless you enable scatter in the PMD. For testpmd you can
increase the mbuf size with the --mbuf-size parameter. For l3fwd I don't
think there is a command-line option to increase the mbuf size at runtime, so
you might need to recompile the l3fwd code to increase it. Are you
doing this?
Hope this helps.
Regards,
Kiran
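
The mbuf-size point above can be illustrated with a small calculation. This
is a simplified sketch, not DPDK code: it assumes every mbuf reserves the
default 128-byte headroom (RTE_PKTMBUF_HEADROOM) out of the configured mbuf
size, and it treats the remainder as usable space for packet data. The
function name segments_needed is hypothetical.

```python
import math

HEADROOM = 128  # assumed: DPDK's default RTE_PKTMBUF_HEADROOM, in bytes


def segments_needed(pkt_len, mbuf_size, headroom=HEADROOM):
    """Number of mbufs needed to hold one packet of pkt_len bytes.

    Simplification: each mbuf offers (mbuf_size - headroom) usable bytes.
    A result above 1 means the packet only fits if scatter is enabled.
    """
    usable = mbuf_size - headroom
    if usable <= 0:
        raise ValueError("mbuf_size must be larger than the headroom")
    return math.ceil(pkt_len / usable)


for pkt_len, mbuf_size in [(1518, 2048), (9000, 2048), (9000, 16384)]:
    n = segments_needed(pkt_len, mbuf_size)
    note = "fits in one mbuf" if n == 1 else "needs scatter"
    print(f"pkt_len={pkt_len} mbuf_size={mbuf_size}: {n} segment(s), {note}")
```

With the default 2048-byte mbuf a 9000-byte packet spans several segments in
this model, while --mbuf-size=16384 keeps it in a single mbuf, which is the
reason for passing that option when scatter is not enabled.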
-----Original Message-----
From: Jim Vaigl <jimv at rockbridgesoftware.com> 
Sent: Friday, October 4, 2019 1:35 PM
To: Asaf Penso <asafp at mellanox.com>; 'Stephen Hemminger'
<stephen at networkplumber.org>
Cc: users at dpdk.org; Kiran Vedere <kiranv at mellanox.com>; Erez Ferber
<erezf at mellanox.com>; Olga Shern <olgas at mellanox.com>; Danny Vogel
<dan at mellanoxfederal.com>
Subject: RE: [dpdk-users] DPDK on Mellanox BlueField Ref Platform
A final update on this issue.  Kiran Vedere went above and beyond the call
of duty: he completely reproduced my hardware setup, showed that it worked
using trex to generate similar traffic to mine, and then provided me with a
bundled-up .bfb of his CentOS (with updated kernel) and OFED install to try
so that there would be no configuration stuff for me to mess up.
Using this, I saw exactly the same crashes I had seen in my setup.
After some thought, I realized the only meaningful difference was that my
traffic generator and IP configuration relied on an MTU size of 9000.
Once I set the MTU size down to 1500, the crashes stopped.
So, the answer is clearly that I'm just not setting up for the larger MTU
size.  I need to start to understand how to get DPDK to manage that, but the
crashing is at least understood now, and I have a way forward.
Thanks very much to Kiran.
Regards,
--Jim
-----Original Message-----
From: Jim Vaigl [mailto:jimv at rockbridgesoftware.com]
Sent: Thursday, September 26, 2019 3:47 PM
To: 'Asaf Penso'; 'Stephen Hemminger'
Cc: 'users at dpdk.org'; 'Kiran Vedere'; 'Erez Ferber'; 'Olga Shern'
Subject: RE: [dpdk-users] DPDK on Mellanox BlueField Ref Platform
> From: Asaf Penso [mailto:asafp at mellanox.com]
> Sent: Thursday, September 26, 2019 7:00 AM
> To: Jim Vaigl; 'Stephen Hemminger'
> Cc: users at dpdk.org; Kiran Vedere; Erez Ferber; Olga Shern
> Subject: RE: [dpdk-users] DPDK on Mellanox BlueField Ref Platform
>
> Hello Jim,
>
> Thanks for your mail.
> In order for us to have a better resolution please send a mail to our
> support team - support at mellanox.com
> Please provide as much info about the setup, configuration etc as you can.
>
> In parallel, I added Erez Ferber here to assist.
>
> Regards,
> Asaf Penso
Thanks for the kind offer, Asaf.  I'll take this debug effort off-line with
you and Erez and post back to the list here later with any resolution so
everyone can see the result.
By the way, the prior suggestion of using v. 25 of rdma-core didn't pan out:
the current build script just makes a local build in a subdirectory off the
source tree and there's no obvious way to integrate it with the MLNX_OFED
environment and the dpdk install.  After resolving package dependencies to
get rdma-core to build from the GitHub repo, I realized the instructions say
this:
  ---
  Building
  This project uses a cmake based build system. Quick start:
  $ bash build.sh
  build/bin will contain the sample programs and build/lib
  will contain the shared libraries. The build is configured
  to run all the programs 'in-place' and cannot be installed.
  NOTE: It is not currently easy to run from the build
  directory, the plugins only load from the system path.
  ---
--Jim
>> -----Original Message-----
>> From: users <users-bounces at dpdk.org> On Behalf Of Jim Vaigl
>> Sent: Tuesday, September 24, 2019 10:11 PM
>> To: 'Stephen Hemminger' <stephen at networkplumber.org>
>> Cc: users at dpdk.org
>> Subject: Re: [dpdk-users] DPDK on Mellanox BlueField Ref Platform
>> 
>> On Tue, 24 Sep 2019 12:31:51 -0400
>> "Jim Vaigl" <jimv at rockbridgesoftware.com> wrote:
>> 
>> >> Since no one has chimed in with any build/install/configure suggestion
>> >> for the BlueField, I've spent some time debugging and thought I'd
>> >> share the results.  Building the l3fwd example application and running
>> >> it as the docs suggest, when I try to send it UDP packets from another
>> >> machine, it dumps core.
>> >>
>> >> Debugging a bit with gdb and printf, I can see that from inside
>> >> process_packet() and processx4_step1() the calls to rte_pktmbuf_mtod()
>> >> return NULL or suspicious pointer values (e.g. 0x80).  The sample apps
>> >> don't guard against NULL pointers being returned from this rte call,
>> >> so that's why it's dumping core.
>> >>
>> >> I still think the problem is related to the driver config, but thought
>> >> this might ring a bell for anyone who's had problems like this.
>> >>
>> >> The thing that still bothers me is that rather than seeing what I was
>> >> expecting at init based on what the documentation shows:
>> >>     [...]
>> >>     EAL: probe driver: 15b3:1013 librte_pmd_mlx5
>> >>
>> >> ... when rte_eal_init() runs, I'm seeing:
>> >>     [...]
>> >>     EAL:  Selected IOVA mode 'PA'
>> >>     EAL:  Probing VFIO support...
>> >>
>> >> This still seems wrong, and I've verified that specifying the BlueField
>> >> target ID string in the make is causing "CONFIG_RTE_LIBRTE_MLX5_PMD=y"
>> >> to appear in the .config.
>> >>
>> >> Regards,
>> >> --Jim Vaigl
>> >> 614 886 5999
>> >>
>> >>
>> >
>> >From: Stephen Hemminger [mailto:stephen at networkplumber.org]
>> >Sent: Tuesday, September 24, 2019 1:18 PM
>> >To: Jim Vaigl
>> >Cc: users at dpdk.org
>> >Subject: Re: [dpdk-users] DPDK on Mellanox BlueField Ref Platform
>> >
>> >make sure you have latest version of rdma-core installed (v25).
>> >The right version is not in most distros
>> 
>> Great suggestion.  I'm using the rdma-core from the MLNX_OFED
>> 4.6-3.5.8.0 install.  I can't figure out how to tell what version that
>> thing includes, even looking at the source, since there's no version
>> information in the source files, BUT I went to GitHub and downloaded
>> rdma-core v24 and v25 and neither diffs cleanly with the source RPM that
>> comes in the OFED install.  I don't know yet if it's because this is some
>> different version or if it's because Mellanox has made their own tweaks.
>>
>> I would hope that the very latest OFED from Mellanox would include an
>> up-to-date and working set of libs/modules, but maybe you're on to
>> something.  It sounds like a risky move, but maybe I'll try just
>> installing rdma-core from GitHub over top of the OFED install.  I
>> have a fear that I'll end up with inconsistent versions, but it's worth
>> a try.
>> 
>> Thanks,
>> --Jim
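
On the null and 0x80 pointer values reported from rte_pktmbuf_mtod() in the
earliest message quoted above: rte_pktmbuf_mtod() is a macro that adds the
mbuf's data_off to its buf_addr. The sketch below mirrors that arithmetic
with plain integers. It is only one possible reading and is not confirmed
anywhere in the thread; it assumes the default 128-byte headroom
(RTE_PKTMBUF_HEADROOM) as the data offset, and the function name mtod is
illustrative.

```python
HEADROOM = 128  # assumed: DPDK's default RTE_PKTMBUF_HEADROOM, in bytes


def mtod(buf_addr, data_off):
    """Mirror of the pointer arithmetic in rte_pktmbuf_mtod(): buf_addr + data_off."""
    return buf_addr + data_off


# A null buffer address with the default data offset gives 0x80.
print(hex(mtod(0, HEADROOM)))  # 0x80

# A null buffer address with a zero data offset gives a null pointer.
print(hex(mtod(0, 0)))  # 0x0
```

If that reading holds, both values would point at mbufs whose buf_addr was
never filled in, rather than at the macro itself failing.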
 
    
    