[dpdk-users] qos_sched in DPDK 17.11.0 fails to initialize mbuf pool

Ian Trick ian.trick at multapplied.net
Fri Nov 17 22:26:11 CET 2017


On 2017-11-17 10:34 AM, Dumitrescu, Cristian wrote:
> 
> 
>> -----Original Message-----
>> From: Ian Trick [mailto:ian.trick at multapplied.net]
>> Sent: Friday, November 17, 2017 5:50 PM
>> To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; users at dpdk.org
>> Subject: Re: qos_sched in DPDK 17.11.0 fails to initialize mbuf pool
>>
>> On 2017-11-17 04:19 AM, Dumitrescu, Cristian wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ian Trick [mailto:ian.trick at multapplied.net]
>>>> Sent: Friday, November 17, 2017 1:24 AM
>>>> To: users at dpdk.org
>>>> Cc: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>
>>>> Subject: qos_sched in DPDK 17.11.0 fails to initialize mbuf pool
>>>>
>>>> Hi. I'm having an issue starting the qos_sched example program.
>>>>
>>>> # ./examples/qos_sched/build/qos_sched --no-huge -l 1,2,3 --vdev
>>>> net_af_packet0,iface=eth1 -- --pfc "0,0,2,3" --cfg
>>>> examples/qos_sched/profile_ov.cfg
>>>>
>>>> EAL: Detected 16 lcore(s)
>>>> EAL: Probing VFIO support...
>>>> EAL: Started without hugepages support, physical addresses not available
>>>> EAL: PCI device 0000:08:00.0 on NUMA socket -1
>>>> EAL:   Invalid NUMA socket, default to 0
>>>> EAL:   probe driver: 8086:10d3 net_e1000_em
>>>> PMD: Initializing pmd_af_packet for net_af_packet0
>>>> PMD: net_af_packet0: AF_PACKET MMAP parameters:
>>>> PMD: net_af_packet0:    block size 4096
>>>> PMD: net_af_packet0:    block count 256
>>>> PMD: net_af_packet0:    frame size 2048
>>>> PMD: net_af_packet0:    frame count 512
>>>> PMD: net_af_packet0: creating AF_PACKET-backed ethdev on numa socket 0
>>>> EAL: Error - exiting with code: 1
>>>>   Cause: Cannot init mbuf pool for socket 0
>>>>
>>>
>>> Personally I never used this application with --no-huge or with
>>> AF_PACKET, so I suggest you start from the configuration known to work
>>> (as detailed in the Sample App Guide) and then change/add one variable
>>> at a time to see which change triggers the mempool issue.
>>>
>>> This app needs large amounts of memory for the mempool, as traffic
>>> management buffers lots of packets in lots of queues. Our typical tests
>>> are done with 4K pipes/output port (64K queues/output port), so we
>>> provision the mempool with 2M buffers for each output port. The size of
>>> the mempool is hardcoded in the application.
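
(For scale: a back-of-envelope on what 2M buffers costs, assuming the
default data room from rte_mbuf.h; this sketch ignores mempool
per-object overhead, so the real figure is slightly higher.)

    #include <stdio.h>
    #include <stddef.h>
    #include <rte_mbuf.h>  /* RTE_MBUF_DEFAULT_BUF_SIZE = 2048 + 128 headroom */

    int main(void)
    {
        /* Assumed per-mbuf cost: data room plus struct rte_mbuf itself. */
        size_t per_mbuf = RTE_MBUF_DEFAULT_BUF_SIZE + sizeof(struct rte_mbuf);
        size_t nb_mbuf  = 2u * 1024 * 1024;   /* the hardcoded default */

        printf("~%.2f GiB per output port\n",
               (double)(per_mbuf * nb_mbuf) / (1ULL << 30));
        /* Prints roughly 4.5 GiB, already more than a 4 GB machine has. */
        return 0;
    }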
>>
>> Can I configure this to run with fewer queues or something so that it
>> requires less memory? I thought running with profile_ov.cfg might have
>> lower memory requirements, since it includes:
>>> number of pipes per subport = 32
>> compared to 4096 in the other configuration file, so I figured there
>> would be fewer queues and buffers. But I only have 4 GB available on
>> this device if I want to test something that isn't AF_PACKET.
>>
> 
> Digging into the source code, I found that you can tweak the mempool size through this macro:
> //file "main.h"
> #define NB_MBUF   (2*1024*1024)

Oh right, I remember fiddling with that when trying to get it working
with --no-huge. Tweaking that worked in this case on a real interface in
DPDK mode.
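
For reference, roughly how that macro feeds pool creation; this is a
paraphrase of examples/qos_sched/init.c, so the exact arguments in your
tree may differ:

    #include <stdlib.h>
    #include <rte_mbuf.h>
    #include <rte_debug.h>

    #define NB_MBUF (64 * 1024)   /* shrunk from the default 2*1024*1024 */

    /* Sketch: create one pool per output port, bailing out the same way
     * the example app does. Cache size is an assumed example default. */
    static struct rte_mempool *
    create_qos_pool(const char *name, unsigned int socket)
    {
        struct rte_mempool *mp = rte_pktmbuf_pool_create(name, NB_MBUF,
                256, 0, RTE_MBUF_DEFAULT_BUF_SIZE, socket);

        if (mp == NULL)
            rte_exit(EXIT_FAILURE, "Cannot init mbuf pool for socket %u\n",
                     socket);
        return mp;
    }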

Adding --no-huge makes it complain and fail to start, so that might be
what was happening in my original case. I think we're running with that
option because we were having trouble using it under LXC. But I'll look
into solving that. Thanks!

> 
>>>
>>>>
>>>> This is version 17.11.0 from the repo. My RTE_TARGET is
>>>> x86_64-native-linuxapp-clang. eth1 is a veth. I've tried running with
>>>> `-m` set to a low value, but the issue still happens.
>>>>
>>>> From what I can tell, rte_pktmbuf_pool_create() is failing and rte_errno
>>>> is set to EINVAL.
>>>>
>>>> In librte_mempool/rte_mempool.c, rte_mempool_populate_virt() is
>>>> hitting this check and returning -EINVAL:
>>>>
>>>> 	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
>>>> 		return -EINVAL;
>>>>
>>>> In that context, len is mz->len, the length of the memzone passed by
>>>> the caller, rte_mempool_populate_default(), which got it here:
>>>>
>>>> 	mz = rte_memzone_reserve_aligned(mz_name, size,
>>>> 		mp->socket_id, mz_flags, align);
>>>> 	/* not enough memory, retry with the biggest zone we have */
>>>> 	if (mz == NULL)
>>>> 		mz = rte_memzone_reserve_aligned(mz_name, 0,
>>>> 			mp->socket_id, mz_flags, align);
>>>>
>>>> The first call fails; the second succeeds when it passes 0 as the
>>>> size (which requests the largest available zone).
>>>> memzone_reserve_aligned_thread_unsafe(), in
>>>> librte_eal/common/eal_common_memzone.c, gets the length this way:
>>>>
>>>> 	requested_len = find_heap_max_free_elem(&socket_id, align);
>>>>
>>>> So the align value is 4096. But the length returned by
>>>> find_heap_max_free_elem() isn't a multiple of that -- I think? Since
>>>> it fails the check later on.
>>>>
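
(That hypothesis is easy to sanity-check in isolation; a tiny sketch
with a made-up length, assuming 4 KiB pages:)

    #include <assert.h>
    #include <stddef.h>
    #include <rte_common.h>   /* RTE_ALIGN_CEIL */

    int main(void)
    {
        size_t pg_sz = 4096;
        size_t len = 1047552;   /* hypothetical max-free-element size:
                                 * 255.75 pages, not a page multiple */

        /* RTE_ALIGN_CEIL rounds up to the next multiple of pg_sz, so any
         * length not already page-aligned trips the -EINVAL check above. */
        assert(RTE_ALIGN_CEIL(len, pg_sz) == 1048576);
        assert(RTE_ALIGN_CEIL(len, pg_sz) != len);
        return 0;
    }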
>>>> I'm not sure if this is an issue with my environment where I don't
>>>> have enough memory? (Although I would have expected a different error
>>>> for that.) Or I don't have the right program arguments? Or one of
>>>> these functions isn't doing what it's supposed to?
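
(One way to narrow this down when reproducing is to probe pool creation
directly and print rte_errno on each failure; a sketch, with names that
are illustrative rather than taken from the example app:)

    #include <stdio.h>
    #include <rte_errno.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>

    /* Halve the pool size until creation succeeds, reporting rte_errno
     * on each failure. Call after rte_eal_init(). */
    static struct rte_mempool *
    probe_pool_size(void)
    {
        unsigned int n;

        for (n = 2u * 1024 * 1024; n >= 1024; n /= 2) {
            struct rte_mempool *mp = rte_pktmbuf_pool_create("probe", n,
                    256, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
            if (mp != NULL) {
                printf("pool created with %u mbufs\n", n);
                return mp;
            }
            printf("%u mbufs failed: %s\n", n, rte_strerror(rte_errno));
        }
        return NULL;
    }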

